Preface
The notion that 100 billion neurons give rise to human behavior proved daunting up through the twentieth century because neuroscientists were limited by existing technologies to studying the properties of single neurons or small groups of neurons. Characterizing simple neural circuits has led to an understanding of a variety of sensory processes, such as the initial stages in vision, and relatively simple motor processes, such as the generation of locomotion patterns. However, unraveling the neural substrates of more complex behaviors, such as the ability of an animal to navigate in its environment, to pay attention to relevant events in its surroundings, to perceive and communicate mental states, including the beliefs and desires of others, and to form and maintain interpersonal and group relationships, remains one of the major challenges for the neurosciences in the twenty-first century. In contrast to more elementary behaviors, these complex behavioral processes depend on interactions within elaborate networks extending across distinct brain structures. Elucidating the neural bases of complex behaviors, therefore, may require sophisticated approaches and methods that have only recently been developed or have yet to be developed. These include the ability to record electrical brain activity with multi-electrode arrays in freely behaving animals or humans, neuroimaging methods that can noninvasively monitor brain activity, and an increasing cornucopia of technologies from molecular biology and genetics that allow investigators to analyze the cellular bases of behaviors. These approaches are not only revealing the underlying neurobiology of behavior, but are establishing the foundation for an understanding of the biological bases of a variety of physical and mental health problems.
As a result of these developments, the neurosciences are reshaping the landscape of the behavioral sciences, and the behavioral sciences are of increasing importance to the neurosciences, especially for the rapidly expanding investigations into the highest level functions of the brain. The Handbook of Neuroscience for the Behavioral Sciences provides an introduction to graduate students and scholars in the neurosciences and behavioral sciences and informs about relevant theory, methods, and research in these two increasingly synergistic disciplines. The Handbook is designed to make neuroscience accessible to psychologists and other behavioral scientists with minimal background in biology or neuroscience, while at the same time offering information, constructs, and approaches that will enhance the knowledge, teaching, and research of active investigators at the intersection of psychology and neuroscience. In addition, the Handbook is designed to provide an accessible background in, and to highlight currently active areas within, the behavioral sciences for neuroscientists who may have a minimal background in behavioral sciences. To accomplish the dual purposes of exposing behavioral scientists to neuroscience and neuroscientists to psychology, we have adopted a unique organization in this two-volume Handbook. The first volume includes a Foundations section that features a set of chapters that provide brief introductions to major questions and approaches at distinct levels of analysis. These include topics such as the logical basis of integrative neuroscience, developmental processes, comparative approaches, biological rhythms, neuropharmacology, neuroendocrinology, neuroimmunology, neuroanatomy, neuropsychology, and functional neuroimaging. These chapters are overviews that provide the reader with a basic conceptual orientation to the types of approaches and measures available and how they can be applied and interpreted. They provide the basic background to allow the reader to comprehend and evaluate the subsequent chapters of the Handbook, as well as the broader neuroscience literature. The subsequent sections comprise groups of chapters organized around major psychological themes. In Volume 1, these include Sensation and Perception, Attention and Cognition, and Learning and Memory. This organization carries over to Volume 2, with major sections being Motivation and Emotion, Social Processes, Psychological Disorders, and Health and Aging.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Throughout the Handbook, the goal has been to integrate information across disciplines and levels of organization/analyses, from the cellular/molecular to systems to behavioral/social levels. The organization of the Handbook thus avoids artificial dichotomies such as lower level vs. higher level processes, or neuroscience vs. behavioral science sections. Coverage of motor systems, for example, is included in the section on Attention and Cognition, and discussion of somatovisceral function is included in the context of Motivation and Emotion, providing the reader with the broadest and most interdisciplinary perspectives on a given topic. Cross-referencing across chapters has been emphasized to underscore the fact that a neuroscientific perspective may illuminate connections across areas of study in psychology where none may have traditionally been recognized.
The intended audience for the Handbook is broad, including graduate students, psychologists, and other behavioral scientists who seek knowledge and understanding about neuroscience. It is a resource for active behavioral neuroscientists as well as those with minimal background in the biological sciences. It should also be of interest to neuroscientists who want an introduction to contemporary psychological issues, presented in a neuroscientific context. None of this would be possible without the tremendous efforts and high quality of the chapter authors. We thank them all for their contributions and we hope you will find this Handbook of value in understanding the behavioral neurosciences.
GARY G. BERNTSON
JOHN T. CACIOPPO
Chapter 1
Integrative Neuroscience for the Behavioral Sciences: Implications for Inductive Inference
JOHN T. CACIOPPO AND GARY G. BERNTSON
Charles Darwin (1873) concluded his seminal treatise on the expression of emotions by noting the adaptive value of the structures and behaviors that are observed in humans and the utility of comparative studies to discern the origin of these structures and behaviors. Darwin’s emphasis on the biological and functional underpinnings of behavior influenced William James (1890), who opined as follows:
A science of the mind must reduce such complex manifestations [of behavior] to their elements. A science of the brain must point out the functions of its elements. A science of the relations of the mind and brain must show how the elementary ingredients of the former correspond to the elementary functions of the latter. (p. 28)
Several celebrated clinical cases of the nineteenth century illustrate, at least at a gross level, two distinct approaches to investigation of these elements and relationships. An Italian named Bertino suffered a head injury that left his frontal lobes partially exposed (Raichle, 2000). Angelo Mosso (1881), an Italian physiologist, observed a sudden increase in the magnitude of pulsations over the frontal lobes with the ringing of local church bells and the chiming of a clock signaling the time for required prayer. Based on these observations, Mosso posited that changes in blood flow were associated with changes in cognition. To test this hypothesis, Mosso asked Bertino to multiply 8 by 12, a task that was accompanied by an increase in brain pulsation. These observations set the stage for contemporary functional brain mapping using hemodynamic measurements (Raichle, 2000), and illustrate the general approach of manipulating cognition, emotion, or behavior to determine their effects on neural functions, that is, the study of neural processes (Φ) as a function of psychological or behavioral processes (Ψ). In Bayesian terms, this can be specified as P(Φ/Ψ). Stating it in this way highlights a limitation of this approach: It provides evidence that a neural element or mechanism is associated with the behavioral element, but it does not address whether the behavioral element is caused by the neural element (Cacioppo & Tassinary, 1990).
The second approach is illustrated by the case of a patient who had suffered a neurosyphilitic lesion in the front part of his brain and was being attended by the physician Paul Pierre Broca. The patient was known as “Tan” because this was the only word he was left able to speak, but in other regards his mental processes and behavior appeared relatively normal. In an autopsy, Broca determined that Tan’s lesion was in the posterior third of the inferior frontal gyrus. This region became known as Broca’s area and was surmised to be the speech center of the brain based on the changes in behavior associated with damage to this region. This case illustrates the methodological approach of comparing differences in or manipulating neural elements to investigate their effects on cognition, emotion, or behavior, that is, the study of psychological or behavioral processes (Ψ) as a function of neural processes (Φ). In Bayesian terms, this can be specified as P(Ψ/Φ). Stating it in this way highlights a limitation of this approach: It provides evidence that a neural element is sufficient to influence a behavioral element, but it does not address whether the neural element is necessary. Thus, these two approaches provide unique nonredundant information (Sarter, Berntson, & Cacioppo, 1996).
For much of the twentieth century, the notion that 100 billion neurons gave rise to the human mind and behavior proved daunting, especially when one tried to say anything specific about this feat. To make this problem tractable, neuroscientists initially studied simple circuits and behaviors. The notion, illustrated for instance by Sherrington’s (1906) work on the integrative action of the nervous system based on his studies of spinal cord reflexes, was that fundamental principles governing the operation of neural circuits and how such mechanisms relate to behavior could be understood as well in simpler systems,
assuming that the difference between simple and complex systems could be extrapolated from an established set of theories. The organization and function of simpler circuits can inform and be informed by the study of the behavior of the more complex circuits and organisms of which they are a part, but complex circuits cannot be explained on the basis of simpler ones alone. For instance, the phenotypic expression (e.g., behavior) of strains of mice with specific genes inactivated (i.e., knockout mice) has been known to depend on the genetic background (e.g., Gerlai, 1996); the effects of the social context, in contrast, were thought to be unimportant. Crabbe, Wahlsten, and Dudek (1999) demonstrated that the specific behavioral effects associated with a given knockout could vary dramatically across environmental contexts (e.g., experimenters, testing environments, laboratories). When these authors expanded the traditional approach to include multilevel integrative analyses of genetics, neural processes, and behavior, they observed new patterns of data that were not predictable based on what was known about the component elements.
Stephen Jay Gould (1985) noted, “We often think, naively, that missing data are the primary impediments to intellectual progress—just find the right facts and all problems will dissipate. But barriers are often deeper and more abstract in thought” (p. 151). This observation is especially true in the investigation of complex systems, where patterns of data rather than single data points are important to grasp. Human behavior (including mental behavior) is the most complex system science has investigated. In this endeavor, as in all complex sciences, theory and empirical investigation must proceed together, each informing the other. As William James (1890) suggested, the need for theoretical models applies to psychological and behavioral elements (Ψ), biological and neural elements (Φ), and the relations between the former and the latter.
Whether manipulating or measuring neural mechanisms, or both, mapping these changes to behavior requires that tasks be well designed, which is to say the functional component processes must be well specified by some behavioral theory and body of empirical work. Theory and research in the neurosciences, however, are just as crucial to discerning patterns of integrative relations. Being inattentive to the properties, constraints, and operating possibilities of the neural mechanisms that underlie cognition, emotion, and behavior is inefficient, at best, because one misses the opportunity to build more plausible theories and to focus efforts more precisely on the constructs and interpretations that need to be considered. The resulting theories within and bridging across the behavioral sciences and the neurosciences are themselves only evolving approximations, but they provide an important means for advancing
understanding of the complex neural and behavioral mechanisms at work. The evolution of these theories is promoted by the open theoretical dialogue between the neurosciences and behavioral sciences. This Handbook is designed to make theory and research in the neurosciences accessible to psychologists and other behavioral scientists while also highlighting for neuroscientists currently active areas within the behavioral sciences and discussing information, constructs, and approaches at the intersection of the behavioral sciences and the neurosciences. Complex states tend to be multiply determined, and human behavior is certainly no exception. Nineteenth-century neurologist John Hughlings Jackson emphasized the hierarchical structure of the central nervous system and the re-representation of functions at multiple levels of this neuraxis (Jackson, 1884/1958). Subsequent neuroanatomical investigations have revealed this structure to be more heterarchical than hierarchical (see Berntson et al., this volume), but Jackson’s point remains that information is processed concurrently at multiple levels of organization within the nervous system. Primitive protective responses to aversive stimuli, for instance, exist at the level of the spinal cord, as in stereotypic flexor withdrawal responses to nociceptive stimulation. These protective reactions are expanded and embellished at higher levels of the nervous system (Berntson, Boysen, & Cacioppo, 1993). The evolutionary development of higher neural systems, such as the limbic system, endowed organisms with an expanded behavioral repertoire including escape reactions, aggressive responses, and even the ability to anticipate and avoid aversive encounters.
Humans were not the first bipedal creatures or the first to use tools, but humans, apparently uniquely, contemplate the history of the earth, the reach of the universe, the origin of the species, the genetic blueprint of life, and the physical basis of their own unique mental existence. These feats are the result of evolution endowing humans not only with primitive, lower level adaptive reactions but also with the unmatched information-processing capacities of the cerebrum. At progressively higher levels of neural organization, there is a general expansion in the range and relational complexity of representational operations and contextual controls and in the breadth and flexibility of discriminative and adaptive responses (Berntson et al., 1993). The adaptive flexibility of higher level neural systems comes at a cost, however, given the finite information-processing capacity of neural circuits. Greater flexibility means a less rigid relationship between inputs and outputs; a greater range of information that must be integrated; and a slower, more serial-like mode of processing. Consequently, the evolutionary layering of higher processing levels onto lower substrates has adaptive advantage in that lower
and more efficient processing levels may continue to be expressed. The processes expressed at lower and higher levels of the neuraxis can also interact in myriad ways, as illustrated by the diabetic who can overcome the withdrawal reflex when self-injecting insulin and the professional golfer who chokes when putting on the final green with a major championship at stake. Statistical models now exist that include stochastic error terms at various hierarchical levels of aggregation that are applicable to the data matrices that span biological and behavioral levels of organization. Our goal here is not to review these statistical models (cf. Weinstein, Vaupel, & Wachter, 2008) but to provide a more generic discussion of the conceptual issues that arise when mapping between the neural and behavioral domains. The mappings of elements across levels of organization may begin as associations but must move to mechanisms to explicate the neural basis of psychological and behavioral processes. This progression can be hindered by inattention to the logic underlying inferences about the functional import of neural structures or processes simply because one is dealing with observable biological events, as doing so can yield simple and restricted descriptions of empirical relationships or erroneous interpretations of these relationships. In the remainder of this chapter, therefore, we review logical issues involved in mapping constructs across levels of organization.
MAPPING ACROSS LEVELS OF ORGANIZATION
One of the simplest methods of mapping across neural (Φ) and behavioral (Ψ) levels of organization is the correlative approach. There are notable success stories to illustrate this approach. For instance, Suomi and colleagues found that peer-raised monkeys, compared to monkeys raised by their biological mothers, exhibit more aggression and less grooming behavior, and typically remain at the bottom of the social hierarchy (Suomi, 1999). These monkeys are further characterized by lower cerebrospinal fluid concentrations of 5-hydroxyindoleacetic acid (5-HIAA, a serotonin metabolite) than their mother-reared counterparts. Impulsive rhesus monkeys in the wild are also characterized by low cerebrospinal fluid 5-HIAA concentrations and the behavioral triad of aggressiveness, infrequent grooming, and low standing in the social hierarchy (cf. Suomi, 1999). These correlational studies have contributed to productive research on serotonin transporter gene (5-HTT) polymorphisms, environmental influences, and behavior in monkeys and humans. Of course, not every association identified in a correlative approach proves robust or informative. The associations
uncovered in correlative research, especially atheoretical correlative investigations, run the risk of yielding false discoveries (i.e., nonreplicable associations), and this risk increases with the number of possible associations that are examined. False discovery rate techniques have been developed to help mitigate this problem, but these techniques do not eliminate it (cf. Munafo et al., 2003). The development and adoption of false discovery rate methods represents an advance in dealing with Type I error rates, but it comes at a price: the cost of near-zero false discovery rates is a high false-negative rate. The ratio of the number of missed small discoveries to false discoveries can be substantially greater than 1 in studies of complex behavioral outcomes, and small associations can carry large theoretical ramifications. Therefore, the cost of missing important but small associations (Type II errors) can sometimes be greater than the cost of a Type I error. Independent replication, therefore, is of paramount importance. In addition, bioinformatics tools and multivariate techniques permit a reduction in the number of measures to more meaningful functional sets of measures. For instance, microarray studies can now be performed on hundreds of thousands of gene transcripts, but the upstream transcription control pathways are typically of greater interest than the individual gene expressions in these studies. Cole, Yan, Galic, Arevalo, and Zack (2005) introduced the Transcription Element Listening System (TELiS), which combines sequence-based analysis of gene regulatory regions with statistical prevalence analyses to identify transcription factor binding motifs that are overrepresented among the promoters of up- or downregulated genes.
Cortisol can regulate a wide variety of physiological processes via nuclear hormone receptor–mediated control of gene transcription.
Cortisol activation of the glucocorticoid receptor exerts broad anti-inflammatory effects by inhibiting pro-inflammatory signaling pathways. In longitudinal research on middle-aged and older adults, we found that perceptions of social isolation predict higher morning rises in cortisol the following day (Adam, Hawkley, Kudielka, & Cacioppo, 2006). Social isolation is also associated with increased risk of inflammation-mediated diseases. One possible explanation for inflammation-related disease in individuals with high cortisol levels involves impaired glucocorticoid receptor–mediated signal transduction that prevents the cellular genome from effectively “hearing” the anti-inflammatory signal sent by circulating glucocorticoids (Cole et al., 2007). Consistent with this hypothesis, a systematic examination of genome-wide transcriptional alterations in circulating leukocytes using TELiS showed increased expression of genes carrying pro-inflammatory elements and decreased expression of genes carrying anti-inflammatory glucocorticoid response
elements in lonely relative to nonlonely middle-aged adults (Cole et al., 2007). Impaired transcription of glucocorticoid response genes and increased activity of pro-inflammatory transcription control pathways provide a functional genomic explanation for elevated risk of inflammatory disease in individuals who chronically perceive high levels of social isolation.
In sum, a strength of the correlative approach is often in identifying associations that might be replicable and worthy of further study rather than in post hoc hypothesis testing. An important goal of scientific theory is to describe the causal interrelationships among factors, thereby explicating the mechanism responsible for an association. The correlative approach may generate elements (e.g., genes, neurophysiological circuits, cognitive processes) or contextual moderators that are candidates for this causal mechanism. The correlative approach may not, however, indicate the nature of the specificity of the association across levels of organization. For convenience, consider the constructs or measures at each level of organization as elements within a domain or set, as William James (1890) suggested. The mapping between elements across such sets can take one of the following forms (see Figure 1.1):
• A one-to-one relation, such that an element in one set or level of organization is associated with one and only one element in another set, and vice versa. An example of a one-to-one relation is neurons in V1 tuned to a given stimulus feature such as line orientation (see Hubel & Wiesel, 2005).
• A one-to-many relation, meaning that an element of interest in one set is associated with multiple elements in another set. An example is visual perception, which proceeds along ventral and dorsal streams (see Chapter 11).
• A many-to-one relation, meaning that two or more elements in one set are associated with one element in another set. (This differs from a one-to-many relation only when the order of the mapping across levels of organization, for example behavioral [e.g., cognitive] to biological, is specified.) An example is the finding that a particular movement or the observation of that particular movement activates the same neurons (termed mirror neurons) in area F5 of the monkey brain (see Chapter 16).
• A many-to-many relation, meaning two or more elements in one set are associated with the same (or an overlapping) subset of elements in another set. Faces and objects may differ in terms of the region of maximal response in functional magnetic resonance imaging studies, but both faces and objects are associated with activation of multiple regions in the ventral temporal lobe (e.g., fusiform gyrus, parahippocampal gyrus; Haxby et al., 2001).
• A null relation, meaning there is no association between the specified element in a neural set and those observed in a behavioral set. Even though all elements in Ψ map into one or more elements in Φ, not all elements in Φ map into elements in Ψ, and a particular element of interest in Φ may not map into the set of elements known or measured in Ψ.
Association studies involving elements with a one-to-one relation (absent confounds and measurement error) produce high correlations, whereas association studies involving elements characterized by a null relation yield an essentially zero correlation. The strength of the association between elements across levels of organization can vary a great deal, however, for one-to-one, one-to-many, and many-to-one mappings, and a many-to-many mapping between two elements across levels of organization can produce correlation coefficients that are quite small, making them difficult to distinguish from a null relation unless the sample size is large or one or more elements are manipulated. Thus, the initial establishment of an association between elements across levels of organization through a correlative approach is typically not sufficient to determine the specificity of the mapping.
Figure 1.1 Possible relationships between elements in two adjacent levels of organization (domains). Panels: One-to-One, One-to-Many, Many-to-One, Many-to-Many, and Null. For illustrative purposes, these domains have been labeled Psychological and Biological.
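The five forms of mapping can be made concrete with a small sketch. The Python below is only an illustration, not a method from the chapter: the function name `classify_mapping`, the element labels, and the choice to classify from the psychological side are all assumptions made for the example. It treats associations as (psychological, biological) pairs and reports which form of relation a given psychological element takes part in.

```python
from collections import defaultdict

def classify_mapping(pairs, psi_element):
    """Classify how one psychological element maps onto biological elements.

    pairs: iterable of (psi, phi) associations (hypothetical labels).
    """
    psi_to_phi = defaultdict(set)
    phi_to_psi = defaultdict(set)
    for psi, phi in pairs:
        psi_to_phi[psi].add(phi)
        phi_to_psi[phi].add(psi)

    phis = psi_to_phi.get(psi_element, set())
    if not phis:
        return "null"
    # True if any associated biological element also serves another psychological element.
    shared = any(len(phi_to_psi[phi]) > 1 for phi in phis)
    if len(phis) == 1:
        return "many-to-one" if shared else "one-to-one"
    return "many-to-many" if shared else "one-to-many"

# Hypothetical associations loosely modeled on the examples in the text.
ASSOCIATIONS = [
    ("line_orientation", "v1_orientation_column"),
    ("visual_perception", "ventral_stream"),
    ("visual_perception", "dorsal_stream"),
    ("performing_action", "f5_mirror_neurons"),
    ("observing_action", "f5_mirror_neurons"),
    ("faces", "fusiform_gyrus"),
    ("faces", "parahippocampal_gyrus"),
    ("objects", "fusiform_gyrus"),
    ("objects", "parahippocampal_gyrus"),
]

for element in ["line_orientation", "visual_perception",
                "performing_action", "faces", "unmeasured_construct"]:
    print(element, "->", classify_mapping(ASSOCIATIONS, element))
```

Note that the label depends on which associations have been measured: adding or removing a single pair can change a one-to-one relation into a many-to-one relation, which is one way to see why an observed association alone does not settle the specificity of the mapping.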
Why might it be important to go beyond thinking of associations to considering the nature of the relationship between elements at different levels of organization? First, it is important if one is to move efficiently from association to the specification of mechanisms. Second, the nature of the mappings between elements at different levels of organization determines the limits of interpretation one can draw about an association. Consider research in which a biological measure (e.g., activation of the anterior cingulate cortex as measured by fMRI) is shown to covary with a behavioral task (e.g., lying; Langleben et al., 2002). This established association may then be used to justify an interpretation of differences in the neural element (i.e., activation of the anterior cingulate) as evidence of differences in the behavioral element (i.e., lying). This form of inference can be problematic, however. Even if one knew that variations in lying were associated with corresponding variations in anterior cingulate activity, inferring lying based on anterior cingulate activity ignores the possibility that other antecedent conditions could also produce variations in anterior cingulate activity. That is, it ignores the specificity of the association or mapping to the construct about which one would like to draw the inference. Such errors, in turn, can slow theoretical development. It is tempting to suggest that these issues do not apply to genetics (or brain processes) because there is no doubt that they play a causal role in the production of complex behaviors. To say that genes are causal is not equivalent, however, to specifying which gene or set of genes is associated with and causal in a particular phenotypic expression or, for that matter, to specifying the mechanism by which associated genes might influence a particular phenotype. 
Gottesman and Gould (2003) suggested that the number of genes involved in a phenotype is directly related to both the complexity of the phenotype and the difficulty of genetic analysis (see also Butcher, Kennedy, & Plomin, 2006). Although difficult to discern, such causal linkages will be more easily resolved if attention is paid to the implications of the many-to-many mapping problem. The mapping between elements across levels of organization may become more complex (e.g., many-to-many) as the number of intervening levels of organization increases. The exception to this statement is when the mappings among elements across adjacent levels of organization are one-to-one, but such mappings are atypical. Accordingly, the likelihood of complex and potentially obscure mappings increases as one fails to consider intervening levels of organization. Admittedly, it is not always obvious which of several levels of organization might be “adjacent,” except perhaps when level of organization refers to a temporal rather than
spatial scope. This caveat aside, the idea that mapping across levels of organization is fostered by the incremental mapping of elements between proximal levels may nevertheless have heuristic value. For instance, endophenotypes such as neurocognitive deficits have proven to be valuable explanatory constructs between genes and psychiatric diseases (e.g., Gottesman & Gould, 2003; Nuechterlein, Robbins, & Einat, 2005), and in theory the same situation should apply to any mapping that goes from neural elements to complex behaviors. For this reason, we focus here on the mappings between two adjacent levels of organization. The issues raised about the mappings between adjacent levels of organization can be extended to any number of adjacent levels of organization.
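The claim that mappings grow more complex as intervening levels are skipped can be sketched by chaining two mappings. The Python below is a minimal illustration under stated assumptions: the helper names `compose` and `fan_out`, and every gene, endophenotype, and behavior label, are hypothetical and chosen only to show the pattern.

```python
def compose(lower_to_mid, mid_to_upper):
    """Chain two mappings (lower -> mid -> upper) into a direct lower -> upper mapping."""
    result = {}
    for low, mids in lower_to_mid.items():
        uppers = set()
        for mid in mids:
            uppers |= mid_to_upper.get(mid, set())
        result[low] = uppers
    return result

def fan_out(mapping):
    """Largest number of targets associated with any single source element."""
    return max(len(targets) for targets in mapping.values())

# Hypothetical one-to-many mappings between adjacent levels.
gene_to_endophenotype = {
    "gene_a": {"working_memory", "attention"},
    "gene_b": {"attention"},
}
endophenotype_to_behavior = {
    "working_memory": {"planning", "reasoning"},
    "attention": {"vigilance", "reasoning"},
}
gene_to_behavior = compose(gene_to_endophenotype, endophenotype_to_behavior)
print(fan_out(gene_to_endophenotype), fan_out(endophenotype_to_behavior),
      fan_out(gene_to_behavior))

# The exception noted in the text: one-to-one adjacent mappings stay one-to-one.
one_to_one_lower = {"g1": {"e1"}, "g2": {"e2"}}
one_to_one_upper = {"e1": {"b1"}, "e2": {"b2"}}
print(fan_out(compose(one_to_one_lower, one_to_one_upper)))
```

In this toy case each adjacent mapping links a source to at most two targets, while the direct gene-to-behavior mapping links one gene to three behaviors and both genes to overlapping behaviors, which is the many-to-many pattern that mapping between proximal levels is meant to keep tractable.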
TOWARD STRONGER INFERENCES IN THE INTERPRETATION OF BRAIN–BEHAVIOR RELATIONSHIPS
There is an intuitive appeal to the view that a proper understanding of the neural substrates of cognition and behavior may be couched in terms of the selective activation of regions of the brain during particular behavioral tasks. Although progress has been made in this regard, the manner in which many inferences are drawn about the behavioral significance of localized brain activity is more complex than is sometimes assumed. A major goal of neuroscientific studies of behavior can be expressed as Ψ = f(Φ). That is, ultimately one wishes to specify the biological mechanisms responsible for various behavioral (including mental) phenomena. Differences in the associated neural events (Φ) between tasks that are thought to differ in only one or more cognitive or behavioral operations (Ψ) are sometimes interpreted as showing that neural structure (or process) Φ is associated with behavioral operation Ψ. These data are also sometimes treated as revealing much the same information that would have been obtained had neural structure (or process) Φ been stimulated or ablated and a consequent change in behavioral function Ψ observed. This form of interpretation reflects the explicit assumption that there is a fundamental localizability of specific behavioral operations and the implicit assumption that there is an isomorphism between Φ and Ψ. Such observations may represent a starting point for a series of internally consistent propositions that lead to a general conclusion, but problems with the logic of this approach can lead this conclusion astray (Cacioppo & Tassinary, 1990). In particular, the finding that a behavioral element varies when a neural element is manipulated (or vice versa) does not necessarily imply the existence of an isomorphism between these elements (Sarter et al., 1996).
Integrative Neuroscience for the Behavioral Sciences: Implications for Inductive Inference
For instance, causal hypotheses regarding a specific neural structure or process (Φ) underlying a cognitive or behavioral operation (Ψ) are of the form Ψ = f(Φ). They necessarily imply that Φ is always followed by Ψ but do not necessarily imply that Ψ is always preceded by Φ. This is what it means for Ψ to have multiple (parallel) determinants. Furthermore, brain events (Φ) may be of interest to the extent that they index a cognitive operation or state, so that the inferences are of the form Ψ = f(Φ) rather than inferences from the absence of Φ to the absence of Ψ. Thus, one important aim of neural measurements can be specified by the conditional probability of Ψ given Φ, or P(Ψ/Φ) = 1. The typical structure of investigations in which neural measurements are made, in contrast, can be written as the conditional probability of Φ given Ψ, or P(Φ/Ψ) = x. For example, brain imaging techniques provide information about Φ as a function of Ψ, but the conditional probabilities P(Ψ/Φ) and P(Φ/Ψ) are not equivalent unless there is a 1:1 relationship between brain structure Φ and cognitive function Ψ. Most of the early discoveries of functional locationism that have had lasting impact involved the ablation or direct electrical stimulation of specific nuclei (i.e., P[Ψ/Φ]), whereas the evidence for Gall’s localization theory was based primarily on relating cranial features to extreme behaviors (i.e., P[Φ/Ψ]), which was interpreted to mean that individuals with these cranial features were destined toward these behaviors (i.e., P[Ψ/Φ]). This latter interpretation does not follow from the form of the data on which it was based because P(Φ/Ψ) does not equal, or even approximate, P(Ψ/Φ) unless there is an isomorphism between Φ and Ψ. Careful attention to the structure of inference, therefore, may contribute to a more productive dialogue between neuroscientists and behavioral scientists.
Approaches such as stimulation and ablation studies and brain imaging research provide complementary rather than redundant information about the relationship between brain structures (or events) and cognitive functions. This is because stimulation and ablation studies bear on the relationship P(Ψ/Φ), whereas brain imaging studies provide information about P(Φ/Ψ). Despite the formal parallelism between these expressions, there is a fundamental asymmetry in the heuristic power of studies aimed at the demonstration of P(Ψ/Φ) versus P(Φ/Ψ). The causal role of Φ (the neural structure or process) in process Ψ (the cognitive or behavioral operation) can be examined in a straightforward fashion by direct experimental manipulation of Φ. The loss of cognitive function Ψ by inactivation of neural processes Φ can serve to establish Φ as necessary for function Ψ. Moreover, addressing a more complex avenue of research, facilitation of cognitive functions Ψ by electrical or neurochemical brain activation can further establish that Φ is a sufficient condition for function Ψ.
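The asymmetry between the two conditional probabilities can be made concrete with a small numerical sketch. The counts below are hypothetical and invented purely for illustration; the function name `conditional` and the labels "psi" (the behavioral operation) and "phi" (the neural event) are likewise assumptions, not part of the chapter. The sketch shows that a neural event can be very likely whenever the operation is engaged and yet be a poor index of that operation.

```python
from fractions import Fraction

# Hypothetical observation counts, invented for illustration only.
# Keys are (psi_present, phi_present): psi = behavioral operation, phi = neural event.
counts = {
    (True, True): 18,    # operation engaged, region active
    (True, False): 2,    # operation engaged, region not active
    (False, True): 42,   # operation absent, region active anyway
    (False, False): 38,  # operation absent, region not active
}

def conditional(counts, target, given):
    """Return P(target present | given present) as an exact fraction."""
    index = {"psi": 0, "phi": 1}
    t, g = index[target], index[given]
    given_total = sum(n for key, n in counts.items() if key[g])
    both = sum(n for key, n in counts.items() if key[g] and key[t])
    return Fraction(both, given_total)

# What a typical imaging design estimates: neural event given the operation.
p_phi_given_psi = conditional(counts, "phi", "psi")
# What the inference about function requires: operation given the neural event.
p_psi_given_phi = conditional(counts, "psi", "phi")

print(p_phi_given_psi, p_psi_given_phi)
```

With these made-up counts the neural event accompanies the operation in 9 of 10 cases, yet the operation is present in only 3 of 10 cases in which the neural event is observed, because the event also occurs for other reasons.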
THE IMPORTANCE OF SPECIFYING CONTEXT: TAXONOMY OF MAPPINGS
Tests, assays, or measured biological responses (i.e., a physiological event that exceeds some decision threshold) have two different but related sets of characteristics. Analytic sensitivity is the ability to consistently detect very low levels of the target analyte. Stated another way, sensitivity is the true-positive rate. Considering all true (T) and false (F) outcomes of the test, for both positive (P; analyte present) and negative (N; analyte absent) conditions, sensitivity = TP/(TP + FN). In contrast, analytic specificity refers to the ability of a test to selectively detect only the target analyte and not others; specificity = TN/(TN + FP). For instance, blood sugar levels will vary in a predictable fashion for several hours after one ingests a dosage of glucose. Deviations from the normative values in blood sugar level across time mark a possible problem in metabolism because the blood glucose tolerance test (a procedure for mapping the glucose–blood sugar association) is sensitive and specific as long as the appropriate testing procedures are followed (e.g., fasting prior to the test) to eliminate the other known influences on the observed blood sugar excursions over the course of the test. This illustrates how a mapping between elements in the behavioral (Ψ) and neural (Φ) domains can be simplified by paying attention to potential confounding and contextual factors, by which we mean specifically P(not Ψ/Φ) and P(not Φ/Ψ). Furthermore, the diagnostic value of any given Φ (i.e., a measured response, as defined by some decision criteria such as corrected p < .05) as a measure of Ψ depends not only on the sensitivity and specificity of the measured response but on the base rates for true positives and for true negatives. The sensitivity, defined quantitatively above, represents the true detection probability, and the specificity represents 1 minus the false detection probability.
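A short sketch may help show why base rates matter in addition to sensitivity and specificity. The counts and base rates below are hypothetical, and the function names are illustrative assumptions rather than anything taken from the chapter. The last function applies Bayes' rule to obtain the fraction of positive responses that are true hits (the positive predictive value) for a given base rate of the targeted state.

```python
def sensitivity(tp, fn):
    """True-positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True-negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

def true_hit_fraction(sens, spec, base_rate):
    """Fraction of positive responses that are true hits, by Bayes' rule."""
    true_pos = sens * base_rate                 # state present and detected
    false_pos = (1 - spec) * (1 - base_rate)    # state absent but response seen
    return true_pos / (true_pos + false_pos)

# Hypothetical validation counts, invented for illustration only.
sens = sensitivity(tp=90, fn=10)
spec = specificity(tn=80, fp=20)

# The same measure applied where the targeted state is common versus rare.
common = true_hit_fraction(sens, spec, base_rate=0.50)
rare = true_hit_fraction(sens, spec, base_rate=0.05)

print(round(sens, 2), round(spec, 2), round(common, 3), round(rare, 3))
```

Under these assumed numbers the measure keeps the same sensitivity and specificity in both settings, but far fewer of its positive responses are true hits when the targeted state is rare.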
The chance that the measured response correctly indexes the targeted state is called the positive predictive value (PPV) and equals the fraction of detections that are true hits, that is, PPV = TP/(TP + FP). In contrast, the chance that the absence of this measured response (termed a negative screen in medicine) is correct is called the negative predictive value (NPV), where NPV = TN/(TN + FN). The properties of sensitivity and specificity and the PPV and NPV, of course, depend in part on the elements involved in the mapping. For instance, the adrenocortical hormone cortisol is released by the adrenal cortex under conditions of stress and hence is considered a stress hormone and is often used as a marker of stress. Assays with high sensitivity and specificity for cortisol are available and can be used to measure this hormone in plasma, urine, or saliva; these assays may provide an accurate measure of adrenocortical activity. This represents a proximal mapping, as the adrenal cortex is the primary source of cortisol, although other factors (e.g., clearance) can also affect the measure. When cortisol is used as a marker of stress, however, the sensitivity, specificity, PPV, and NPV may all be quite different. In this case the mapping is more distal, as there are several mediating links between stress and cortisol secretion. Although potent stressors generally yield a cortisol response, the sensitivity of a cortisol assay for stress is considerably lower, as minor stresses may not trigger a measurable cortisol response. That is, there may be many more false negatives in the equation sensitivity = TP/(TP + FN). Moreover, the specificity of a cortisol assay for stress is also considerably lower, as other variables may also impact cortisol release. Cortisol levels vary across the day and with activity, among other variables, so there may be more false positives in the equation specificity = TN/(TN + FP). The PPV and NPV depend on the threshold used to define a response, but it should be obvious that these values, too, will be lowered by poor sensitivity and specificity. Consequently, the utility of a cortisol assay for stress may be limited to more significant stressors, and to enhance specificity, extraneous variables that can impact cortisol must be taken into account statistically.
Another dimension is the generality of the mapping. In his influential Handbook of Experimental Psychology, S. S. Stevens (1951) advised the following:
The scientist is usually looking for invariance whether he knows it or not. Whenever he discovers a functional relation between two variables his next question follows naturally: under what conditions does it hold? In other words, under what transformation is the relation invariant? The quest for invariant relations is essentially the aspiration toward generality, and in psychology, as in physics, the principles that have wide application are those we prize. (p. 20)
Is the mapping between two elements across levels of organization universally generalizable, or is it moderated by other factors? If it is generalizable without qualification, then the association requires no attention to characteristics of the context or sample population; that is, the mapping has external validity. If external validity is absent, then the reason for this becomes a theoretically interesting question regarding P(not-Ψ/Φ) or P(not-Φ/Ψ). Invariant associations were once assumed, but statistical methods are now sufficiently developed to test for potential moderators (e.g., Baron & Kenny, 1986), and increasing attention is being paid to the operation of moderator variables. A taxonomy of associations between elements across levels of organization is summarized in Figure 1.2. The initial step is often to establish that variations in an element in one domain are associated with variations in an element in another, thereby establishing an association. An outcome is defined as a mapping in which multiple elements at one level of organization (e.g., biological) are related to an element at another level of organization (e.g., behavioral), and this many-to-one mapping may change across contexts. Initial association studies typically do not address issues of specificity or generality, and the treatment of such associations as invariants is premature. An invariant relationship refers to a universal isomorphic (one-to-one) mapping between elements across levels of organization (see Figure 1.2). Invariant mappings permit the inference of an element at one level of organization based on the measurement of its isomorphic element at another. A marker is defined as a one-to-one, nonuniversal (e.g., context-dependent) relationship between elements across levels of organization (see Figure 1.2). Many medical diagnostic tests that have sensitivity and specificity only if explicit procedures are followed to eliminate other influences are examples of markers. Inferences based on markers are similar to those for invariants as long as all other elements involved in the mapping are either experimentally or statistically controlled. Finally, a concomitant refers to a many-to-one but universal association between elements across levels of organization and is similar to an outcome, except that the latter is not universal. Outcome and concomitant mappings enable strong inferences to be drawn about theoretical constructs based only on hypothetico-deductive logic (Platt, 1964). Specifically, when two theoretical models differ in predictions regarding one or more outcomes or concomitants, then the logic of the experimental design allows theoretical inferences to be drawn about elements at one level of organization based on the measured elements in another.
Figure 1.2 Taxonomy of mappings among elements between adjacent levels of organization.
Specificity                    Generality: Context-Bound    Generality: Context-Free
One-to-One                     Marker                       Invariant
Many-to-One / Many-to-Many     Outcome                      Concomitant
When a new effect or association is found not to generalize to specific contexts or individuals, concerns are typically expressed about the methodological differences between the studies. Such a finding raises several important questions, including whether the original association is replicable and, if replicable, whether the diminution in effect size is attributable to measurement issues (e.g., reliability, construct validity) or to the operation of a moderator variable. The latter is an important theoretical question. It is for this reason that careful attention to the psychometric properties of all measures, regardless of their level of organization, to ensure their reliability and validity (including construct validity) is especially important in the design and analysis of neuroscientific studies of behavioral processes.
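The four mapping types defined in this section (marker, invariant, outcome, concomitant) follow from two yes-or-no questions: whether the mapping is one-to-one, and whether it holds regardless of context. A minimal sketch of that lookup is given below; the function name and its boolean parameters are illustrative assumptions, not notation used in the chapter.

```python
def classify_mapping(one_to_one: bool, context_free: bool) -> str:
    """Name the mapping type from its specificity and its generality.

    one_to_one   -- True for an isomorphic mapping, False for many-to-one
                    or many-to-many mappings (the specificity dimension).
    context_free -- True if the mapping is universal, False if it holds only
                    in particular contexts (the generality dimension).
    """
    if one_to_one:
        return "invariant" if context_free else "marker"
    return "concomitant" if context_free else "outcome"

# A diagnostic test that is sensitive and specific only when explicit
# procedures are followed is one-to-one but context-bound: a marker.
print(classify_mapping(one_to_one=True, context_free=False))
```

Framing the taxonomy this way makes explicit that treating an association as an invariant amounts to asserting both properties at once, each of which needs its own evidence.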
SUMMARY
Given the complexity of neural and behavioral processes, attention to the form of scientific reasoning can be especially important. Many behavioral processes are multiply determined. To the extent that this is the case, investigators who assume rather than establish an invariant relationship between elements in the behavioral and neural domains are at risk for predictably faulty interpretations. That is, the sensitivity and specificity of the mapping of neural elements into behavioral levels of organization may be context dependent, and paying attention to these issues improves the quality of inductive inferences. Interdisciplinary research that crosses neural and behavioral levels of organization raises issues about how one might productively think about concepts, hypotheses, theories, theoretical conflicts, and theoretical tests across levels of organization. Abstract constructs such as those developed by behavioral scientists provide a means of understanding highly complex activity without needing to specify each individual action of the simplest components, thereby providing an efficient means of describing the behavior of a complex system (e.g., working memory). Chemists who work with the periodic table on a daily basis nevertheless use recipes rather than the periodic table to cook, not because food preparation cannot be reduced to chemical expressions but because it is not cognitively efficient to do so. Reductionism, in fact, is one of several approaches for bettering science based on the value of data derived from distinct levels of organization to constrain and inspire the interpretation of data derived from other levels of organization. In reductionism, the whole is as important to study as are the parts, for only in examining the interplay across levels of organization can the underlying principles and mechanisms be ascertained.
Our goal in this chapter has been to outline a simple model to aid in thinking about elements from different levels of organization and the inferences drawn from observations of these elements. Contemporary work has demonstrated that theory and methods in the neurosciences can constrain and inspire behavioral hypotheses, foster experimental tests of otherwise indistinguishable theoretical explanations, and increase the comprehensiveness and relevance of behavioral theories. Several principles further suggest that comprehensive theories of behavior will be advanced by the joint consideration of multilevel integrative analyses that span neural and behavioral processes. One that we have discussed is the principle of multiple determinism, which specifies that a target event at one level of organization, but especially at molar or abstract (e.g., cognitive or behavioral) levels of organization, can have multiple antecedents within or across levels of organization. A corollary to this principle, the corollary of proximity, is that the mapping between elements across levels of organization becomes more complex (e.g., many-to-many) as the number of intervening levels of organization increases. An important implication of this corollary is that the likelihood of complex and potentially obscure mappings increases as one skips levels of organization. The principle of nonadditive determinism specifies that properties of the whole are not always readily predictable from the properties of the parts. For instance, the behavior of nonhuman primates following the administration of amphetamine or placebo can appear similar unless each primate’s position in the social hierarchy is considered. When this behavioral factor is taken into account, amphetamine is found to increase dominant behavior in primates high in the social hierarchy and to increase submissive behavior in primates low in the social hierarchy. 
Thus, the effects of physiological changes on behavior can appear unreliable until the analysis is extended across levels of organization. A strictly physiological (or behavioral) analysis, regardless of the sophistication of the measurement technology, may not reveal the orderly relationship that exists. The emergence of an orderly pattern of data when spanning levels of organization is one of the unique opportunities of neuroscientific investigations of behavior. Finally, the principle of reciprocal determinism specifies that there can be mutual influences between microscopic (e.g., biological) and macroscopic (e.g., social) factors in determining behavior. For example, not only has the level of testosterone in nonhuman male primates been shown to promote sexual behavior, but the availability of receptive females influences the level of testosterone in nonhuman primates. These principles illustrate that the mechanisms underlying mind and behavior may not be fully explained by a biological or a behavioral approach alone but rather may require a multilevel integrative analysis. The contributions to this volume are designed with this in mind.
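The amphetamine example of nonadditive determinism discussed in this summary can be illustrated with a toy data set. All scores below are hypothetical and invented for illustration (a positive score stands for a shift toward dominant behavior, a negative score for a shift toward submissive behavior); the variable and function names are assumptions as well. The sketch shows how a drug effect can vanish in the pooled comparison and reappear once social rank is taken into account.

```python
from statistics import mean

# Hypothetical change scores, invented for illustration only:
# (treatment, social rank, change in dominant-minus-submissive behavior)
observations = [
    ("amphetamine", "high", 4), ("amphetamine", "high", 6),
    ("amphetamine", "low", -5), ("amphetamine", "low", -5),
    ("placebo", "high", 0), ("placebo", "high", 1),
    ("placebo", "low", 0), ("placebo", "low", -1),
]

def mean_change(rows, treatment, rank=None):
    """Mean change for a treatment, optionally restricted to one social rank."""
    values = [change for t, r, change in rows
              if t == treatment and (rank is None or r == rank)]
    return mean(values)

# Pooled over rank, the two treatments look the same.
pooled_drug = mean_change(observations, "amphetamine")
pooled_placebo = mean_change(observations, "placebo")

# Stratified by rank, the drug effect is large and opposite in sign.
drug_high = mean_change(observations, "amphetamine", "high")
drug_low = mean_change(observations, "amphetamine", "low")

print(pooled_drug, pooled_placebo, drug_high, drug_low)
```

In this made-up example a strictly pharmacological analysis would report no effect, whereas adding the social factor reveals an orderly pattern, which is the point the principle makes.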
REFERENCES
Adam, E. K., Hawkley, L. C., Kudielka, B. M., & Cacioppo, J. T. (2006). Day-to-day dynamics of experience-cortisol associations in a population-based sample of older adults. Proceedings of the National Academy of Sciences, 103, 17058–17063.
Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Berntson, G. G., Boysen, S. T., & Cacioppo, J. T. (1993). Neurobehavioral organization and the cardinal principle of evaluative bivalence. Annals of the New York Academy of Sciences, 702, 75–102.
Butcher, L. M., Kennedy, J. K. J., & Plomin, R. (2006). Generalist genes and cognitive neuroscience. Current Opinion in Neurobiology, 16, 1–7.
Cacioppo, J. T., & Tassinary, L. G. (1990). Inferring psychological significance from physiological signals. American Psychologist, 45, 16–28.
Cole, S. W., Hawkley, L. C., Arevalo, J. M., Sung, C. Y., Rose, R. M., & Cacioppo, J. T. (2007). Social regulation of gene expression in human leukocytes. Genome Biology, 8(9), R189.
Cole, S. W., Yan, W., Galic, Z., Arevalo, J., & Zack, J. A. (2005). Expression-based monitoring of transcription factor activity: The TELiS database. Bioinformatics, 21, 803–810.
Crabbe, J. C., Wahlsten, D., & Dudek, B. C. (1999, June 4). Genetics of mouse behavior: Interactions with laboratory environment. Science, 284, 1670–1672.
Darwin, C. (1873). The expression of the emotions in man and animals. New York: Appleton.
Gerlai, R. (1996). Gene-targeting studies of mammalian behavior: Is it the mutation or the background genotype? Trends in Neurosciences, 19, 177–181.
Gottesman, I. I., & Gould, T. D. (2003). The endophenotype concept in psychiatry: Etymology and strategic intentions. American Journal of Psychiatry, 160, 636–645.
Gould, S. J. (1985). The flamingo’s smile: Reflections in natural history. New York: Norton.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001, September 28). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.
Hubel, D. H., & Wiesel, T. N. (2005). Brain and visual perception: The story of a 25-year collaboration. Oxford, England: Oxford University Press.
Jackson, J. H. (1958). Evolution and dissolution of the nervous system (Croonian Lectures). In J. Taylor (Ed.), Selected writings of John Hughlings Jackson (Vol. 2, pp. 3–92). New York: Basic Books. (Original work published 1884)
James, W. (1890). The principles of psychology (Vol. 1). New York: Holt.
Langleben, D. D., Schroeder, L., Maldjian, J. A., Gur, R. C., McDonald, S., Ragland, J. D., et al. (2002). Brain activity during simulated deception: An event-related functional magnetic resonance study. Neuroimage, 15, 727–732.
Mosso, A. (1881). Ueber den Kreislauf des Blutes im menschlichen Gehirn [About the circulation of the blood in the human brain]. Leipzig, Germany: Veit.
Munafo, M. R., Clark, T. G., Moore, L. R., Payne, E., Walton, R., & Flint, J. (2003). Genetic polymorphisms and personality in healthy adults: A systematic review and meta-analysis. Molecular Psychiatry, 8, 471–484.
Nuechterlein, K. H., Robbins, T. W., & Einat, H. (2005). Distinguishing separable domains of cognition in human and animal studies: What separations are optimal for targeting interventions? Schizophrenia Bulletin, 31, 870–874.
Platt, J. R. (1964, October 16). Strong inference. Science, 146, 347–353.
Raichle, M. E. (2000). A brief history of human functional brain mapping. In A. W. Toga & J. C. Mazziotta (Eds.), Brain mapping: The systems (pp. 33–77). San Diego, CA: Academic Press.
Sarter, M., Berntson, G. G., & Cacioppo, J. T. (1996). Brain imaging and cognitive neuroscience: Towards strong inference in attributing function to structure. American Psychologist, 51, 13–21.
Sherrington, C. S. (1906). The integrative action of the nervous system. New York: Scribner’s.
Stevens, S. S. (1951). Handbook of experimental psychology. New York: Wiley.
Suomi, S. (1999). Attachment in rhesus monkeys. In J. Cassidy & P. Shaver (Eds.), Handbook of attachment: Theory, research, and clinical applications (pp. 181–197). New York: Guilford Press.
Weinstein, M., Vaupel, J. W., & Wachter, K. W. (2008). Biosocial surveys. Washington, DC: The National Academies Press.
Chapter 2
Developmental Neuroscience MYRON A. HOFER
The development of an adult organism from a single cell is one of the most familiar processes of nature. It has been studied by scientists for more than two centuries without the emergence of a generally accepted explanatory theory. Ironically, evolution, the other great historical process in biology, has been far more difficult to study, yet Darwin’s simple but powerful theory has guided scientists for 150 years. One reflection of the lack of an agreed-on set of developmental principles has been the persistence, in scientific as well as lay circles, of the seemingly endless nature versus nurture debate. Another is the gap that exists between the language, methods, and concepts used by scientists studying development at the psychological, behavioral, and the cellular/molecular levels. Less discussed than these issues, but more fundamental, has been the uncertainty about how evolution and development are related. The topic was opened with Ernst Haeckel’s (1892) resilient nineteenth-century formulation that “ontogeny recapitulates phylogeny” (pp. 422–544), which was finally laid to rest in 1977 with the publication of Ontogeny and Phylogeny by Stephen Jay Gould. In the past few decades, developmental psychobiology and, more recently, developmental neuroscience have made considerable progress in advancing an interdisciplinary approach to the study of development. New methods for the study of early behavior in animal model systems and in human fetuses have begun to build connections between events and concepts at the cellular/molecular, physiological, behavioral, and psychological levels. For it is in early development that one can best see how each of these levels of biological function emerges in sequence and comes to work together.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
EVOLUTION AND DEVELOPMENT
In the past few years a new field, evolutionary developmental biology, or evo-devo, has emerged as new methods of functional genetic analysis and manipulation have revealed how changes in the expression patterns of genes provide a common basis for both developmental and evolutionary changes (S. Carroll, 2005). Developmental biology (as embryology) had long been isolated from evolutionary biology, and most scientists thought that differences between species were accounted for by differences between those species’ genes. But with the advent of rapid gene sequencing in the past few years, it is now appreciated that differences between species lie in how a common set of highly conserved genes is regulated during development to produce the extraordinary differences in animal body plans and behaviors ranging from those of flies and worms to humans. Evolution is now viewed as taking place through changes in the processes and course of development, and development, in turn, is now seen as a major source of novelty and of variation between individuals upon which selection can act in the course of evolution. Rapid progress in the molecular genetic mechanisms of early development in the past few years has revealed an unexpected plasticity in the regulation of gene expression that enables a relatively few conserved cellular processes to be linked together in a variety of potentially adaptive patterns in response to genetic mutation or to environmental change. This “facilitated” variation (Kirschner & Gerhart, 2005) is capable of generating the useful novelties that random mutation and selection by themselves have seemed far less capable of producing in the course of evolution. New findings have also shown that the plasticity of behavior development in response to environmental interactions and genetic change (mutation, recombination) is capable of generating novel forms of adaptive variation upon which selection can act. By exposing previously hidden (genomically silenced) genes to selection and by genetic “accommodation” through selection acting at multiple genetic sites over generations, novelties that were at first environmentally induced can gradually become independent of their initiating environments (West-Eberhard, 2003). This focus on changes in gene regulation as a central mechanism of development has led to a wave of new
discoveries in the area of “epigenetic” processes involving the remodeling of chromatin, the protein matrix surrounding, supporting, and controlling the access of molecular regulatory signals to the DNA strands within chromosomes (Baylin & Schuebel, 2007). These molecular rearrangements, which take place through simple chemical processes such as methylation and acetylation, provide, for the first time, a direct cellular/molecular link between the outside environment and the genes of the organism during development, a concrete and specifiable locus for the much-sought-after gene–environment interaction. Developmental Selection: The Evolution of Development The emphasis in evo-devo is on the proposed role of development in evolution, but what implications does this new evolutionary role have for the concept of development? What competitive advantages did the evolution of multicellular development convey? What selective pressures have shaped development during its own evolution, and what can these tell scientists about the adaptive functions that they should expect modern-day developmental processes to carry out? The question of how and when development evolved is essentially the question of the origins of multicellularity. It leads to one of the last remaining mysteries in the fossil record of evolution, the Cambrian explosion (Gould, 1989; Knoll, 2003). In this abrupt change in the number and variety of marine fossils, examples of all major animal groups or phyla that exist today (e.g., vertebrates, mollusks,
Cell division
Sexual recombination
arthropods, worms) appeared abruptly approximately 500 million years ago. For the more than 3 billion years before that, only fossils of single-celled organisms have been found. Modern-day representatives of these ancient organisms, such as algae and amoebae, seem little changed in form since that time, but they have been found to possess all the characteristics of developing cells in multicellular animals (see Figure 2.1) such as rapid multiplication, migration, adhesion, the capacity for sexual as well as asexual reproduction, and the capacity for differentiation (e.g., from an amoeba to a flagellate form). These adaptive cellular/molecular mechanisms evolved in response to “signal” molecules in their environments that accompanied or produced changing conditions. What is missing in these protozoa is the organization of these cellular/molecular processes into a linear series of events involving many such cells so as to produce structures that grow in size, shape, and complexity of form and function (see Figure 2.2). This is called development, and it must have emerged very rapidly (in geologic time), as it was crucial for the appearance of multicellular life. The prerequisite for this emergence seems to have been a gradual increase in the complexity of the genome of unicellular animals. Evidence for this lies in the progressive appearance of unicellular lineages whose living members showed larger and larger numbers of introns (the noncoding segments of DNA that facilitate “shuffling” of gene positions along the DNA strand), a multiplication of promoters of gene activation, and the appearance of novel regulatory elements leading to a greater degree of interaction between genes. These genomic changes allowed for
Adhesion Migration
Sporulation
Apoptosis
Differentiation
Molecular signals
Environmental signals
Figure 2.1 Protozoan precursors of developmental processes. Note: Unicellular organisms evolved nearly all of the cellular/molecular mechanisms needed for multicellular development over the 3 billion years of life prior to the Cambrian explosion of diverse multicellular organisms
c02.indd Sec1:13
500 million years ago. These processes were selected for their capacity to enhance the adaptive capacity of these single-cell organisms to changes in the surrounding environment.
8/18/09 4:57:20 PM
14
Developmental Neuroscience
Note: The rapid evolution of metazoan (multicellular) organisms took place through a reorganization of the various individual cellular/molecular mechanisms of unicellular organisms (represented on the left; see also Figure 2.1) into the integrated, linear processes of development, a process driven by the selective advantage of larger and more complex multicellular
organisms capable of creating and exploiting novel ecological niches. During evolution, cells continued to be regulated by signal molecules in the intercellular environment of the developing metazoan organism, as they had been in unicellular organisms. A return to single cells marks the beginning of the next multicellular generation of metazoans, depicted on the far right.
the rapid emergence of a novel system of genetic control for all multicellular animals in which only a limited set of genes within each cell is expressed at any one time while the vast majority are silenced. Genes in the fertilized egg, or zygote, express signal proteins that diffuse into the intercellular space and then into neighboring cells, where they act as transcription factors regulating the genetic control of growth and function as well as the expression of a further wave of different intercellular signal proteins. The timing, amount, and structure of these waves of signal proteins create a virtual cascade of intercellular communication, and with it a complex pattern of growth and function over time. These developmental cascades specify and organize the construction of a multicellular organism and its complex array of functions, including behavior. Signal molecules in the intercellular environment continue to act to regulate the timing and nature of cellular changes, as they had in unicellular organisms. The development of all multicellular organisms is strikingly similar in its early stages. An egg, fertilized by one other smaller cell, divides repeatedly, forming a ball of similarly appearing cells, the zygote. Gradually, the zygote becomes a blastocyst, with two layers of cells: the endoderm inside and the ectoderm outside. These two sheets of cells become bent and folded, developing a radial symmetry. The development of the most primitive multicellular animals, such as sponges, hydra, and jellyfish, stops here. Behaviorally, these simple creatures show primarily local cellular “irritability” and generalized “flowing” body movements coordinated by nerve nets. But in the metazoan animals of the Cambrian period, an invagination of the hollow blastocyst occurs and a third layer of
cells, the mesoderm, forms between the two other layers. These layers bend to form shapes with symmetry in three planes, and they expand to form groups and sheets of cells that interact with groups of different cells deriving from the other layers. As these cells change in form and location, they differentiate into radically different cell types in different regions to become organs with multiple different functions. One of those organs, the nervous system, becomes the basis for the evolving complexity of animal behavior that so interests neuroscientists. There are good reasons to think that the selective advantage responsible for the evolution of this developmental plan for multicellular life was that it made possible the construction of novel, much larger, and more complex animals. These animals not only were able to use the previously dominant unicellular species as food but were capable of finding and even creating new ecological niches: territories, food sources, and ways of life that bacteria and protozoa were incapable of inhabiting, defending, or utilizing. Bonner (1993) defined development as the growth phase of the life cycle and demonstrated that its duration (and complexity) is highly correlated with the size (and complexity) of any given life form, ranging from bacteria to whales, each enabled to exploit larger and more complex niches. This view of development as construction is widely accepted and has too often led to an assumption that all developmental mechanisms are organized to carry out a single function: the building of a successful adult. The rapid evolution of development, however, was shaped not only by the selective advantages provided by the size and functional complexity of the adults that were constructed. Other developmental processes and functions
Figure 2.2 The evolution of development.
had to be selected in order to adapt immature forms to the changing environments they inhabited as embryos, larvae, fetuses, newborns, and weanlings on their path to adulthood. The unique characteristics of larval forms in amphibians, and of intrauterine/placental structures and early nursing interactions in mammals, are examples of the “ontogenetic adaptations” that play such a prominent role in the development of animals that one can study today (Oppenheim, 1984). Furthermore, as the processes mediating the developmental functions of construction and adaptation evolved, they were also being shaped and modified by selection for their “evolvability” (R. Carroll, 2002). That is, processes that created potentially useful variations in the course of development and others that promoted the replication of a successful variant developmental path in the next generation were differentially selected. In the same way that genetic mechanisms for variation (recombination and mutation) and for heritability (the copying of DNA during cell division) were shaped by selection in the evolution of single cells, so a set of developmental mechanisms for “facilitating” variation and for transmitting successful variant paths to the next generation was shaped by selection in the evolution of development in multicellular organisms. Thus, the developmental processes that scientists study today have been organized and shaped in their evolution by the four components of what I have called developmental selection: construction, adaptation, variation, and inheritance (Bateson, Hofer, Oppenheim, & Wiedenmayer, 2007; Hofer, 2005). As with many other features of organisms, developmental processes have been co-selected; that is, processes selected for their contribution to the construction and ontogenetic adaptation of young animals have also been shaped by selection for their heritability and their capacity for variation (see Figure 2.3). 
For example, the embedding of embryos and fetuses in the internal environments of their mothers, and the sustained close physical interactions of newborns and infants with their parents (Rosenblum & Moltz, 1983), gain a new significance and interpretation if one thinks of them in terms of developmental selection. These features of development can now be viewed as inherited, transgenerational environments that act as a protective scaffolding and matrix for the construction of a descendant in the next generation and as a source of potentially useful variation in the future. In this way, long and complex early developmental pathways can be organized in a linear fashion, regulated by interactions with the previous generation so as to be re-created, with further variations, in the next generation. The usefulness of the concept of developmental selection, it seems to me, is that it defines a set of functions that developmental processes are organized to carry out.
Figure 2.3 Developmental selection. Note. During evolution, developmental processes are co-selected not only for (a) construction of an adult, but also for their capacity to (b) adapt the immature organism to its age-specific environments, (c) produce potentially useful variations in the course and/or outcome of development, and (d) facilitate the inheritance of successful variants in the next generation.
Knowing the ultimate (i.e., evolutionary) functions of developmental processes should be useful for forming new hypotheses about how a given process works. For example, in the field of sociobiology, the evolutionary principles of kin selection led to the discovery of new processes at work within the parent–offspring relationship, such as parent–offspring conflict. In the case of development, asking how specific events and processes contribute to age-specific adaptation, variation, and inheritance of a selected developmental path in the next generation should increase understanding of development considerably beyond the current approach of asking questions that are limited to the function of constructing an adult. Examples are given in the sections on attachment and the regulation of development.
Levels of Organization
Developmental neuroscience involves a journey in time across levels of organization from the molecular genetics of an embryonic cell to the mental experiences of a conscious mind (see Figure 2.4). The gap that exists between biology and psychology is slowly narrowing as new biological methods have become increasingly able to approach the brain systems underlying more and more complex cognitive and emotional processes. In the study of early behavior development, the conceptual models of psychologists and the working models of biologists converge, as they are being applied to the same behavioral phenomena. Observations and insights from these different levels of organization are beginning to contribute to one another’s understanding, as I illustrate with regard to the area of early attachment in the next section. Evolutionary principles offer a conceptual common ground that can be shared by both neuroscience and psychology,
Figure 2.4 Levels of organization in developmental neuroscience (axes: size and complexity against time/age). Note: During development, new properties emerge at each level that are not reducible to lower-level events. A full understanding that extends across levels requires repeated efforts at “translation” as well as extensive knowledge of events at each level. Early development affords significant simplification of this process at all levels.
providing answers to questions about how the human mind and brain have come into being and why they have their present forms. The historical nature of both development and evolution bridges the gap that exists between the reductionist emphasis of the molecular/cellular neurosciences and the holistic emphasis on inner experience and meaning that is the central focus of some branches of psychology. Early human development traverses a series of levels of scale and organization (as illustrated in Figure 2.4) from the molecular and intercellular interactions of the embryo, to the integrated systems and behavior of the fetus, to the emerging cognitive and affective capacities of the child, and finally to the inner experience that even language cannot fully describe. The biological, behavioral, and psychological processes at work at those levels of organization seem very different. But the new properties that emerge at higher levels arise from the combined operation of simpler component processes taking place at the lower level. Understanding those transitions, and the emergence of new properties at higher levels, is one of the central questions for research in early animal and human development as well as for attempts to integrate neuroscience and psychology in general. In early development, behavior provides a crucial link between the levels of brain systems and of psychological constructs, as is illustrated in the following sections.
THE EARLY DEVELOPMENT OF ATTACHMENT
In this section I outline how a strategy of attempting to understand the component processes underlying the psychological constructs created in studying early human
attachment can provide new and potentially useful ways of thinking in a more general way about development as it occurs at multiple levels of organization. The idea of “early attachment” exists in a number of forms in people’s ways of speaking and thinking. In its most general sense, the phrase refers to a set of behaviors observed in infants and the feelings and thought processes (conscious and/or unconscious) that we suppose infants have with their mothers or caretakers, based on our own experiences and memories and the psychological concepts we have formed for ourselves or learned from others (see also Volume 2, Chapters 36–42 and Chapters 47–49). Within this range of use of the word attachment, several different schools of thought have coalesced within psychology. Common to all, however, are three themes: (1) some sort of emotional tie or bond that is inferred to develop between the infant and his or her caretaker that keeps the infant physically close, (2) a series of responses to separation that constitute the infant’s emotional response to interruption or rupture of that bond, and (3) the existence of different patterns or qualities of the interaction between infants and mothers that persist over time and lead to long-term effects of this experience on the social and emotional functioning of offspring throughout life, even extending to a repetition of specific patterns of mothering by daughters in the next generation. These observations and the psychological concepts of attachment have been extremely useful in human developmental psychology, but they leave a number of observations unexplained and questions unanswered. Furthermore, when my colleagues and I found close similarities in the behavior and separation responses of a far less evolved mammal, the laboratory rat pup, this suggested that we were missing a deeper layer of biological processes underlying the psychological concepts of attachment theory.
The unanswered questions left open by developmental attachment theory are posed in the sections that follow. The answers that came from our laboratory research illustrate how evolutionary developmental theory and the concept of developmental selection help to better explain the nature and functions of early attachment at the psychological as well as the biological levels.
Deconstructing the First Attachment Bond
Infants of mammalian species that are born in an immature state, such as the human and the laboratory rat, face a daunting task. They must find a way to identify, remember, and prefer their own mother, and they must use these capacities to reorganize their simple motor repertoires, long adapted to the uterine environment, so as to be able to approach, remain close to, and orient themselves to their mothers.
Until recently it was assumed that these bonding processes were well beyond the capacities of newborn mammals (except in precocial species such as the sheep) and that the closeness of the relationship depended almost entirely on maternal behavior until well into the nursing period (Bowlby, 1982; Kraemer, 1992; Volume 2, Chapter 48). It was supposed that an attachment bond builds up slowly in the weeks or days after birth through repeated mother–infant interactions, starting with stereotyped reflexes in the newborn. But the past decade has produced a number of studies revealing evidence of earlier and earlier learning, extending even into the prenatal period, as is described here. In addition, coordinated motor acts have been demonstrated experimentally in fetuses in response to specific stimuli that will not be encountered until after birth. Thus, the solutions for the infant’s tasks appear to be found much earlier than previously thought and appear to take place through novel developmental processes that had not been imagined until recently. These developmental processes clearly function as age-specific adaptations to the unique environments of early development, playing only a supportive role in growth of size and complexity, for many of these behaviors (e.g., nursing) have been shown not to be precursors of later behaviors.
Prenatal Origins
The first strong evidence for fetal learning came from studies on early voice recognition in humans, in which it was found that babies recognize and prefer their own mother’s voice, even when tested within hours after birth (De Casper & Fifer, 1980). Bill Fifer continued these studies in our department using an ingenious device through which newborns can choose between two tape-recorded voices by sucking at different rates on a pacifier rigged to control an audiotape player (Fifer & Moon, 1995).
He has found that newborn infants, in the first hours after birth, prefer human voices to silence, female voices to male, their native language to another language, and their own mother to another mother reading the same Dr. Seuss story. In order to obtain more direct evidence for the prenatal origins of these preferences (rather than very rapid postnatal learning), Fifer filtered the high-frequency components from the tapes to make the mother’s voice resemble recordings of maternal voice made by a hydrophone placed within the amniotic space of pregnant women. This altered recording, in which the words were virtually unrecognizable to adults, was preferred to the standard mother’s voice by newborns in the first hours after birth, a preference that tended to wane in the second and third postnatal days. Furthermore, there is evidence that newborns prefer familiar rhythmic phrase sequences to which they have been repeatedly exposed prenatally when pregnant mothers daily read out loud a specific text in a quiet place (De Casper & Spence, 1988).
In a striking interspecies similarity, rat pups were shown to discriminate and prefer their own dams’ amniotic fluid to that of another dam when offered a choice in a head-turning task (Hepper, 1987). Newborn pups were also shown to require amniotic fluid on a teat in order to find and attach to it for their first nursing attempt (Blass, 1990). Robinson and Smotherman (1995) directly tested the hypothesis that pups begin to learn about their mothers’ scent in utero. They were able to demonstrate one-trial taste aversion learning and classical conditioning in late-term rat fetuses using intraoral cannula infusions and perioral stimulation. Taste aversions learned in utero were expressed in the free-feeding responses of weanling rats nearly 3 weeks later. The authors went on to determine that aversive responses to vibrissa stimulation were attenuated or blocked by intraoral milk infusion, a prenatal “comfort” effect they found to be mediated by a central kappa-opioid receptor system. These forms of fetal learning, involving maternal voice in humans and amniotic fluid in rodents, appear to play an adaptive role in preparing the infant for its first extrauterine encounter with its mother. They are thus the earliest origins as yet found for attachment to the mother. The spontaneous motor acts needed for an attachment system also appear to be developing prior to birth. Rat fetuses engage in a number of spontaneous behaviors in utero, including curls, stretches, and trunk and limb movements. These acts were observed to increase markedly in frequency with progressive removal of intrauterine space constraints as pups were observed first through the uterine wall, then through the thin amniotic sac, and finally unrestrained in a warm saline bath (Smotherman & Robinson, 1986). When newborn pups are observed prior to their first nursing bout, they resemble exteriorized fetuses until the mother lowers her ventrum over them.
Their behavior then changes rapidly over the first few nursing bouts into a complex repertoire as described in the next section.
How Newborns Approach and Orient to Their Mothers
When newborn pups in their first extrauterine experience are stimulated gently by soft surfaces from above, as when the mother hovers over them, they show a surprisingly vigorous repertoire of behaviors (Polan & Hofer, 1999). These include the spontaneous curling and stretching seen prenatally but also locomotor movement toward the suspended surface, directed wriggling, audible vocalizations, and, most strikingly, turning upside down toward the surface above them. Evidently these behaviors propel the pup into close contact with the ventrum, maintaining it in proximity and keeping it oriented toward the surface. They thus appear to be very early attachment behaviors. In a series of experiments my colleague and I found that these are not
stereotyped reflex acts but organized responses that are graded according to the number of maternal-like stimulus modalities present on the surface presented experimentally (e.g., texture, warmth, odor; Polan & Hofer, 1999). Furthermore, they were enhanced by periods of prior maternal deprivation, suggesting the rapid development of a motivational component. We found that by 2 days of age, pups discriminate their own mother’s odor and prefer it to equally familiar nest odors (Polan & Hofer, 1998). Hepper (1987) showed that by the first postnatal week, pups discriminate and prefer their own mother, father, and siblings to other lactating females, males, or agemates. Recent work in humans inspired by these findings in lower animals has shown that human newborns, too, are capable of slowly locomoting across the bare surface of their mother’s abdomen and locating the breast scented with amniotic fluid in preference to the untreated breast (Varendi, Porter, & Winberg, 1996). Although newborns are attracted to natural breast odors even before the first nursing bout (Makin & Porter, 1989), amniotic fluid can override this effect. Apparently, human newborns are not as helpless as previously thought but possess approach and orienting behaviors that anticipate the recognized onset of specific maternal comfort responses at 6 to 8 months. These events and processes can best be understood in terms of the four primary functions of development (construction, adaptation, inheritance, and variation). More and more complex behavior patterns are constructed, beginning with simple local movements and progressing through organized graded responses to specific stimuli. Then with a unique rapid learning process, behavior becomes organized into an adaptive repertoire of nursing-related behaviors. These developmental steps function in part as transient adaptations first to the confines of the uterus and then to the requirements of staying close, finding a nipple, and initiating sucking.
One can also see how the inherited developmental environment of the uterus and the maternal nest supply a matrix and a template for the formation and organization of species-specific mother–infant interaction patterns. Thus, the developmental environments and events of the uterine cavity and maternal ventrum function to ensure that the same developmental path will be repeated in the next generation (i.e., inherited). As differences occur in the kinds and intensities of maternal stimulation of pups or in other aspects of the interaction, variation from one generation to the next will also take place. It is in the process of fulfilling these functions of development (adaptation, inheritance, and variation) that the organization and construction of more complex repertoires of behavior takes place. Looking for novel development processes that fulfill the functions of variation and inheritance should lead to a deeper understanding of how development works.
A Novel Postnatal Learning System
Bowlby (1969) was uncertain about exactly how a behavioral attachment system or bond developed in slow-maturing mammals and hypothesized that some learning mechanism must exist that is similar to the phenomenon of “imprinting” in birds, made famous by Konrad Lorenz (1996). Scientists are now beginning to gain an understanding of how such a specific proximity–maintenance system develops in animals and humans at the levels of basic learning mechanisms and the brain systems mediating them. Regina Sullivan and Steven Brake in our lab discovered that within 2 to 3 days of birth, neonatal rat pups were capable of learning to discriminate, prefer, approach, and maintain proximity to an odor that had been associated with forms of stimulation that naturally occurred within the early mother–infant interaction (e.g., milk or stroking; Sullivan, Hofer, & Brake, 1986). Random presentations of the two stimuli had no such effect, a control procedure that identified the change in behavior as being due to associative conditioning and not some nonspecific effect of repeated stimulation. Because the learning required only two or three paired presentations and because the preference was retained for many days, it seemed to qualify as the long-sought “imprinting-like process” that is likely central to attachment in slow-developing mammals. Indeed, a human analogue of this process was found by Sullivan, who showed that when human newborns were presented with a novel odor and were then rubbed repeatedly along their torsos to simulate maternal care, the next day they became activated and turned their head preferentially toward that odor (Sullivan et al., 1991). This suggests that rapid learning of orientation to olfactory cues is an evolutionarily conserved process in mammalian newborns. Early attachment-related odors appear to retain value into adulthood, although the role of the odor in modifying behavior appears to change with development.
Work done independently in the labs of Celia Moore (Moore, Jordan, & Wong, 1996) and Elliot Blass (Fillion & Blass, 1986) has demonstrated that adult male rats showed evidence of enhanced sexual performance when exposed to females scented with the artificial odors with which their mothers had been scented during the male rats’ infancy.
Aversive Learning of the Attachment Bond
Clinical observations have taught that not only does attachment occur to supportive caretakers, but children can endure considerable pain and even injury while becoming strongly attached to an abusive caretaker. Although it may initially appear to be counterproductive from an evolutionary perspective to form and maintain an attachment to an abusive caretaker, it may be better for a slow-developing mammalian infant to have a bad caretaker than none at all.
This aspect of human attachment is also represented in the infant rat. We found that during the first postnatal week, a surprisingly broad spectrum of stimuli can function as reinforcers to produce an odor preference in rat pups (Sullivan, Brake, Hofer, & Williams, 1986; Sullivan, Hofer, et al., 1986). These stimuli range from apparently rewarding ones such as milk and access to the mother (Alberts & May, 1984; Brake, Sager, Sullivan, & Hofer, 1982; Pedersen, Williams, & Blass, 1982; Wilson & Sullivan, 1994) to apparently aversive ones such as moderate shock and tail pinch (Camp & Rudy, 1988; Sullivan, Hofer, et al., 1986), stimuli that elicit immediate escape responses from the pups. It should be noted that threshold to shock (Stehouwer & Campbell, 1978) and the pup’s behavioral response (Emerich, Scalzo, Enters, Spear, & Spear, 1985) do not change between the ages of 9 and 11 days. As pups mature and reach an age when leaving the nest becomes more likely, olfactory learning comes to more closely resemble learning in adults. Specifically, odor aversions are easily learned by 2-week-olds, and acquisition of odor preferences is limited to odors paired with stimuli of positive value (Sullivan & Wilson, 1995). Thus, the learning that underlies early attachment develops through a transiently adaptive “paradoxical” phase during which positive associations take place in response to a very broad range of contingent events (including painful stimulation) while pups are confined to the nest. It becomes more selective at a time in development when pups begin leaving the nest and encountering novel odors not associated with the mother.
Brain Substrates for Two Kinds of Early Attachment Learning
Early rapid aversive learning has been traced to focal odor-specific areas in the olfactory bulb by Sullivan and Donald Wilson.
Certain cell types alter their firing rates in response to a specific odor as a result of learning experience (Sullivan, Wilson, Wong, Correa, & Leon, 1990; Wilson & Sullivan, 1994). This altered firing rate is the result of activation of norepinephrine pathways leading from the locus coeruleus. Indeed, behavioral learning can be driven by electrical stimulation of the locus coeruleus or norepinephrine injection in the olfactory bulb in association with the novel odor, without the association of any maternal stimuli with the novel odor. The period during which aversive learning of positive associations takes place ends about 10 days postnatal, but this “sensitive period” can be extended by a few repeated brief associations of odor with shock each day until weaning begins about a week later. The period from 12 to 15 days of age is an interesting one because during this time, if the mother is present during the association of odor and aversive stimulation, preference
learning takes place, but in the mother’s absence, the odor is subsequently avoided. Recent studies of neural and hormonal substrates have helped to explain this unusual developmental pattern (Moriceau, Wilson, Levine, & Sullivan, 2006). As illustrated in Figure 2.5, the primary brain substrate activated by the later developing avoidance learning in 12- to 15-day-old pups with the mother absent was the amygdala, not the locus coeruleus and olfactory bulb. However, in the mother’s presence, there was no amygdala activation. Instead, positive association learning took place, mediated by the locus coeruleus. Corticosterone levels were previously known to be reduced by maternal presence at this age. Moriceau and Sullivan raised corticosterone in the pups learning in the mother’s presence and lowered it in those learning in the mother’s absence. The results showed that corticosterone levels mediated the maternal “switch” between the formation of positive and negative association learning and did so by switching the neural substrates mediating the response between the locus coeruleus/olfactory bulb and the amygdala. It is relatively easy to see how these seemingly complex paths of development in early systems for learning are likely to have evolved. Only outside the nest, and in the absence of the mother, is it crucial to learn an avoidance response to all painful stimuli. In interactions with the mother, learning to avoid her as a result of painful or uncomfortable associated conditions would likely lead to far more dangerous circumstances. The specialized environmental contingencies of early life have selected, during evolution, a sequence of the development of learning that would be hard to explain except within the concept of developmental selection.
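The maternal “switch” just described can be restated as a small decision rule. The sketch below is only an illustrative summary of the account given in the text and in the note to Figure 2.5, not a model taken from the cited studies. The function name, the two-level (“low”/“high”) coding of corticosterone, the use of day 10 as a hard cut-off, and the treatment of day 11 as part of the later period (the text describes days 12 to 15) are all assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Outcome(NamedTuple):
    corticosterone: str  # "low" or "high" (hypothetical two-level coding)
    learning: str        # "preference" or "avoidance" of the shock-paired odor
    substrate: str       # brain system named in the text


def odor_shock_outcome(age_days: int, mother_present: bool,
                       cort_override: Optional[str] = None) -> Outcome:
    """Illustrative rule for odor paired with aversive stimulation in rat pups."""
    if not 0 <= age_days <= 15:
        raise ValueError("this sketch covers birth to 15 days of age only")
    if cort_override not in (None, "low", "high"):
        raise ValueError("cort_override must be 'low', 'high', or None")

    if cort_override is not None:
        # Experimental manipulation of corticosterone takes precedence.
        cort = cort_override
    elif age_days <= 10 or mother_present:
        # Early period, or maternal presence in the later period: levels low.
        cort = "low"
    else:
        # Later period with the mother absent: levels high.
        cort = "high"

    if cort == "low":
        return Outcome(cort, "preference", "locus coeruleus / olfactory bulb")
    return Outcome(cort, "avoidance", "amygdala")


if __name__ == "__main__":
    for age, mother in [(5, False), (13, True), (13, False)]:
        print(age, mother, odor_shock_outcome(age, mother))
```

Passing `cort_override="high"` for a 13-day-old pup with the mother present returns the avoidance outcome, which corresponds to the reversal reported when corticosterone was manipulated experimentally.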
The next steps in this research will be to learn which genetic, experiential, and developmental processes were involved in the creation of these developmental paths during evolution and how they
Figure 2.5 Summary of transition between two learning systems and their neural substrates in the formation of early attachment in rat pups. Note. Early = birth to 10 days of age (corticosterone levels low). Late = 11 to 15 days (corticosterone levels high). Differences in properties and known mechanisms of the two forms of learning can be shifted in time across the 10-day transition point according to corticosterone levels and/or presence or absence of mother during learning.
became integrated into the developmental sequences and mechanisms this work has revealed.
Parallel Processes in Human Maternal Attachment Learning
Successful mother–infant interactions require the reciprocal responding of both individuals in the mother–infant dyad. Human mothers rapidly learn about their baby’s characteristics and can identify their baby’s cry, odor, and facial features within hours of the baby’s birth (Eidelman & Kaitz, 1992; Kaitz, Lapidot, & Bronner, 1992; Porter, Cernoch, & McLaughlin, 1983). An animal model for this rapid learning by mothers has received considerable attention (Brennen & Kaverne, 1997; Fleming, O’Day, & Kraemer, 1999). Indeed, there are interesting parallels between the early attachment behavior of infants and the attachment behavior of the newly parturient mother. In rats and sheep, a temporally restricted period of postpartum olfactory learning in the mother involving norepinephrine facilitates the mother’s learning about her young (Levy, Gervais, Kindermann, Orgeur, & Piketty, 1990; Moffat, Suh, & Fleming, 1993). It is possible that mammalian mothers and their young use similar neural circuitry to form their reciprocal attachments, both abusive and normal.
EARLY SEPARATION AS LOSS OF REGULATION
In Bowlby (1969, 1982) and Harlow’s (1958) work, as well as in the clinical observations of Anna Freud and Dorothy Burlingham (1943) a generation before, it was maternal separation that revealed the existence of an “animal tie” between mother and infant and a deeper layer of processes beneath the apparently simple interactions of mother and infant. Bowlby (1969, 1982) viewed these processes as primarily psychological. The behavioral and physiological responses of the infant to separation, in their conception, were a consequence of the rupture of a psychological bond that was formed as part of an integrated psychophysiological organization that Bowlby called the attachment system. More recent research, however, has revealed a network of simple behavioral and biological processes that underlie this and other psychological concepts used to understand early human social relationships (see Volume 2, Chapter 47).
The Separation Cry
One of the best known responses to maternal separation is the infant’s separation cry, a behavior that occurs in a wide variety of species (Lester & Boukydis, 1985; Newman,
1998). In the rat, this call is in the ultrasonic range (40 kHz) and appears in the first or second postnatal day. Pharmacological studies in a number of labs (reviewed in Hofer, 1996) have shown that the ultrasonic vocalization (USV) response to isolation is attenuated or blocked in a dose-dependent manner by clinically effective anxiolytics that act at benzodiazepine and serotonin receptors. Conversely, USV rates are increased by compounds known to be anxiogenic in humans, such as benzodiazepine receptor inverse agonists (beta-carboline, FG 7142) and GABA-A receptor ligands such as pentylenetetrazol (Brunelli & Hofer, 2001; Miczek, Tornatsky, & Vivian, 1991). Within serotonin and opioid systems, receptor subtypes known to have opposing effects on experimental anxiety in adult rats and humans also have opposing effects on infant USV calling rates (see Figure 2.6). Neuroanatomical studies in infant rats have shown that stimulation of the periaqueductal gray area produces USV calls, and chemical lesions of this area prevent calling (Goodwin & Barr, 1998). The more distal motor pathway is through nucleus ambiguus and both laryngeal branches of the vagus nerve. The engagement of higher centers known to be involved in cats and primates suggests a neural substrate for isolation calls involving primarily the hypothalamus, amygdala, thalamus, hippocampus, and cingulate cortex, brain areas known to be involved in adult human and animal anxiety responses (Newman, 1998). This evidence strongly suggests that separation produces an early affective state resembling anxiety in rat pups, one that is expressed by the rate of infant calling. This calling behavior, and its inferred underlying affective state, develops as a communication system between mother and pup.
The evolution of such a response is clarified by the finding that infant rat USV is a powerful stimulus for the lactating rat, capable of causing her to interrupt an ongoing nursing bout, initiate searching outside the nest, and direct her search toward the source of the calls (Smotherman, Bell, Hershberger, & Coover, 1978). The mother’s retrieval response to the pup’s vocal signals then results in renewed contact between pup and mother. This contact, in turn, quiets the pup (as represented along the bottom right of Figure 2.6). This entire behavioral system fades out as infants cease to vocalize when isolated during the weaning period, showing that it represents a developmental ontogenetic adaptation. In the psychological concept of attachment, vocal separation and comfort responses are conceptualized as emotional expressions of interruption and reestablishment of a social bond. Such a formulation would predict that because pups recognize their own mothers by the mothers’ scents (as previously described), pups made acutely anosmic would fail to show a comfort response. But anosmic pups
show comfort responses that are virtually unaffected by the loss of their capacity to recognize their mothers in this way (Hofer & Shair, 1991). Instead, Harry Shair, Susan Brunelli, and I have found multiple regulators of infant USV within the contact between mother and pup: warmth, tactile stimuli, and milk as well as her scent (Hofer, 1996). Provision of stimulation in these modalities separately (e.g., artificial fur lacking warmth or scent) and then progressively combining modalities elicited graded responses. The full “comfort” quieting response was elicited only when all modalities were presented together, and maximum calling rates occurred when all were withdrawn at once. In essence, we found parallel regulatory systems involving different sensory modalities (see Figure 2.6). These functioned in a cumulative or additive way, with the rate of infant calling reflecting the sum total of effective regulatory stimuli present at any given point in time. These processes at the behavioral/biological level underlie the psychological concept of separation anxiety. They do not supplant it. They operate at a different level of organization; the psychological concepts can be thought of as emerging from the lower-level component processes.

Figure 2.6 The regulatory control system for infant ultrasonic vocalization. Note. BZi = benzodiazepine inverse agonist; CRH = corticotropin-releasing hormone; DA = dopamine; 5-HT = 5-hydroxytryptamine (serotonin); GABA = gamma-amino butyric acid; NA = noradrenaline; NMDA = N-methyl-D-aspartic acid; OT = oxytocin; VP = vasopressin. Moving counterclockwise from the far right in the diagram, interactions of a rat pup with its mother (proximity, warmth, etc.) act over multiple pathways (olfactory, thermal, etc.) to activate infant sensory systems (center top) and then brain neurotransmitter receptor systems (in box on the left). Those receptors in the (+) column, when activated, increase calling rate (ultrasonic vocalization), and those in the (–) column suppress calling. Maternal separation (lower center) results in a rapid burst of calling that gradually subsides. Continuing counterclockwise from the center of the diagram, ultrasonic vocalization by the pup stimulates maternal retrieval, licking, and so on, reinstating contact and maternal interactions (far right), thus closing the circle.

Searching for What Was Lost

Experiments in our laboratory have shown that infant rats also have more complex and lasting responses to maternal
separation, similar to those of primates in a number of different physiological and behavioral systems. A number of years ago my colleagues and I found slower developing changes following maternal separation, similar to those of Bowlby’s (1969, pp. 27–28) “despair” phase (see Figure 2.7A). This was not an integrated psychophysiological response as Bowlby had supposed but the result of a novel mechanism (see Figure 2.7B). As separation continued beyond the initial vocal response, each of the individual systems of the infant rat responded to the loss of one or another of the components of the infant’s previous interaction with its mother. Providing one of these components to a separated pup (e.g., maternal warmth) maintained the level of brain biogenic amine function underlying the pup’s general activity level for up to 3 days (Stone, Bonnet, & Hofer, 1976) but had no effect on other systems. For example, the pup’s cardiac rate continued to fall to 60% of its normal level over 24 hours regardless of whether supplemental heat was provided (Hofer, 1971). We found that the heart rate, normally maintained by sympathetic autonomic tone, was regulated by provision of milk acting on receptors in the lining of the pup’s stomach (Hofer & Weiner, 1975). With loss of the maternal milk supply, sympathetic tone fell and cardiac rate was reduced by 40% in 12 to 18 hours. By studying a number of additional systems—such as those controlling sleep–wake states, activity level, sucking pattern, and blood pressure (Brake et al., 1982; Hofer, 1975,
Figure 2.7 Two schematic representations of the dynamics of early separation, shown as panels (A) and (B); labels in the figure: Growth Hormone, Activity, Heart Rate, REM Sleep. Note. A: as conceptualized in the framework of attachment theory by John Bowlby and B: as found to result from loss of regulatory interactions within the mother–infant relationship.
1976; Shair, Brunelli, & Hofer, 1983)—we found different components of the mother–infant interaction such as olfaction, taste, touch, warmth, and texture that normally either upregulated or downregulated each of these functions (or, in the case of sleep states, regulated rhythmic patterning). Thus, we concluded that in maternal separation, all of these regulatory components of the mother–infant interaction are withdrawn at once (see Figure 2.7). This widespread loss creates a pattern of increases or decreases in level of function of the infant’s systems, depending on whether the particular system had been up- or downregulated prior to separation by specific components of the previous mother–infant interaction. We called these components hidden regulators because they were not evident when one simply observed the ongoing mother–infant relationship. These studies revealed a novel mechanism for the infant response to separation, an event that previously could be
conceptualized only at the psychological level, as an emotional response. In younger infants this was conceived to be the result of rupture of the bond between mother and infant, and in older children the stress resulting from perception of the loss of the mother’s emotional, physical, and nutritional support. At the biological level of organization we found that separation resulted in the withdrawal or loss of a number of different physical interactions between mother and infant that normally regulated infant physiology and behavior on an ongoing basis. Thus, we found that infant and mother compose a partial fusion of two individual homeostatic units into a single common homeostatic organization. When separated, each returns to its individual set points at lower or higher levels of function. The regulation of maternal milk letdown by infant sucking is a well-known example of a hidden regulatory system present in all adult female mammals. When the suckling interaction
is interrupted, the mother’s milk production and periodic letdown become greatly reduced. The observations of psychologists on the evident changes in mental state of infants during separation and the psychological constructs used to explain them remain valid and useful. However, the processes at the biological level described here can enlarge psychological understanding and support newer psychological concepts of emotional and perceptual regulation. It is not that rat pups respond to loss of regulatory processes, whereas humans respond to emotions of love, sadness, anger, and grief. Human infants, as they mature, can respond at the level of complex affective responses and symbols as well as at the level of regulatory interactions. And rat pups respond at the level of an affective state expressed through separation calling as described previously. Even adult humans continue to respond in important ways at the sensorimotor–physiologic level in their social interactions (Hofer, 1984; Stern & McClintock, 1998). Examples include the role of social interactions in entraining sleep–wake and menstrual rhythms, the disorganizing effects of sensory deprivation, and the remarkable effects of social support on the course of medical illness. This extended homeostatic system of mother and infant represents an aspect of mammalian early development similar to the intertwined nature of the fetus and mother, essentially an extension of the symbiosis recognized in the intrauterine period of development. By applying the concept of developmental selection, it is evident that these evolved adaptations function to provide a living environment that is inherited from one generation to the next to serve as a guiding matrix, enabling a specific early developmental course to be repeated (see Figure 2.8). Variations in maternal behavior are capable of producing variations in infant physiology and behavior through the regulatory processes described.
These variations can be immediately adaptive. For example, the major thermoregulatory, cardiovascular, and behavioral responses of pups separated for a period of hours slow metabolism, shunt blood away from the digestive tract to the heart and brain, and profoundly reduce spontaneous behavior, inducing a hibernation-like state that promotes survival until the mother returns. Furthermore, variations in infant development that are repeated over generations (e.g., because of increased maternal foraging demands during climate change creating long absences) become targets for selection and can become increasingly efficient and finally genetically fixed, producing an alternative infant phenotype. In this way one can better understand why these widespread and complex regulatory processes evolved to become a feature of mammalian early development, serving the developmental functions of inheritance and variation as well as construction and adaptation.
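The hidden-regulator account developed in this section can be made concrete with a small sketch. It is only an illustration under stated assumptions: the chapter reports that the quieting effects of warmth, touch, milk, and scent add up, and that each physiological system shifts according to which maternal component normally regulated it, but it gives no numeric weights. The weights, the maximum calling rate, and the names `calling_rate` and `separation_shift` are hypothetical, and only the heart rate entry (held up by milk) takes its direction from the text.

```python
# Assumed quieting weights in percent; the text reports additivity, not values.
QUIETING_WEIGHTS = {"warmth": 30, "touch": 30, "milk": 20, "scent": 20}
MAX_CALLS_PER_MIN = 100  # assumed rate when every regulator is withdrawn


def calling_rate(present):
    """Calling rate falls by the summed weight of the regulators present."""
    suppression = sum(QUIETING_WEIGHTS[m] for m in present)
    return MAX_CALLS_PER_MIN * (100 - suppression) // 100


# system -> (regulating maternal component, direction of regulation)
# +1: the component normally holds the system up; -1: holds it down.
HIDDEN_REGULATORS = {
    "heart_rate": ("milk", +1),               # direction taken from the text
    "activity": ("warmth", +1),               # direction assumed for illustration
    "example_downregulated": ("scent", -1),   # hypothetical placeholder system
}


def separation_shift(system, present):
    """Return -1 (falls), +1 (rises), or 0 (unchanged) for one system."""
    component, direction = HIDDEN_REGULATORS[system]
    if component in present:
        return 0           # its own regulator is still supplied
    return -direction      # regulator withdrawn: drift opposite to its effect


if __name__ == "__main__":
    print(calling_rate({"touch"}))                    # one modality only
    print(separation_shift("heart_rate", {"warmth"}))  # warmth does not rescue it
```

The design point is that each system is keyed to its own regulator, so supplying warmth alone leaves heart rate falling, which mirrors the dissociation reported above.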
Hidden Regulators of Early Development

Other investigators, using this approach, have since discovered other maternal regulatory systems of the same sort. For example, Saul Schanberg, Cynthia Kuhn, and colleagues found that separation of the dam from rat pups produced a rapid (30-min) fall in the pup’s growth hormone levels, and vigorous tactile stroking of maternally separated pups (mimicking maternal licking) prevented the fall in growth hormone (Kuhn & Schanberg, 1991). Brain substrates for this effect were then investigated, and it now appears that growth hormone levels are normally maintained by maternal licking, acting through serotonin (5-HT) 2A and 2C receptor modulation of the balance between growth-hormone-releasing factor and somatostatin that together act on the anterior pituitary release of growth hormone (Katz, Nathan, Kuhn, & Schanberg, 1996). The withdrawal of maternal licking by separation allows growth-hormone-releasing factor to fall and somatostatin to rise, resulting in a precipitous fall in growth hormone and cessation of growth processes generally for several days. However, a parallel process in prematurely born human infants showed longer-term regulation. There are several biological similarities between this maternal deprivation effect in rats and the growth retardation that occurs in some variants of human reactive attachment disorders of infancy. Applying this new knowledge about the regulation of growth hormone to low-birth-weight, prematurely born babies, Tiffany Field and coworkers (1986) joined the Schanberg group. They used a combination of stroking and limb movement, administered 3 times a day for 15 min each time and continuing throughout the babies’ 2 weeks of hospitalization.
This intervention increased weight gain, head circumference, and behavior development test scores in relation to a randomly chosen control group, with earlier discharge from the intensive care unit and other enhanced maturational effects discernible 6 months later. Clearly, early regulators are also effective in humans, and over time periods as long as weeks to months. As experimenters began to realize that infants’ separation responses revealed a network of individual regulatory processes within the mother–infant interaction, an important implication of this finding emerged: These ongoing regulatory interactions could, over the long term, act to shape the development of an infant’s brain and behavior throughout the preweaning period when mother and infant remained in close proximity. And when maternal behavior changed in response to changes in her environment, this could change the course of her offspring’s development. We could now think of the mother–infant interaction as a long-term regulator of development, with variations in
the intensity and patterning of mother–infant interactions gradually shaping the development of behavior and physiology. These processes go far beyond the adaptive evolutionary role of attachment as a protection against predators proposed by Bowlby (1969). They also go beyond the role described in the previous section of a flexible adaptive response of the infant to the environmental conditions that caused maternal separation. Here, development is regulated along a different path for a substantial period of time. With the mother absent for even longer time periods, perhaps never to return, and without another effective source of regular stimulation and caretaking, an overall slowing of growth occurs to the point that a smaller adolescent and adult with lower nutritional needs will develop. Subsequent research has shown long-term effects on offspring even from briefer and less-extreme mother–infant experiences.
LONG-TERM MATERNAL REGULATION OF DEVELOPMENT

In clinical work and in attachment theory, the psychological construct of an enduring mental representation is generally used to denote an internal working model that is formed early in the infant’s developing mind, through his or her particular interaction with parents or consistent caretakers. This conceptual model helps to organize and explain how the nature of people’s later relationships and responses to stress seem to be shaped by their experiences as infants and children and even transmitted to their offspring in the next generation through particular patterns of mothering behavior expressive of the mother’s mental representation of how mothers behave toward their infants. In research with a simple mammalian working model system, we found evidence of similar long-lasting and transgenerational effects that can be attributed to the processes of the maternal regulatory interactions described previously. For example, Sigurd Ackerman, Herbert Weiner, and I found that permanent maternal separation of juvenile rat pups early in the weaning period (early weaning) produced a greatly increased vulnerability to stress-produced gastric ulcer in adolescence and early adulthood (Skolnick, Ackerman, Hofer, & Weiner, 1980). To our great surprise, we found that this effect persisted in the next generation in the normally reared offspring of mothers that had been separated early as infants. We also found that, as adult mothers, early-weaned females spent less time and interacted less with their infants. However, a cross-fostering study revealed that this was not the result of the reduced maternal behaviors of the early-weaned mothers. For if the pups of normally reared mothers were substituted for the
early-weaned mother’s own at birth, these cross-fostered pups were not vulnerable as adolescents. Instead, the pups born to early-weaned mothers but reared by normally reared mothers inherited the vulnerability. This suggested an unknown intrauterine or germ cell line transmission mechanism. In a different study on the development of hypertension in a strain of rat that had been selected over generations for high blood pressure (SHR), Michael Myers, Susan Brunelli, and I found that naturally occurring variation in 3 out of the 12 maternal behaviors we observed in the hypertensive and control strains (WKY) of inbred (genetically homogeneous) rats did indeed appear to regulate the development of the physiological trait in the next generation: the levels of blood pressure in their offspring. Because the animals in each strain were genetically identical, and because the levels of maternal behaviors were significantly correlated with the magnitude of effect on adult offspring blood pressure both within and between strains, it seemed very likely that it was the variation in maternal behaviors acting on the pups that produced these long-term effects (Myers, Brunelli, Squire, Shindeldecker, & Hofer, 1989). The findings of these two studies led us to realize that both the hidden postnatal maternal regulators that we had discovered in our separation studies and a different class of early-weaning-induced prenatal or germ line influences could have long-term regulatory effects on later development, even into the next generation. In their implications for human development, they appeared to represent a level of biological developmental processes that underlie and coexist with the psychological processes inferred to mediate the lasting effects clinically observed between early relationships and their later mental representations in patients (Hofer, 1980, 2005).
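The inference drawn from the early-weaning cross-fostering study above can be laid out as a small decision rule. This is a sketch of the logic only, not the authors' analysis: the group labels, the name `transmission_route`, and the yes/no coding of vulnerability are assumptions made for illustration, and no measurements from the study are reproduced.

```python
# Keys are (birth mother, rearing mother); "EW" = early-weaned, "N" = normally
# reared. Values say whether the offspring were ulcer-vulnerable as adolescents,
# coded as a simple yes/no from the qualitative description in the text.
REPORTED_OUTCOMES = {
    ("EW", "EW"): True,   # offspring reared by their own early-weaned mothers
    ("N", "N"): False,    # normally reared controls
    ("N", "EW"): False,   # normal pups fostered to early-weaned mothers
    ("EW", "N"): True,    # pups of early-weaned mothers fostered to normal mothers
}


def transmission_route(outcomes):
    """Decide whether vulnerability tracks the birth mother or the rearing mother."""
    follows_birth = all(
        vulnerable == (birth == "EW") for (birth, _), vulnerable in outcomes.items()
    )
    follows_rearing = all(
        vulnerable == (rearing == "EW") for (_, rearing), vulnerable in outcomes.items()
    )
    if follows_birth and not follows_rearing:
        return "prenatal or germ line"
    if follows_rearing and not follows_birth:
        return "postnatal maternal care"
    return "undetermined"


if __name__ == "__main__":
    print(transmission_route(REPORTED_OUTCOMES))
```

The two fostered groups carry the inference: only in them do birth mother and rearing mother differ, so only they can separate a prenatal route from a postnatal one.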
However, the biology of these effects was most unusual and difficult to explain in terms of the familiar processes studied in developmental psychobiology at the time. In addition, the evolutionary basis for such long-term developmental effects was unclear. Now, with a new understanding of the role of development in evolution, one can begin to make some sense of them (see Figure 2.8). The concept of developmental selection states that developmental processes have been selected, in part, to function as creators of potentially useful variation and to provide developmental mechanisms of inheritance necessary for selection of the most successful of these variations. The regulatory processes hidden within early mother–infant interactions have adaptive value for the infant within the unique environment of young mammals, but they also can function to regulate the early course of development. Because development consists of a series of cascades of gene regulation patterns, an early diversion is capable of setting the pattern of downstream
regulation onto different paths. This can extend the early effects of variations in maternal behavior on infants into adolescence and even adulthood. Moreover, because some of the neural substrates for adult behavior, such as maternal behavior, are present even in infancy, early gene expression in these neural precursors can likewise be modified by maternal regulation, with the result that when the systems mature, their patterns of function will be different. In the case of maternal behavior, this inherited environment can result in transmission (inheritance) of an early experience effect from one generation to the next.

Figure 2.8 Overlapping life cycles provide a template for inheritance and a matrix for creating variation throughout early development in humans and other mammals. Note. In a single generation, the origin of germ cells (lower left) takes place during the early embryonic period of the mother while she is in the grandmother’s womb. A number of years later, these germ cells unite at conception, and development continues within the mother’s womb and throughout a long period of postnatal proximity and interactions with the mother, grandparents, and other caretakers. Development has been shown to be affected by such extended influences as the age of the father prior to conception and the proximity of grandparents.

As evolution continues, if an environmental or social change persists over several generations and can be transmitted across generations by maternally driven developmental processes, then this developmental variation will gradually become genetically modified by repeated selection. Variants in the structure of genes (alleles) responsible for expression of the transcription factors that activate the cascades of gene expression patterns will be selected to the extent that survival and reproductive success are enhanced in these variant individuals. In this way, new and even more adaptive developmental paths will be gradually created over generations because of this “genetic accommodation”
of developmentally created novelties. Often an environmental or maternal behavior variable will be maintained by selection as a “trigger” for one or another developmental path, creating alternative phenotypes to be expressed by a single genotype, greatly enhancing the range of adaptability of the evolving species. The plausibility of this developmental–evolutionary scenario is supported by the widespread existence of alternative developmental phenotypes (reviewed in West-Eberhard, 2003) in organisms ranging from phage viruses to humans. In our experiments described above, we revealed the existence of such alternative developmental paths. In the early weaning experiments, we found that the susceptibility to gastric erosions was greatly increased in adolescents and young adults, but older adults were actually less vulnerable than normally reared animals. The whole life trajectory of the trait had been shifted by the social/environmental trigger of early weaning. Physiologically, the response to early weaning may have been adaptive. Instead of sleeping throughout most of the 24-hr immobilization stress used to induce gastric ulcer as the normally weaned animals did, the early-weaned juveniles remained awake. This altered response could prevent them from
being surprised by predators in the absence of protection by their mother. The physiological price paid was the risk of gastric ulceration if immobilization was prolonged. Early weaning seems to have also produced long-term changes that altered the uterine environment mothers provided to their young (possibly a hormonal or epigenetic change) that induced the alternative developmental phenotype in their fetuses, through a different mechanism, as a “predictive” response pre-adapting offspring for the expected environment in the next generation. Here, as in the section on maternal regulation of early development, the mother is revealed as a potential transducer of environmental change effects on future generations. Changes in the environment have long been demonstrated to affect maternal behavior, and it is highly adaptive if their young can begin to adapt to changed conditions early in their development (enabling them to become effectively preadapted as adults). Such preadaptations have been studied for several decades by ecologists and evolutionary biologists and are referred to simply as maternal effects (Mousseau & Fox, 1998). A well-known example in insects is that the location in which the female lays her eggs differs according to the season. Females lay their eggs in warmer locations in response to changes in the light in the fall. This reduces the time required before hatching and thus increases survival of the young before the onset of winter. Epigenetic Mechanisms of Long-Term Maternal Regulation as Preadaptation The work of Michael Meaney and his colleagues over the past decade has greatly increased understanding of the biological processes at work in these lasting effects of early relationships, as described in Cameron et al. (2005). 
These researchers discovered that normal variation among mothers in a colony in the same maternal behaviors implicated in our earlier studies (Myers et al., 1989) of offspring blood pressure development (the level of maternal licking and grooming of pups) systematically modified (regulated) the development of a number of different traits in adult offspring (i.e., adrenocortical stress response, behavioral fear response, measures of learning, and sexual and maternal behavior). Furthermore, these different phenotypes can be produced in offspring as a result of adverse environmental events occurring in the parental generation, such as repeated immobilization of pregnant females. This stress was shown to affect maternal behavior, resulting in physiological and behavioral changes in offspring that preadapted them to more challenging environments (Champagne & Meaney, 2006). The cellular/molecular mechanisms mediating these complex transgenerational effects of variations in maternal
care have been analyzed in detail by the Meaney group (Weaver et al., 2007). These studies have revealed a novel layer of long-term epigenetic modification of gene expression in offspring caused by maternal care differences acting on the molecular configuration of the chromatin matrix surrounding certain specific genes in the offspring, permanently altering their expression levels and causing long-term effects.

Revealing the Structure of Alternative Developmental Pathways

In all of the studies just described, alternative developmental pathways appear to be potentially available within the genetic potential of a given strain of rat. One or another of these can be expressed, depending on certain specific social/environmental eliciting or triggering experiences. It occurred to Susan Brunelli and me that such alternative developmental paths or trajectories might also be revealed by repeated selection for high levels of a particular trait in infancy. If the trait were part of a genetically organized and extended developmental pathway, the later stages of that pathway, and any associated traits, should be revealed in adults of the selected line after a number of generations of repeated selection based on the level of the infantile trait. The trait we chose was the infant separation call described previously (see “The Separation Cry”), and we picked a differential selection procedure for the laboratory that was based on certain evolutionary considerations. It is known that sensory and perceptual adaptations to infant USV have evolved in female rodents, for example, an auditory frequency response threshold tuned to the exact frequency of infant USV (45 kHz) that enables the mother’s sensitive and specific search, retrieval, and caregiving responses (Ehret, 1992). But the infant’s isolation call can also be used by predators to locate the infant. Not surprisingly, predator odors dramatically suppress USV in isolated pups, a specific fear response (Takahashi, 1992).
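The two opposing pressures on the isolation call noted above, retrieval by the mother and detection by predators, can be sketched as a toy cost-benefit comparison. The linear form, the numbers, and the names `relative_fitness` and `favored_strategy` are assumptions made for illustration; the chapter describes the pressures only qualitatively.

```python
def relative_fitness(call_rate, displacement_risk, predator_density):
    """Toy score: calling helps retrieval and also attracts predators.

    Both terms are assumed to scale linearly with calling rate; the units
    are arbitrary and are not taken from the chapter.
    """
    retrieval_benefit = displacement_risk * call_rate
    predation_cost = predator_density * call_rate
    return retrieval_benefit - predation_cost


def favored_strategy(displacement_risk, predator_density, low=10, high=100):
    """Compare a low-calling and a high-calling infant in one environment."""
    low_score = relative_fitness(low, displacement_risk, predator_density)
    high_score = relative_fitness(high, displacement_risk, predator_density)
    if high_score > low_score:
        return "high"
    if low_score > high_score:
        return "low"
    return "neutral"


if __name__ == "__main__":
    # Frequent nest disruption, few predators
    print(favored_strategy(displacement_risk=5, predator_density=1))
    # Rare nest disruption, many predators
    print(favored_strategy(displacement_risk=1, predator_density=5))
```

Under these assumptions the favored calling level depends only on which pressure is larger in a given environment, which is the sense in which the call can be pushed in either direction by selection.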
The evolution of the infant separation cry has involved what is known as an evolutionary trade-off, a ratio of risk to benefit that is thought to have shaped many behaviors. In this case, the theory predicts that in environments with many predators, infants that show less separation-induced vocalization will gradually increase in the population. However, when nest disruption and scattering of pups occurs frequently (e.g., through flooding) and fewer predators exist, high rates of isolation calling would be advantageous. To explore some of these hypothetical evolutionary processes and the role of development in them, we have been conducting an experimental model of evolution in the laboratory. We selectively bred adult rats that had shown relatively high or relatively low rates of USV responses to
separation as 10-day-old infants (Brunelli & Hofer, 2007). We found that in as few as five generations, two distinct lines emerged that differed widely on this infantile trait. Cross-fostering showed no evidence that being reared from birth by a mother from the other line changed the level of isolation calling (Brunelli, Vinocur, Soo-Hoo, & Hofer, 1997). Clearly, the selected trait has a strong genetic basis. But because there have been no systematic studies of selective breeding for an infant behavior trait (that we have been able to discover), we did not know how repeated selection might affect the developmental course of separation calling rates in descendents. Furthermore, we wondered whether other traits related to vocalization or to an underlying early anxiety state might also be affected through their genetic, physiological, or behavioral links to the systems controlling the infants’ vocal response to separation. We found that infants’ calling during isolation was elevated in high-line pups over a randomly bred control line starting as early as the response develops at 3 days of age, whereas low-line pups had already decreased their calling rates below controls at this early age (Hofer, Brunelli, & Shair, 2001). The greatest difference between high and low lines was at the age of repeated selection, 10 days postnatal. Response differences were much less evident at 14 days, and the three lines converged as the response ceased to occur in all weanling pups at 18 to 20 days postnatal. Thus, selection resulted in high-line pups showing a marked increase in the already high rates of newborn pups and maintaining this level up to and including the age of selection. However, the low-line pups showed the opposite, a more rapid decline than normal from 3 to 10 days of age. In short, selection at 10 days of age appeared to be acting on the whole developmental trajectory of the vocal response to separation, shifting it in time: either delaying
or hastening the normal gradual decline in isolation calling with age. In more recent studies, as summarized in Figure 2.9, Brunelli found a number of other behaviors and physiological responses at different ages in the high and low lines that had been altered by selection at 10 days of age (Brunelli & Hofer, 2007). Those differences appeared to form two coherent groups of traits. In high-line juveniles tested in isolation at 18 days (when isolation calling no longer occurs), both defecation/urination and (sympathetically mediated) heart rate acceleration were greater than in controls. As adolescents, rough-and-tumble play behavior and the short high-frequency vocalizations that accompany these interactions were reduced in the high line compared with randomly bred controls in the first few play bouts. High-line adults were significantly slower to emerge into an open test arena and avoided the center region more completely than the low lines. In addition, high-line adults showed a much more passive response to the Porsolt swim test, a pattern associated with depression-like states in rodents. The low-line rats, as they developed into juveniles, showed the greatest heart rate acceleration of the three lines during isolation testing and a much-delayed return to baseline due to major vagal withdrawal. Low-line adolescents were deficient on all play behaviors on all days of testing and emitted the fewest play calls. As adults, the low-line animals were quicker to emerge into the open area, explored its center more than the highs, and were more active in the swim test. When confronted with an unfamiliar male, 70% of low-line males engaged in aggressive behavior compared to 30% of randomly bred controls. These groups of traits suggest a characterization of the high line as anxious and passive, whereas the lows were exploratory, active, and aggressive. Apparently, selection
Figure 2.9 Summary of developmental effects created in two lines of rats by repeated selection, over more than 25 generations, for an infant anxiety trait: either high or low levels of isolation-induced ultrasonic vocalization (USV).
Note. “Depression-like” and “anxiety-like” refer to behavioral responses observed in tests that have been validated pharmacologically and that are widely used as representing close equivalents in animals to these two human affective states. Plus signs (+) denote increased levels of behaviors, and minus signs (–) denote decreased levels.
for high and low rates of separation calling in infancy, as would occur during evolution under different ecological conditions, selects for a lifelong developmental path involving associated traits. Two alternative phenotypes, or temperaments (as illustrated in Figure 2.9; see also Volume 1, Chapter 15), were created by repeated selection acting at a single time point early in development, suggesting that a genetic potential exists for more than one organized developmental path. A pup could be set on one or another path when alleles for associated traits that enhanced one or another level of the infantile separation calling trait are gradually accumulated by repeated selection. These results resemble the studies from the Meaney group (reviewed in Cameron et al., 2005) in which early environmental change and different maternal behavior patterns had long-term effects on specific patterns of associated traits in adult offspring. Thus, the developmental structures for two or more behavioral phenotypes with adaptive potential are present and can be realized either through selection over 15 or 20 generations, or in a more rapid but shorter-lived transgenerational response to certain types of stress from the parental to the offspring generations.
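As a rough illustration of how repeated selection in opposite directions can separate two lines within a few generations, the sketch below applies the breeder's equation from quantitative genetics, R = h²S (response to selection equals heritability times the selection differential). It is a minimal sketch under stated assumptions, not the analysis used in the studies cited above: the function name, starting mean, heritability, selection intensity, and phenotypic standard deviation are all hypothetical values chosen for illustration, and the model ignores drift, changes in variance, and correlated traits.

```python
def simulate_bidirectional_selection(start_mean, heritability, intensity,
                                     phenotypic_sd, generations):
    """Return per-generation trait means for a high line and a low line.

    Uses the breeder's equation R = h^2 * S, with the selection
    differential approximated as S = intensity * phenotypic_sd.
    All inputs are hypothetical illustration values.
    """
    response = heritability * intensity * phenotypic_sd  # change in mean per generation
    high_line = [start_mean]
    low_line = [start_mean]
    for _ in range(generations):
        high_line.append(high_line[-1] + response)
        # A calling rate cannot be negative, so the low line is floored at zero.
        low_line.append(max(0.0, low_line[-1] - response))
    return high_line, low_line


# Hypothetical numbers: mean of 100 calls per test, h^2 = 0.3, intensity = 1.0, SD = 40 calls.
high, low = simulate_bidirectional_selection(100.0, 0.3, 1.0, 40.0, generations=5)
for generation, (high_mean, low_mean) in enumerate(zip(high, low)):
    print(f"generation {generation}: high line {high_mean:.0f}, low line {low_mean:.0f}")
```

Under these made-up inputs the gap between the two line means widens by twice the per-generation response in every generation until the low line reaches zero. Real selection lines would be expected to depart from this idealized linear pattern.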
SUMMARY
Scientists’ understanding of development has reached a new phase with the rapid advances in molecular genetics that for the first time allow them to see in detail how development takes place through changes in gene expression over time. These advances have led to an integration of evolutionary and developmental biologies (evo-devo) based in large part on the insight that novel forms are generated in evolution as much through variation in the regulation of genes during development as by changes in gene frequency in populations through natural selection. It is generally agreed that there is still no general theory of development comparable to evolutionary theory (Bateson et al., 2007), but scientists do have a new understanding of the relationship between those two great historical processes of biology. Multicellular development has its origin in the Cambrian explosion of major animal groups (phyla) appearing suddenly in the fossil record 500 million years ago. During and after this major transition in evolution, developmental processes were selected that promoted the construction of larger and more complex organisms capable of inhabiting and exploiting new ecological niches. But these construction processes were also shaped by selection for their capacity to create ontogenetic adaptations that enabled survival of early developmental forms in their own unique and transient environments. Furthermore, developmental
processes that enabled novel and potentially useful variations in developmental paths, and that enabled the re-creation of successful paths in the next generation, were also selected for their capacity to facilitate the evolution of successful multicellular forms. Identifying these evolutionary selection pressures for construction, ontogenetic adaptation, variation, and inheritance (developmental selection) provides an understanding of the functions being carried out by developmental processes, similar to the understanding gained through identifying the evolutionary functions of stress responses or of social behaviors. In this chapter, I have used the principles of developmental selection in discussing what has been learned recently about the development of early attachment: its prenatal origins, approach and contact maintenance systems, learning of specific maternal discrimination and preference, novel mechanisms for separation responses, and hidden regulators of early development. Finally, I have described the evolution and development of long-term effects of early attachment and the resulting role of mother–infant interaction in the shaping of alternative pathways of development and in the formation of temperament. Another theme of this chapter has been the concept of levels of organization, a way of thinking that is extremely useful in trying to understand how events and processes that are studied by neuroscientists in cells and in brain circuits are related to the observed behavior and inner experiences studied by psychologists. I have tried to illustrate how psychological constructs such as the bond, emotional responses to separation, mental representations, and long-term developmental effects deriving from parent–infant interactions can be better understood by learning more about the component neurobiological processes that underlie and have given rise to the psychological constructs and theory.
Both the evolutionary developmental approach and the research revealing the component developmental processes of attachment are in the early stages of their own development; there is a great deal still to learn about them. Perhaps the most important gap in this understanding of development is in the area of the self-organizing processes that hold together and guide the seemingly endless series of cascades and networks of gene regulation that underlie development at its deepest level. This “construction” component of development not only embodies as-yet unknown principles of self-organization but is clearly supported, directed, and organized also from the “outside,” by the environment inherited from the previous generation, in ways scientists are only beginning to understand. This environmental matrix begins with the early life of germ cells developing within the parents when they are still embryos, and with the signal proteins and transcription
factors formed in the maternal egg prior to fertilization. It continues through the long-term maternal effects exerted by the symbiotic relationship of the embryonic, fetal, and infant stages of development all the way to the “given” properties of the outside environment within which the young must grow up. In addition to these general areas in which scientists know far too little, specific new biological processes are being discovered that experts never knew existed, such as the controls over gene regulation maintained within the genome itself (e.g., the intricate regulation of DNA transcription and translation by the protein “skeleton” of chromatin). Lastly, many mechanisms for variation have evolved within all of these developmental processes as a result of their having enhanced the opportunities for selection. These processes for facilitating variation are just beginning to be studied. The perspective I see is that there is no end of exciting opportunities for research, that the approaches for investigating developmental neuroscience suggested in this chapter are only a beginning, and that these approaches will soon be modified in interesting ways, perhaps by some of the readers of this chapter.
REFERENCES
Alberts, J. R., & May, B. (1984). Nonnutritive, thermotactile induction of filial huddling in rat pups. Developmental Psychobiology, 17, 161–181. Bateson, P., Hofer, M., Oppenheim, R., & Wiedenmayer, C. (2007). Developing a framework for development: A discussion. Developmental Psychobiology, 49, 77–86. Baylin, S. B., & Schuebel, K. E. (2007, August 2). Genomic biology: The epigenomic era opens. Nature, 448, 548–549. Blass, E. M. (1990). Suckling: Determinants, changes, mechanisms, and lasting impressions. Developmental Psychology, 26(4), 520–533. Bonner, J. T. (1993). Life cycles: Reflections of an evolutionary biologist. Princeton, NJ: Princeton University Press. Bowlby, J. (1969). Attachment: Attachment and loss (Vol. 1). New York: Basic Books. Bowlby, J. (1982). Attachment and loss: Retrospect and prospect. American Journal of Orthopsychiatry, 52, 664–678. Brake, S. C., Sager, D. J., Sullivan, R., & Hofer, M. (1982). The role of intraoral and gastrointestinal cues in the control of sucking and milk consumption in rat pups. Developmental Psychobiology, 15, 529–541. Brennen, P. A., & Kaverne, E. B. (1997). Neural mechanisms of mammalian olfactory learning. Progress in Neurobiology, 51, 457–451. Brunelli, S. A., & Hofer, M. A. (2001). Selective breeding for an infantile phenotype (isolation calling): A window on developmental processes. In E. Blass (Ed.), Handbook of behavioral neurobiology (pp. 433–482). New York: Plenum Press. Brunelli, S. A., & Hofer, M. A. (2007). Selective breeding for infant rat separation-induced ultrasonic vocalizations: Developmental precursors of passive and active coping styles. Behavioural Brain Research, 182, 193–207.
Brunelli, S. A., Vinocur, D. D., Soo-Hoo, D., & Hofer, M. A. (1997). Five generations of selective breeding for ultrasonic vocalization (USV) responses in N:NIH strain rats. Developmental Psychobiology, 31, 255–265. Cameron, N., Parent, C., Champagne, F., Fish, E., Kuroda, K., & Meaney, M. (2005). The programming of individual differences in defensive responses and reproductive strategies in the rat through variations in maternal care. Neuroscience and Biobehavioral Review, 29, 843–865. Camp, L. L., & Rudy, J. W. (1988). Changes in the categorization of appetitive and aversive events during postnatal development of the rat. Developmental Psychobiology, 21, 25–42. Carroll, R. (2002). Evolution of the capacity to evolve. Journal of Evolutionary Biology, 15, 911–921. Carroll, S. (2005). Endless forms most beautiful: The new science of evo-devo. New York: Norton. Champagne, F. A., & Meaney, M. J. (2006). Stress during gestation alters postpartum maternal care and the development of the offspring in a rodent model. Bio Psychiatry, 59, 1227–1235 De Casper, A. J., & Fifer, W. P. (1980, June 6). Of human bonding: Newborns prefer their mothers’ voices. Science, 208, 1174–1176. De Casper, A. J., & Spence, M. J. (1988). Prenatal maternal speech influences a newborn’s perception of speech sounds. Infant Behavior and Development, 9, 133–150. Ehret, G. (1992). Preadaptations in the auditory system for mammals for phoneme perception. In M. E. H. Schouten (Ed.), The auditory processing of speech: From sounds to words (pp. 99–112). Berlin, Germany: Mouton de Gruyter. Eidelman, A. I., & Kaitz, M. (1992). Olfactory recognition: A genetic or learned capacity? Journal of Developmental and Behavioral Pediatrics, 13(2), 126–127. Emerich, D. F., Scalzo, F. M., Enters, E. K., Spear, N. E., & Spear, L. P. (1985). Effects of 6-hydroxydopamine-induced catecholamine depletion on shock-precipitated wall climbing of infant rat pups. Developmental Psychobiology, 18, 215–227. Field, T. 
M., Schanberg, S. M., Scafidi, F., Bauer, C. R., Vega-Lahr, N., Garcia, R., et al. (1986). Tactile/kinesthetic stimulation effects on preterm neonates. Pediatrics, 77, 654–658. Fifer, W. P., & Moon, C. M. (1995). The effects of fetal experience with sound. In J. P. Lecanuet, W. P. Fifer, N. Krasnegor, & W. P. Smotherman (Eds.), Fetal development: A psychobiological perspective (pp. 351–368). Hillside, NJ: Erlbaum. Fillion, T. J., & Blass, E. M. (1986, February 14). Infantile experience with suckling odors determines adult sexual behavior in male rats. Science, 231, 729–731. Fleming, A. S., O’Day, D. H., & Kraemer, G. W. (1999). Neurobiology of mother-infant interactions: Experience and central nervous system plasticity across development and generations. Neuroscience and Biobehavioral Review, 23, 673–685. Freud, A., & Burlingham, D. (1943). War and children. New York: Medical War Books. Goodwin, G. A., & Barr, G. A. (1998). Behavioral and heart rate effects of infusing kainic acid into the dorsal midbrain during early development in the rat. Developmental Brain Research, 107(1), 11–20. Gould, S. J. (1989). A wonderful life. New York: Norton. Haeckel, E. (1892). The history of creation (Lankester, Trans.). In Natürliche Schöpfungsgeschichte (8th ed., Vol. 2, pp. 422–544). London: Kegan Paul, Trench, Trubner. Harlow, H. F. (1958). The Nature of Love. American Psychology, 12, 673–685. Hepper, P. G. (1987). The amniotic fluid: An important priming role in kin recognition. Animal Behaviour, 35, 1343–1346.
Hofer, M. A. (1971, June 4). Cardiac rate regulated by nutritional factor in young rats. Science, 172, 1039–1041. Hofer, M. A. (1975). Studies on how early maternal separation produces behavioral change in young rats. Psychosomatic Medicine, 37, 245–264. Hofer, M. A. (1976). The organization of sleep and wakefulness after maternal separation in young rats. Developmental Psychobiology, 9, 189–205. Hofer, M. A. (1980). Effects of reserpine and amphetamine on the development of hyperactivity in maternally deprived rat pups. Psychosomatic Medicine, 42, 513–520. Hofer, M. A. (1984). Relationships as regulators: A psychobiologic perspective on bereavement. Psychosomatic Medicine, 46, 183–197. Hofer, M. A. (1996). Multiple regulators of ultrasonic vocalization in the infant rat. Psychoneuroendocrinology, 21(2), 203–217. Hofer, M. A. (2005). The psychobiology of early attachment. Clinical Neuroscience Research, 4, 291–300.
Moriceau, S., Wilson, D. A., Levine, S., & Sullivan, R. M. (2006). Dual circuitry for odor-shock conditioning during infancy: Corticosterone switches between fear and attraction via amygdala. Journal of Neuroscience, 26, 6737–6748. Mousseau, T. A., & Fox, C. W. (1998). Maternal effects as adaptations. Oxford, England: Oxford University Press. Myers, M. M., Brunelli, S. A., Squire, J. M., Shindeldecker, R. D., & Hofer, M. A. (1989). Maternal behavior of SHR rats and its relationship to offspring blood pressures. Developmental Psychobiology, 22, 29–53. Newman, J. D. (Ed.). (1998). The physiological control of mammalian vocalization. New York: Plenum Press. Oppenheim, R. W. (1984). Ontogenetic adaptations in neural development: Towards a more “ecological” developmental psychobiology. In H. F. R. Prechtl (Ed.), Continuity of neural functions from prenatal to postnatal life (pp. 16–30). Philadelphia: Lippincott.
Hofer, M. A., Brunelli, S. A., & Shair, H. N. (2001). Developmental effects of selective breeding for an infantile trait: The rat pups ultrasonic isolation call (USV). Psychobiology, 39, 1–16.
Pedersen, P. E., Williams, C. L., & Blass, E. M. (1982). Activation and odor conditioning of suckling behavior in 3-day-old albino rats. Journal of Experimental Psychology: Animal Behavior Processes, 8, 329–341.
Hofer, M. A., & Shair, H. N. (1991). Trigeminal and olfactory pathways mediating isolation distress and companion comfort responses in rat pups. Behavioral Neuroscience, 105, 699–706.
Polan, H. J., & Hofer, M. A. (1998). Olfactory preference for mother over home nest shavings by newborn rats. Developmental Psychobiology, 33, 5–20.
Hofer, M. A., & Weiner, H. (1975). Physiological mechanisms for cardiac control by nutritional intake after early maternal separation in the young rat. Psychosomatic Medicine, 37, 8–24.
Polan, H. J., & Hofer, M. A. (1999). Maternally directed orienting behaviors of newborn rats. Developmental Psychobiology, 34, 269–279.
Kaitz, M., Lapidot, P., & Bronner, R. (1992). Parturient women can recognize their infants by touch. Developmental Psychobiology, 1, 35–39. Katz, L. M., Nathan, L., Kuhn, C. M., & Schanberg, S. M. (1996). Inhibition of GH in maternal separation may be mediated through altered serotonergic activity at 5-HT2A and 5-HT2C receptors. Psychoneuroendocrinology, 21(2), 219–235. Kirschner, M. W., & Gerhart, J. C. (2005). The plausibility of life: Resolving Darwin’s dilemma. New Haven, CT: Yale University Press. Knoll, A. H. (2003). Life on a young planet: The first three billion years of evolution on earth. Princeton, NJ: Princeton University Press.
Porter, R. H., Cernoch, J. M., & McLaughlin, F. J. (1983). Maternal recognition of neonates through olfactory cues. Physiology and Behavior, 30, 151–154. Robinson, S. R., & Smotherman, W. P. (1995). Habituation and classical conditioning in the rat fetus: Opioid involvements. In J.-P. Lecanuet, N. A. Krasnegor, W. P. Fifer, & W. P. Smotherman (Eds.), Fetal development: A psychobiological perspective (pp. 295–314). Hillside, NJ: Erlbaum. Rosenblum, L. A., & Moltz, H. (1983). Symbiosis in parent-offspring interactions. New York: Plenum Press.
Kraemer, G. W. (1992). A psychobiological theory of attachment. Behavioral and Brain Sciences, 15, 493–511.
Shear, M. K., Brunelli, S. R., & Hofer, M. A. (1983). The effects of maternal deprivation and of refeeding on the blood pressure of infant rats. Psychosomatic Medicine, 45, 3–9.
Kuhn, C. M., & Schanberg, S. M. (1991). Stimulation in infancy and brain development. In B. J. Carroll & J. E. Barrett (Eds.), Psychopathology and the brain (pp. 97–111). New York: Raven Press.
Skolnick, N. J., Ackerman, S. H., Hofer, M. A., & Weiner, H. (1980, June 6). Vertical transmission of acquired ulcer susceptibility in the rat. Science, 208, 1161–1163.
Lester, B. M., & Boukydis, C. F. (Eds.). (1985). Infant crying: Theoretical and research perspectives. New York: Plenum Press.
Smotherman, W. P., Bell, R. W., Hershberger, W. A., & Coover, G. D. (1978). Orientation to rat pup cues: Effects of maternal experiential history. Animal Behaviour, 26, 265–273.
Levy, F., Gervais, R., Kindermann, U., Orgeur, P., & Piketty, V. (1990). Importance of beta-noradrenergic receptors in the olfactory bulb of sheep for recognition of lambs. Behavioral Neuroscience, 104, 464–469. Lorenz, K. (1996). On aggression. New York: Harcourt, Brace & World. Makin, J. W., & Porter, R. H. (1989). Attractiveness of lactating females’ breast odors to neonates. Child Development, 60, 803–810.
Smotherman, W. P., & Robinson, S. R. (1986). Environmental determinants of behaviour in the rat fetus. Animal Behaviour, 34, 1859–1873. Stehouwer, D. J., & Campbell, B. A. (1978). Habituation of the forelimbwithdrawal response in neonatal rats. Journal of Experimental Psychology: Animal Behavior Processes, 4, 104–119. Stern, K., & McClintock, M. K. (1998, March 12). Regulation of ovulation by human pheromones. Nature, 392, 177–179.
Miczek, K. A., Tornatsky, W., & Vivian, J. (1991). Ethology and neuropharmacology: Rodent ultrasounds. In B. Oliver, J. Mos & J. L. Slanger (Eds.), Animal Models in Psychopharmacology (pp. 409–427). Basel, Switzerland: Birkhäuser.
Stone, E. A., Bonnet, K. A., & Hofer, M. A. (1976). Survival and development of maternally deprived rats: Role of body temperature. Psychosomatic Medicine, 38, 242–249.
Moffat, S. D., Suh, E. J., & Fleming, A. S. (1993). Noradrenergic involvement in the consolidation of maternal experience in postpartum rats. Physiology and Behavior, 53, 805–811.
Sullivan, R. M., Brake, S. C., Hofer, M. A., & Williams, C. L. (1986). Huddling and independent feeding of neonatal rats can be facilitated by a conditioned change in behavioral state. Developmental Psychobiology, 19, 625–635.
Moore, C. L., Jordan, L., & Wong, L. (1996). Early olfactory experience, novelty, and choice of sexual partner by male rats. Physiology and Behavior, 60, 1361–1367.
Sullivan, R. M., Hofer, M. A., & Brake, S. C. (1986). Olfactory-guided orientation in neonatal rats is enhanced by a conditioned change in behavioral state. Developmental Psychobiology, 19, 615–623.
Sullivan, R. M., Taborsky-Barba, S., Mendoza, R., Itano, A., Leon, M., Cotman, C. W., et al. (1991). Olfactory classical conditioning in neonates. Pediatrics, 87, 511–518.
Varendi, H., Porter, R. H., & Winberg, J. (1996). Attractiveness of amniotic fluid odor: Evidence of prenatal olfactory learning? Acta Paediatrica, 85, 1223–1227.
Sullivan, R. M., & Wilson, D. A. (1995). Dissociation of behavioral and neural correlates of early associative learning. Developmental Psychobiology, 28, 213–219.
Weaver, I. C. G., D’Alessio, A. C., Brown, S. E., Hellstrom, I. C., Dymov, S., Sharma, S., Szyf, M., & Meaney, M. J. (2007). The transcription factor nerve growth factor-inducible protein A, mediates epigenetic programming: Altering epigenetic marks by immediate-early genes. Journal of Neuroscience, 27, 1756–1768.
Sullivan, R. M., Wilson, D. A., Wong, R., Correa, A., & Leon, M. (1990). Modified behavioral and olfactory bulb responses to maternal odors in preweanling rats. Brain Research. Developmental Brain Research, 53, 243–247.
West-Eberhard, M. J. (2003). Developmental plasticity and evolution. Oxford, England: Oxford University Press.
Takahashi, L. K. (1992). Ontogeny of behavioral inhibition induced by unfamiliar adult male conspecifics in preweanling rats. Physiology and Behavior, 52, 493–498.
Wilson, D. A., & Sullivan, R. M. (1994). Neurobiology of associative learning in the neonate: Early olfactory learning. Behavioral and Neural Biology, 61, 1–18.
Chapter 3
Comparative Cognition and Neuroscience
CHARLES T. SNOWDON AND KATHERINE A. CRONIN
Developmental processes are typically viewed from the perspective of an individual’s history, but these processes can also be viewed from the broader historical context of evolution by natural selection. The study of the evolution of cognition and its neural correlates involves the use of a comparative method: the study of multiple species that differ in one factor that may influence the selective pressures on the trait of interest. Factors investigated may include phylogeny (e.g., birds vs. mammals, or apes vs. monkeys), social structure or social organization (e.g., multimale, multifemale groups vs. families), mating system (e.g., monogamy vs. polygamy), group size (e.g., pair with offspring vs. many individuals), foraging behavior (e.g., generalist vs. specialist, or clumped vs. distributed food resources), or ranging behavior (e.g., territorial vs. migratory). Thus, someone interested in the evolutionary origins of language might compare the abilities of humans with those of other great apes. Someone interested in spatial memory might compare two closely related species, one that stores food in many locations and recovers the food later and another that does not store food. The typical view of evolutionary processes is that they diverge. As new species form and become reproductively isolated, they diverge from one another. From this perspective, those species that have shared ancestry and are similar on most variables but differ on some critical one (e.g., monogamous vs. polygamous mating systems in some closely related rodent species) are the comparisons of greatest interest. However, an additional view of evolution is one of convergence (see Figure 3.1). Two quite distantly related species may face similar ecological problems and may independently reach similar solutions. One example is color vision, which has independently appeared in some, but not all, insects, fish, birds, and mammals. Presumably, in environmental conditions where color vision would benefit the reproductive success of individuals, individuals possessing mutations leading to color vision would be more likely to survive and would leave more surviving offspring with the color vision mutation. Many cognitive and behavioral phenomena appear as examples of convergent evolution: families as the basic reproductive unit in many birds, marmosets, tamarins, titi monkeys, and humans; enhanced spatial memory in seed-caching birds and rodents; vocal learning in birds, marine mammals, and humans, and so forth. In this chapter, we first discuss the methods and cautions of comparative studies and then focus on a selection of cognitive phenomena for which there are good comparative data and at least some information on the neural bases. We include social processes among the phenomena that we
[Figure 3.1: hypothetical phylogenetic tree of species A through E; axes are trait variation and time.]
Figure 3.1 Convergent and divergent evolution. Letters represent species on this hypothetical phylogenetic tree. The less the horizontal distance between species, the more similar the species are on the trait of interest. Species D and species E are more closely related to each other than are species B and species C. However, species D and species E are less similar on the trait of interest and are an example of divergent evolution, whereas species B and species C are more similar on the trait of interest and are an example of convergent evolution.
We thank Bridget Pieper and Carla Boe-Nesbit for critical comments on the manuscript and Andrew Fox for critical feedback on neural mechanisms of cooperation. Our research was supported by U.S. Public Health Service Grant MH035215 and a Hilldale Professorship from the University of Wisconsin, Madison. KAC was supported by a National Science Foundation Graduate Fellowship.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
call cognition. Thus, we review cooperation and prosocial behavior, social learning, spatial memory, and pair bonding. An obvious omission is comparative work on bird song learning and its neural mechanisms, which is covered in Chapter 45.
METHODS AND CAUTIONS
Choosing Species for Comparison
The selection of appropriate species for comparison depends on the nature of the question being addressed. However, often species are chosen on the basis of phylogenetic similarity alone. For example, chimpanzees (Pan troglodytes) and bonobos (Pan paniscus) are the closest relatives of humans, but these apes diverged from humans more than 4.5 million years ago. Behaviorally, chimpanzees differ significantly from bonobos by being more competitive and aggressive, and both species differ from humans in significant ways. Nonetheless, because both species have relatively large brains that are similar in form to those of humans, these species may be good models for comparisons with humans involving some cognitive abilities. However, they may not be good models for comparisons with some human social and emotional processes because the social environments of these species differ greatly from our own. Comparative models might also be chosen from species that are quite different but share some interesting commonality. For example, human fathers often appear to display competence in caring for and bonding with their infants, something rarely, if ever, observed in other great apes. However, fathers in several New World primate species and in some rodents and many birds exhibit spontaneous care of infants and form long-lasting bonds, making these species potentially better candidates than great apes and Old World primates for studying paternal behavior (Snowdon & Ziegler, 2007). As we show here, species with extensive coordination of infant care between different group members are more likely to exhibit social learning (Coussi-Korbel & Fragaszy, 1995) and cooperative behavior (Cronin, Kurian, & Snowdon, 2005). Ecological factors affecting foraging behavior can also lead to differences in rapidity of social learning (Lefebvre & Palameta, 1988).
As we also show, social factors or mating systems can lead closely related species, such as monogamous versus polygynous mice and voles, to have very different behavior, hormones, and neural organizations than we would predict from phylogeny alone. Important comparisons may also be made within species. Call and Tomasello (1996) and Thompson, Oden, and Boysen (1997) have shown differences in performance on social learning and second-order analogical reasoning tasks between so-called enculturated chimpanzees that have had extensive
human interactions during development compared with chimpanzees that have been reared naturally by their own species. It is not clear what aspects of enculturation are important for high-level cognitive function, but the ability to interact comfortably with the humans who are testing the animals may lead these chimps to exhibit better performance than those with less of a history of interaction with humans. Differences within a single species can also inform researchers of developmental influences on cognition. Seasonal differences may also produce behavioral and neural variation within a species or even an individual. Male birds sing under the influence of testosterone stimulated by increasing day length. Furthermore, changes in the size and neural complexity of brain areas track seasonal differences, with consequent changes in rates of singing or song structure and complexity (Alger & Riters, 2006; Nottebohm, 1981; G. T. Smith, Brenowitz, Beecher, & Wingfield, 1997; see Chapter 4 for more details).
No Scala Naturae
Hodos and Campbell (1969) argued against the concept of scala naturae, a popular misconception that behavioral changes follow a natural phylogenetic scale from earlier evolved species exhibiting simpler behavior to more complex and more recently evolved species necessarily exhibiting more complex behavior. You still can find psychology texts talking about invertebrates having less complex behavior than fish, which in turn are simpler than birds, which are simpler than mammals. The reality of evolution is much messier than a simple linear hierarchy implies. Honeybees have a complex communication system to indicate the location of nectar (Von Frisch, 1950). Salmon remember the odors of their home streams over several years of ocean living and can return to their natal streams to spawn (Hasler, 1966). Clark’s nutcrackers (Nucifraga columbiana) of the U.S.
Southwest can store up to 30,000 seeds each fall and can remember the location of enough of these seeds to survive the winter (Balda & Kamil, 1992). Cotton-top tamarins (Saguinus oedipus; Cronin & Snowdon, 2008; Hauser, Chen, Chen, & Chuang, 2003) and common marmosets (Callithrix jacchus; Burkart, Fehr, Efferson, & van Schaik, 2007) spontaneously display cooperative and reciprocal behaviors that a human parent would admire in his or her children, yet chimpanzees display little evidence of similar positive social behavior (Jensen, Hare, Call, & Tomasello, 2006; Silk et al., 2005; Vonk et al., 2008; see following discussion). Vonk and Povinelli (2006) argued that evolution creates no presumption about the phylogenetic distribution of psychological systems among closely related species, and yet, even among presumably sophisticated primatologists who might be expected to have a more subtle appreciation of the
processes of evolution, we often encounter the idea that primates are somehow more special than other mammals and that apes have cognitive skills surpassing all others. This leads to the frequent dismissal of nonprimate species with complex social structures such as hyenas, wolves, and other social carnivores as well as species more difficult to study, such as marine mammals. In these species one can find evidence of group-specific differences in communication and behavior that might be called culture if seen in apes, as well as coordination of complex social behavior, such as teaching in meerkats (Thornton & McAuliffe, 2006) and communal hunting in hyenas (Drea & Frank, 2003). There is often evidence of clear relationships between aspects of body size, brain size, and cognition. The patterns of allometric relationships may often illuminate interesting exceptions from a strict phylogenetic progression. For example, Figure 3.2 plots body size and encephalization quotient (EQ) for several primate species grouped by phylogeny. There is little variation across New World monkeys, Old World monkeys, apes, and extinct hominoids in brain size as a proportion of body size. However, the two genera with the highest EQs are squirrel monkeys and capuchin monkeys, both New World primates, and one of the genera with the lowest EQ is the gorilla. Note also that modern humans have a significantly greater EQ than any other primate species. A modern reincarnation of the scala naturae is research that uses allometry of brain size to explain the evolution of language. Dunbar (reviewed in Dunbar, 2003) has shown that among terrestrial Old World primates there is a correlation between group size and neocortex size. This suggests that the cognitive demands of social complexity were driving forces in the evolution of brain size (the social
brain hypothesis). Dunbar’s measure of social complexity was the number of grooming relationships that are possible within a group, as grooming is an essential component of maintaining social relationships in many primates. As group size increases linearly, the number of potential dyadic and higher order relationships increases exponentially. According to Dunbar, typical human groups average 150 in a social cohort, too large to allow individual relationships to be maintained by grooming. Thus, humans have evolved language, using gossip as a proxy for social grooming. In fact, there does appear to be a relationship between typical group size and neocortex volume among the species selected. But is the description of the relationship between group size, social complexity, and brain size adequate? Figure 3.3A includes New World primates, which diverged from the ape lineage earlier than Old World monkeys did. The monkeys with the largest brains and the largest groups (muriquis, Brachyteles spp.) almost never groom one another (Strier, 1997), whereas the primates with less neocortical volume and the smallest group sizes (marmosets and tamarins) groom extensively. One field study of common marmosets reported adult pairs grooming each other more than 20% of the day (Lazaro-Perea, de Fatima Arruda, & Snowdon, 2004). Figure 3.3B shows that for both New World primates and apes there is a linear relationship between EQ and group size; so the basic relationship between group size and brain size still holds, but not for grooming time. These findings argue against a clear relationship between group size, grooming, social complexity, and the evolution of the neocortex. One lesson from this is that limiting the selection and range of species studied can potentially lead to different
Figure 3.2 Encephalization quotient (EQ, vertical axis) plotted against log body weight (horizontal axis) for several genera of primates, grouped as New World primates, Old World primates, apes, extinct hominoids, and modern humans. Note the variation within each group of primates (Saimiri and Cebus, as New World primates, have higher EQs than all other nonhuman primates). Gorillas, furthest to the right, have a small EQ relative to all great apes and most monkeys. Only modern humans have an EQ that is not in the range of other species. (Drawn from data in Aiello & Dean, 1990)
conclusions. Another lesson is that brain size is not necessarily synonymous with complexity. As a nonprimate corollary of this, G. T. Smith et al. (1997) found that although the sizes of brain areas related to song production changed with the seasons in song sparrows, some of the change was due to changes in average neuron size and neuron density rather than in the number of neurons. Complexity must be carefully defined, and we should not assume that differences in size must necessarily be related to differences in complexity.

Anthropomorphism and Anthropocentrism

Many of the problems with the scala naturae view among scientists otherwise quite sophisticated about natural selection and evolution may be due to both anthropomorphism (ascribing human traits to nonhuman animals) and anthropocentrism (seeing cognition in other species through the lens of human abilities). When species are more similar to humans, it may be difficult to avoid ascribing humanlike cognition to these species. Even when we try hard to
Figure 3.3 The linear regression from New World primates is shown by a solid line, for Old World primates by a dotted line and for Apes by a dashed line. A: Grooming time as a function of group size in selected genera of New World primates, Old World primates and apes. Regression lines for Old World primates and apes have a positive slope, whereas New World primates show a negative relationship between grooming time and group size. (Data from Dunbar, 1991) B: Encephalization quotient as a function of group size in the same genera showing a positive relationship between brain size and group size for New World primates and apes and no clear trend for Old World primates. EQ data from Aiello and Dean (1990) and group size data from Dunbar (1991).
avoid being anthropomorphic, it is nearly impossible to avoid being anthropocentric. This has led to the idea that increasingly complex social life (as measured by mean group size) must be accompanied by increasingly complex cognitive and social skills (Barrett, Henzi, & Rendall, 2007). As mentioned, social group size may not be a good index of brain complexity, and many species far removed from humans have displayed complex cognition on one or more dimensions in the context of their environment. Furthermore, it may be a mistake to assume that all of human behavior is governed by complex cognitive processes. Many of us engage in automatic behaviors (routines in cooking, driving to work, semantically meaningless phrases used in greetings, etc.). We may be overrating our own cognitive ability simply because we are the species doing the rating. Cognitive complexity must be seen through the lens of the species being studied. None of this is to deny the very real abilities of humans but instead to urge scientists to be more modest when evaluating human abilities (and those of other great apes) compared with those of other species.
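The growth in potential relationships with group size, noted above in connection with the social brain hypothesis, can be made concrete with a short calculation. This is only a minimal sketch under an assumption made here for illustration, not Dunbar's own measure: every pair of individuals counts as one potential dyadic relationship, and every subset of three or more individuals counts as one potential higher order relationship. The function names are hypothetical.

```python
from math import comb


def potential_dyads(group_size: int) -> int:
    """Number of distinct pairs in a group: n * (n - 1) / 2."""
    return comb(group_size, 2)


def potential_higher_order(group_size: int) -> int:
    """Number of subsets with three or more members (illustrative assumption)."""
    # All subsets (2**n) minus the empty set, the singletons, and the pairs.
    return 2 ** group_size - 1 - group_size - comb(group_size, 2)


if __name__ == "__main__":
    # Print group size, potential dyads, and potential higher order groupings.
    for n in (5, 10, 20, 40):
        print(n, potential_dyads(n), potential_higher_order(n))
```

Under this counting, dyads alone grow quadratically with group size; it is the higher order groupings that grow exponentially, which is why a linear increase in group size can imply a very steep increase in the relationships an individual might need to track.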
Problem of Plasticity

Comparing species on different aspects of cognition is one way to gain a deeper understanding of cognitive processes and cognitive evolution, yet animals, human and nonhuman, are notoriously plastic and are able to modify behavior. For example, ecological differences between captive and wild environments may lead to different behavior within the same species. Leavens, Hopkins, and Bard (2005) have shown that pointing behavior is observed readily in captive chimpanzees interacting with humans to obtain food, whereas pointing has never been observed in the wild. Unlike wild tamarins, captive-born cotton-top tamarins do not exhibit fear when exposed to snakes, a natural predator (Campbell & Snowdon, in press; Hayes & Snowdon, 1990). Captive-born tamarins responded with equal arousal to playbacks of calls of some natural predators and vegetarian howler monkeys, suggesting that initial responses to predator calls may be flexible and may be related to certain acoustic properties rather than innate predator recognition (Friant, Campbell, & Snowdon, 2008). Captive tamarins spontaneously produced alarm calls when fed a familiar food that had been adulterated with pepper, although they had never alarm called at that food previously (Snowdon & Boe, 2003). They also gave mobbing vocalizations toward a caretaker acting as though coming to catch a monkey, despite their lack of arousal toward natural predators (Campbell & Snowdon, 2007). These results on captive chimpanzees and tamarins suggest that captivity represents an ecological niche to which animals may adapt, so that behaviors appear in contexts different from those in which their wild conspecifics would show similar behavior. Within-species comparisons of individuals living under different social or environmental conditions can identify the environmental (as opposed to phylogenetic and genetic) variables that can affect cognitive processing and behavioral expression.
This contextual flexibility may paradoxically utilize simpler neural processes than those hypothesized for contextually inflexible cognitive processes. For example, with respect to communication between animals, Owren and Rendall (1997) have argued for flexibility in the contexts in which signals are used and how they are responded to by others. They offered a simple model of affective conditioning by which individuals learn the relationships between signals and affective state. As environments change, organisms can acquire new signal–meaning correspondences. Saffran, Aslin, and Newport (with human infants; 1996) and Hauser, Newport, and Aslin (with cotton-top tamarins; 2001) have demonstrated that human babies and monkeys can learn about statistical regularities in auditory input, a simple but rapid
way in which organisms can acquire contingencies between sounds. Thus, an animal in a novel context (e.g., captivity) can quickly learn to associate alarm calls with caretakers or veterinarians rather than leopards and snakes, or can learn to eat foods different from those found in the wild and still give appropriate food calls. These basic processes of conditioning provide a general way to learn rapidly about associations of signals and events in the environment without reliance on a host of hypothetical modality and context-specific modules that have been shaped by evolution (Tooby & Cosmides, 1990). Successful adaptation is more likely to occur when organisms can respond rapidly to changes in environment than rely on hard-wired species-specific stimulus–response connections. Two striking examples of plasticity come from work on rodents. For the first, Marler and colleagues studied two species of mice (Table 3.1). The California mouse (Peromyscus californicus) is one of the few species known from field studies to be both socially and genetically monogamous (Ribble, 1991). Males defend their mates and play an important role in infant care, with infant survival being affected by fathers (Gubernick & Tefari, 2000). The white-footed mouse (P. leucopus) is a close relative but is polygamous. These mice do not form pair bonds, and individual males and females may mate with many others. Males provide little, if any, infant care and do not show territorial defense. Bester-Meredith and Marler (2001) successfully cross-fostered these mice and showed that California mice reared by white-footed mice exhibited reduced paternal care and reduced territorial defense behavior, becoming more similar to their foster species. 
Furthermore, the patterns of distribution of vasopressin (a neuropeptide hormone related to aggression) staining in the brains of cross-fostered California mice more closely approximated the distribution patterns seen in white-footed mice than those seen in their own species. Within California mice, sires that were retrieved less as young retrieved their pups less often when they became adults (Bester-Meredith & Marler, 2001). An experimental increase in the amount of retrievals experienced by California mice offspring led to decreased attack latencies in both males and females and greater vasopressin immunoreactivity in the dorsal bed nucleus of the stria terminalis (Frazier, Trainor, Cravens, Whitney, & Marler, 2006). Cross-fostering (CF) to another species can change not only behavior but also brain organization and brain function. The cross-fostering experience required to produce these behavioral and neural changes is fairly minimal.

TABLE 3.1 Effects of cross-fostering (CF) on behavior and arginine vasopressin (AVP) brain staining in mice.

Species Cross-Fostered    Territorial Aggression    Neutral Arena Aggression    Paternal Care    AVP Immunostaining
White-footed mouse        No effect                 Control < CF                No effect        No effect
California mouse          Control > CF              No effect                   Control > CF     Control > CF

Note. From “Paternal Behavior and Aggression: Endocrine Mechanisms and Non-Genomic Transmission of Behavior,” by C. A. Marler, J. K. Bester-Meredith, and B. C. Trainor, 2003, Advances in the Study of Behavior, 32, pp. 263–323. Adapted with permission. Elsevier Science 2003.

In the second example, the work of Meaney and collaborators has shown that variations in maternal licking and grooming in rats (Rattus norvegicus) are related to variations in stress reactivity, with greater amounts of licking and grooming reducing stress reactivity of the pups when they are adults. Furthermore, when offspring become adults they show licking and grooming patterns similar to those of their mothers. Cross-fostering studies showed these effects to be nongenomic, purely a result of the early grooming and licking received (Francis, Diorio, Liu, & Meaney, 1999). High levels of maternal licking and grooming translated into differences in hippocampal glucocorticoid receptor messenger RNA expression (Liu et al., 1997) as well as increased synapse formation in the hippocampus, increased expression of N-methyl-D-aspartic acid receptors, increased cholinergic innervation of the hippocampus, and enhanced spatial learning and memory (Liu, Diorio, Day, Francis, & Meaney, 2000). Furthermore, high levels of early maternal care are associated with differences in responsiveness of oxytocin receptors to stimulation by estrogen (Champagne, Diorio, Sharma, & Meaney, 2001). That such small but significant differences in early rearing can have profound effects on behavior and brain function in several species should caution us against thinking of behavior as being determined solely by the direct effects of natural selection. Organisms are not static entities but respond flexibly and dynamically to their environments.
SOCIAL LEARNING

Social learning (or socially mediated learning) has been studied in a wide range of species. Social learning occurs when the acquisition of behavior is influenced by the activities of other individuals, either directly or indirectly (Box, 1984). Much research has focused on the mechanisms that enable social learning, such as an observer’s attention being drawn to a particular object as a result of another individual interacting with that object (stimulus enhancement;
Spence, 1937), or the increased likelihood of performing a familiar motor action after seeing it performed by another individual (response facilitation; Byrne, 1994). More than seven mechanisms that enable social learning have been identified (Whiten & Ham, 1992). One mechanism, imitation, has received the most attention and has been categorized as the most challenging and unique form of social learning (Byrne, 1999). Some have argued that imitation requires theory of mind, or the ability to conceive of the intentions of others (Premack & Woodruff, 1978; see also Stone, this volume). Because theory of mind is widely thought to be an ability restricted to humans and potentially other great apes (Byrne, 1999), it has been argued that imitation is impossible for other taxa (Whiten & Ham, 1992). Miklosi (1999) reported that although some studies have provided data in support of imitative abilities in other species, others have quickly criticized and reinterpreted the results. The prevalence of imitative ability across taxa, and the cognitive mechanisms that underlie imitation, will likely be a hotly debated issue for some time, especially because some researchers with beliefs in scala naturae will cling to the uniqueness of imitation for humans and great apes. In order to evaluate the social learning capability or tendency of a group or species, researchers typically employ one of three methodologies. The first is simply exposing an individual trained to perform a skill to one or more naive individuals and then measuring whether the skill is expressed by the formerly naive individual(s). The second, transmission or diffusion chains, uses the same idea but measures whether the formerly naive individual subsequently transfers the acquired behavior to another naive individual (e.g., Galef & Allen, 1995). The third is the dual action task, in which there are two functionally equivalent methods available for solving a task. 
Individuals observe one or the other method and then are tested on whether their performance matches the method employed by the demonstrator (e.g., Humle & Snowdon, 2008). Social learning can permit behavioral calibration to the unpredictable properties of an environment with a degree of specificity not permitted by genetically coded information (Galef & Laland, 2005). However, it is a common misconception to assume that social learning is always more advantageous than individual learning, and indiscriminate copying of surrounding individuals is unlikely to be a stable strategy in all species (Boyd & Richerson, 1988; Laland, 2004). Theoretical models have demonstrated that social learning is more effective than individual learning when two conditions are met: (1) the cost of individually acquiring accurate information is high, and (2) knowledgeable individuals are present that have experienced the same environment (Boyd & Richerson, 1988). Therefore,
we would expect that even closely related species might display different propensities to acquire information socially if they have evolved in different environments. Laland has expanded on these models to argue that natural selection should have favored heuristics dictating when individuals should acquire information socially, and from whom the information should be obtained. Decades ago, Klopfer (1959, 1961) proposed two hypotheses to account for interspecific variation in social learning. He reasoned that solitary species should demonstrate less social learning than gregarious species, and species with conservative foraging strategies should exhibit less social learning than species with opportunistic foraging styles. In fact, most attempts to predict or explain interspecific differences in social learning to date have been based on differences in either sociality or feeding ecology.

Influence of Sociality on Social Learning

Coussi-Korbel and Fragaszy (1995) presented a model relating social learning to social organization. They argued that behavioral coordination in time and/or space is common to all forms of social learning and that the extent to which behavioral coordination is expressed predicts social learning. An example of behavioral coordination in time (but not space) is a flock of birds feeding simultaneously while the nearest neighbors are at some distance. An example of behavioral coordination in space (but not time) is when an individual approaches a location where a conspecific was previously active but is no longer present. It is only behavioral coordination in both time and space that requires physical proximity between individuals. The amount of spatial proximity sought out and tolerated by conspecifics varies greatly across species and can often be related to larger social constructs such as dominance structures or mating systems.
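The distinction between coordination in time and coordination in space described above can be written down as a small classification. The sketch below is only an illustration: the `Encounter` record, its two boolean fields, and the function names are hypothetical and are not taken from Coussi-Korbel and Fragaszy (1995). It encodes just the point made in the text, that physical proximity is required only when behavior is coordinated in both time and space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encounter:
    """Hypothetical record of how an observer's behavior relates to a demonstrator's."""
    same_time: bool   # both individuals are active simultaneously
    same_place: bool  # the observer acts where the demonstrator is or was active


def coordination_type(encounter: Encounter) -> str:
    """Label the kind of behavioral coordination in an encounter."""
    if encounter.same_time and encounter.same_place:
        return "time and space"
    if encounter.same_time:
        return "time only"
    if encounter.same_place:
        return "space only"
    return "none"


def requires_proximity(encounter: Encounter) -> bool:
    """Only coordination in both time and space requires physical proximity."""
    return encounter.same_time and encounter.same_place


# The two examples from the text, expressed as encounters.
flock_feeding_at_a_distance = Encounter(same_time=True, same_place=False)
visiting_a_vacated_site = Encounter(same_time=False, same_place=True)
```

Framed this way, the social tolerance of a species matters mainly for the "time and space" case, since that is the only one in which individuals must be close to each other.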
In despotic species, proximity between conspecifics, particularly near food or desirable items, will be infrequent. In more egalitarian species, however, proximity between conspecifics will be more common. Therefore, the social context in which an individual is immersed is likely to influence opportunities for, and subsequent expression of, social learning. However, it is not always the case that hierarchically organized species demonstrate less social learning. It appears that when the social hierarchy promotes close, constant monitoring of other individuals and their behavior, the probability for social learning may increase because of the increased attention paid to conspecifics. When the attention structure brought about by the hierarchy promotes social learning, it should do so in a heterogeneous fashion (Coussi-Korbel & Fragaszy, 1995). That is, not all
individuals will acquire the socially learned behavior, but rather those that are positioned relatively lower within the hierarchy will be more likely to attend to a demonstrator that is relatively more highly ranked and to subsequently express the modeled behavior. Some studies have made explicit attempts to relate social organization to social learning by performing interspecific comparisons of phylogenetically similar species. Cambefort (1981) compared the discovery and propagation of a feeding skill in chacma baboons (Papio ursinus) and vervet monkeys (Cercopithecus aethiops). The results were also compared with those of a previous study of mandrills (Mandrillus sphinx; Jouventin, Pasteur, & Cambefort, 1976). These three primate species belong to the same family, Cercopithecidae (Nowak, 1999), but exhibit different social organizations. The propagation of the socially learned feeding skill was closely related to the social structure or, more specifically, the attention structure of each species. Vervet monkeys exhibited the weakest hierarchy and group cohesion, and there was no social transmission of the novel foraging skill. In chacma baboons, which exhibited intermediate hierarchy strength, group cohesion, and attention to social partners, there was weak transmission of the novel foraging skill. Finally, in mandrills, which demonstrated the strongest hierarchy, strongest group cohesion, and most attention to social partners, there was fast social propagation of the novel foraging skill. Therefore, the pattern of transmission of the novel behavior mapped onto the hierarchical organization of the group. Another study compared the social transmission of flavor preferences in two hamster species: the golden hamster (Mesocricetus auratus), which is solitary, and the dwarf hamster (Phodopus campbelli), which is moderately social (Lupfer, Frieman, & Coonfield, 2003).
When given a choice between a flavor chosen by a demonstrator and another flavor, dwarf hamsters preferred the demonstrator’s flavor, whereas golden hamsters preferred the demonstrator’s flavor only if the demonstrator was their mother. In the wild, adult dwarf hamsters interact with one another to share burrows and raise pups. In contrast, golden hamsters rarely interact with one another outside of the mother–pup relationship. Therefore, the degree of social learning expressed may have been predicted by the social organization of the species. A similar finding was reported when the more social pinyon jay (Gymnorhinus cyanocephalus) was compared to the less social Clark’s nutcracker. The individual and social learning abilities of the two species were compared, and the results indicated that pinyon jays learned faster socially than they did individually, whereas the nutcrackers’ performance was not enhanced in the social learning condition (Templeton, Kamil, & Balda, 1999).
Whereas interspecies comparisons provide insights about the effect of the social organization of a species on social learning, intraspecies comparisons allow for investigation of the effects of social relationships within a species on social learning. Nicol and Pope (1994) found that the social transmission of a behavior (keypecking) in flocks of laying hens (Gallus gallus) was influenced by the hierarchy of the group and that social learning was greatest when the demonstrator was a dominant individual. A similar effect of social relationships was found in a study of cooperatively breeding common marmosets. Dyads were presented with a task requiring both individuals to act, whereby one manipulated an apparatus to bring food into the reach of the second individual, who could then retrieve the food. Dyads in which the more dominant individual assumed the role of retriever were more successful at solving the coproduction task. However, in the successful dyads, dominant individuals did not consume more rewards than subordinates, even though they had more direct access to the food rewards (Werdenich & Huber, 2002). Schwab, Bugnyar, Schloegl, and Kotrschal (2008) demonstrated that affiliative relationships among kin enhanced the performance of common ravens (Corvus corax) in a social learning task. Drea and Wallen (1999) reported intriguing results indicating that subordinate rhesus macaques (Macaca mulatta) performed less well on learning tasks when in the presence of dominant individuals, even though they had learned the information equally well. Drea and Wallen (1999) were not investigating social learning specifically, but their findings, combined with those of other studies on the effects of social relationships on performance, indicate that the identity of the demonstrator, as well as by-standers in the experimental setup, likely influences the degree of social learning expressed. 
Intraspecies comparisons can also elucidate developmental differences in social learning. Dependence on social learning may vary in predictable ways throughout the lifetime of an individual (Galef & Laland, 2005). As mentioned previously, there are likely fewer opportunities for social learning in a species that is rarely in proximity with conspecifics, but during periods of offspring dependence proximity will be more frequent and social learning opportunities may be quite regular. Black rats (Rattus rattus) learn socially to open pinecones to obtain seeds; however, it is only the pups, not adults, that are able to acquire the technique by observing experienced rats (Aisner & Terkel, 1992). Across many studies and species, it seems that juveniles are more likely than adults to incorporate new actions into their behavioral repertoire (e.g., Goodall, 1986; Inoue-Nakamura & Matsuzawa, 1997; Kawai, 1965). Chimpanzees are generally adept at social learning both in the wild (reviewed in Matsuzawa, 2001) and in
captivity (reviewed in Whiten, Horner, & Litchfield, 2004). Chimpanzee societies exhibit a strict dominance hierarchy, with the most tolerant and longest lasting relationships existing between mother and offspring. An intraspecific comparison of tool use acquisition by young chimpanzees in Gombe National Park, Tanzania, indicated a striking sex difference in the social acquisition of the skill. Females began using tools for termite fishing at a younger age than males, although there was no difference in the behavior of the mother (the model) toward males and females. Young females spent more time watching their mothers use the tools, whereas males spent more time playing at the termite mound. Therefore, the attention paid by the juveniles was an influential factor in determining the onset of the socially learned skill (Lonsdorf, Eberly, & Pusey, 2004). Again, the direction of attention and the close social relationship appear crucial for predicting the occurrence of social learning.

Influence of Feeding Ecology on Social Learning

Predictions about the prevalence of social learning across species have also been made on the basis of feeding ecology. Specifically, species with opportunistic or generalist lifestyles, in which individuals are exposed to more environmental variation, should be more likely to demonstrate social learning than species that are conservative or specialists (Johnston, 1982; Klopfer, 1961). This hypothesis rests on the reasoning that social learning allows an individual to modify its behavior to the current environment more efficiently or quickly than would be possible by either individual learning or genetically determined behavior. In this sense, social learning is an adaptive specialization shaped by natural selection. The influence of feeding ecology on social learning has been explored experimentally.
Klopfer (1961) demonstrated that greenfinches (Chloris chloris) learned a food discrimination task less well in pairs than individually, as opposed to great tits (Parus major), which did not suffer a learning decrement in pairs. He speculated that a failure to learn socially about novel foods would avoid being maladaptive only in species that display conservative feeding habits, such as the greenfinch. Dolman, Templeton, and Lefebvre (1996) compared two populations of Barbados Zenaida doves (Zenaida aurita) with different foraging styles. One population consistently exhibits conspecific aggression while foraging, whereas the other forages in flocks without aggression. The aggressively foraging population performed better on a social learning task that employed a heterospecific demonstrator, whereas the nonaggressive population performed better with a conspecific demonstrator. In this study, the type of feeding environment appeared to predict
the pattern of social learning expressed, as the population that typically interferes with conspecifics to compete for food was unable to acquire a socially learned skill from a conspecific. The prevalence of social learning throughout the animal kingdom seems to be best explained by a combination of social, developmental, and ecological factors. Phylogenetic predictions alone appear to do a relatively poor job of explaining interspecific variation in social learning, especially if one considers that comparisons across populations of a single species often reveal as much variation as between-species comparisons.

Social Learning and the Brain

It is unlikely that the neuronal bases that underlie social learning differ dramatically from mechanisms known to underlie other forms of learning (for a review of the neuronal basis of learning, see Chapter 26). However, an additional pattern of neuronal activation may be unique to social learning and may not be involved in asocial learning such as classical or operant conditioning or trial-and-error learning. Mirror neurons have been identified in macaques (Macaca spp.) in the ventral premotor and rostral inferior parietal cortex (see Figure 8.11 in Chapter 8 for the approximate location). The defining characteristic of mirror neurons is that they fire both when the animal performs an object-oriented action with its hand or mouth, and when the animal observes another individual (human or conspecific) performing the same motor action (di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Gallese, Fadiga, Fogassi, & Rizzolatti, 1996; Rizzolatti, Fadiga, Gallese, & Fogassi, 1996). A leading hypothesis regarding the function of the mirror neuron system posits that mirror neurons are the basis of action understanding (Rizzolatti, Fogassi, & Gallese, 2001).
Indeed, accumulating evidence suggests that the understanding of the meaning of actions determines the discharge of mirror neurons, rather than observations of the actions themselves. Umiltà et al. (2001) demonstrated that the majority of mirror neurons of rhesus monkeys respond during the observation of partially hidden actions, when full visual information about the action is not necessary to recognize the goal. The mirror neuron system also responds to the sound of a goal-related action, even when the monkey cannot see the action (Keysers et al., 2003; Kohler et al., 2002). These observations provide support for the interpretation that the mirror neuron system recognizes actions performed by other individuals and matches these actions on neurons that code the same action, providing the observer with the motoric perspective of the actor. It is through this interpretation that the mirror neuron
system has garnered the attention of those interested in social learning. A growing amount of information available from EEG, transcranial magnetic stimulation, functional magnetic resonance imaging (fMRI), and positron emission tomography (PET) studies indirectly supports the presence of a mirror neuron system in humans (reviewed in Rizzolatti & Craighero, 2004). However, data suggest that there are some key differences between the mirror neuron systems of macaques and humans. For example, the mirror neuron system of humans, but not rhesus monkeys, responds to intransitive, meaningless movements. Perhaps this difference in sensitivity of the mirror neuron system sheds some light on the different imitative tendencies of humans and monkeys: Human children will “over imitate” (Horner & Whiten, 2005), copying movements that are irrelevant to accomplishing the task. (For a more detailed review of the mirror neuron system, see Chapter 16.) It is tempting to speculate that the mirror neuron system underlies social learning, given that the system allows for matching of the observation of motor actions with their execution. However, more data are needed on the involvement of mirror neurons during the observation and acquisition of socially learned skills to understand whether mirror neurons are central to social learning processes. Furthermore, given that there are fundamental differences between macaques and humans in the properties of the mirror neuron system, data from additional species, ideally species that vary in their expression of social learning, will also be necessary to elucidate the role of the mirror neuron system in social learning.
SPATIAL MEMORY

Spatial memory helps animals navigate through their environments to find mates or food. Because closely related species differ in the degree to which they rely on spatial memory, the comparative approach has been extremely powerful for understanding the neural mechanisms that underlie these behavioral differences. Foraging and food-caching behavior vary between species in both corvids (crows, ravens, jays) and parids (chickadees, tits, titmice). Some species must store food each fall and remember the locations of storage sites in order to find food over the winter. The Clark’s nutcracker from Arizona has been estimated to store up to 30,000 pine nuts each fall and appears to remember their locations even when the ground cover is transformed by snow (Balda & Kamil, 1992). Because other species within the same family do not need to store food over the winter, some have predicted that the food-caching species will show greater spatial learning ability and greater spatial memory than closely related species that do not store food. Several experimental studies support these predictions. Bednekoff, Balda, Kamil, and Hile (1997) showed that Clark’s nutcracker and pinyon jays (Gymnorhinus cyanocephalus) had better memory for seed cache location up to 60 days than noncaching Mexican jays (Aphelocoma ultramarina) and western scrub jays (A. coerulescens), although all species showed good recall up to 250 days after learning. Bond, Kamil, and Balda (2007) found better serial reversal learning performance in highly social pinyon jays compared with Clark’s nutcracker and western scrub jays. Western scrub jays, in contrast, are able to learn to avoid caches where stored food has spoiled with passage of time, suggesting that scrub jays have a form of declarative memory that involves not only space but also time (Clayton, Yu, & Dickinson, 2001). Therefore, each species of corvid has particular abilities that serve functions appropriate for that species.

In rodents, Barkley and Jacobs (2007) found that Merriam’s kangaroo rat (Dipodomys merriami), which hoards food, was more accurate in remembering locations for food than the nonhoarding Great Basin kangaroo rat (D. microps). Monogamous and polygynous species differ in spatial memory requirements because males of monogamous species have small territories and home ranges relative to those of males from polygynous species, which must travel over a wider range to locate multiple females. Gaulin and Fitzgerald (1989) hypothesized that polygynous male meadow voles (Microtus pennsylvanicus) would learn spatial mazes more rapidly than females of the same species, whereas there would be no sex differences between males and females of monogamous pine and prairie voles (M. pinetorum and M. ochrogaster). Using a series of Hebb-Williams spatial mazes, Gaulin and Fitzgerald found a sex difference in polygynous voles with males learning faster, but, as predicted, there was no sex difference in maze acquisition in monogamous voles.

Brain Mechanisms and Spatial Memory

The hippocampus is known as a primary region involved in both spatial memory and in transforming short-term to long-term memories. Several studies have shown differences in hippocampal volume, or the structure of neurons within the hippocampus, as a function of ecological differences. Birds from 3 seed-caching families had greater hippocampal volume relative to both body weight and brain volume than birds from 10 families that do not cache seeds (see Figure 3.4; Sherry, Jacobs, & Gaulin, 1992). Hampton and Shettleworth (1996) found larger hippocampal volume and better spatial nonmatching-to-sample performance in food-storing black-capped chickadees (Poecile atricapillus) compared with non-food-storing dark-eyed juncos (Junco hyemalis).

[Figure 3.4: scatterplot of hippocampal complex volume (mm³) against body weight, with separate symbols for nonstoring and food-storing families.]
Figure 3.4 Hippocampus volume plotted against mean body weight for 13 families of birds. Open symbols indicate families with some species that store food. Both axes are plotted logarithmically. (Adapted from Sherry, Jacobs, & Gaulin, 1992.)

Within these species, there is also variation in caching behavior, spatial memory, and hippocampal volume. Pravosudov and Clayton (2002) compared chickadees from Alaska (where food resources are scarce) with those from Colorado and found that Alaskan birds cached more food, recovered food more efficiently, were more accurate on spatial (but not on nonspatial) learning tasks and had greater hippocampal volume with more neurons. Migratory subspecies of juncos (which presumably have greater need for spatial memory) performed better than nonmigrating juncos in a spatial memory task and had more densely packed neurons in the hippocampus. Polygynous male meadow voles, which must travel over a greater distance to find multiple females than monogamous males or females of either type, also had larger hippocampal volume than females and than either sex of monogamous pine voles (see Figure 3.5; Jacobs, Gaulin, Sherry, & Hoffman, 1990). As noted previously, polygynous males also were faster to solve multiple spatial maze problems than were polygynous females, whereas monogamous males and females were equal in solving spatial mazes (Gaulin & Fitzgerald, 1989). However, increased hippocampal volume is not universally found in species that do more seed caching. Pravosudov and de Kort (2006) found that noncaching western scrub jays had similar hippocampal volumes to food-caching European jackdaws (Corvus monedula), and Brodin (2005) found that willow tits (Parus montanus) from Europe had twice the hippocampal volume of black-capped chickadees but did not differ in food hoarding behavior. We cautioned earlier about accepting size alone as a measure of complexity. Precise measures of neuronal
[Figure 3.5: two bar charts comparing male and female meadow voles and pine voles. Panels: home range size (m²); hippocampal volume relative to brain volume.]
Figure 3.5 Home range size and relative volume of hippocampus in polygamous meadow voles and monogamous pine voles. Male meadow voles have significantly greater home range sizes and hippocampus volume than female meadow voles and monogamous voles of either sex. (Adapted from Jacobs, Gaulin, Sherry, & Hoffman, 1990.)
number, density, and dendritic fields may be more accurate measures of brain differences across species. Much more than spatial memory may be involved in food caching, such as remembering food quality and time of storage (Clayton, 1998). This suggests that not only the hippocampus but many other brain areas may be involved in cognitive processes relating to foraging and food caching.

COOPERATION, RECIPROCITY, AND DONATION

Cooperative behavior emerges numerous times throughout the animal kingdom and, as is the case for social learning, its presence does not fit with a simple phylogenetic or brain size explanation. Cooperative interactions between conspecifics can be divided into two categories based on the distribution of costs and benefits to the actors. In one form of cooperation, cooperation for mutual benefit, the act is beneficial to all individuals involved (West, Griffin, & Gardner, 2007). The second form of cooperation occurs when one individual incurs a cost, albeit potentially temporary, while providing a benefit to a second individual. From an evolutionary
perspective, this costly form of cooperation is most often accounted for by kin selection (W. D. Hamilton, 1964) or reciprocal altruism (reciprocity; Trivers, 1971). The most commonly observed form of mutually beneficial cooperation is cooperative hunting (Packer & Ruttan, 1988). Data from African wild dogs (Lycaon pictus) and spotted hyenas (Crocuta crocuta) demonstrate that hunting success, prey mass, and the probability of multiple kills increased with the number of adults taking part in the hunt (Creel & Creel, 1995; Drea & Frank, 2003). Mutually beneficial cooperative hunting also occurs among African lions (Panthera leo; Schaller, 1972; Scheel & Packer, 1991). In the open plains of Etosha National Park in Namibia, lionesses have acquired preferential hunting roles within their pride, and hunts were more likely to be successful when the group size was large and huntresses occupied their preferred roles (Stander, 1992). Elaborate displays of cooperative hunting have also been observed in some wild populations of chimpanzees (Pan troglodytes; Boesch & Boesch, 1989; Gilby, Eberly, & Wrangham, 2008). Cooperative hunting is not limited to megavertebrates, however. A South American spider (Anelosimus eximius) weighing approximately 1 mg works in groups of a few to several thousand individuals to construct large basket-shaped webs that cover several cubic meters, allowing the spiders to cooperatively capture and subdue prey up to 30 times their own body size (Rypstra & Tirey, 1991). Mutually beneficial cooperation occurs outside of the hunting context as well. For example, many species exhibit mobbing, which is an antipredator behavior characterized by multiple individuals simultaneously attacking or harassing a predator. This behavior has been observed in mammals, fish, birds, and insects (Bartecki & Heymann, 1987; Curio, 1978; Dominey, 1983; Hennessy & Owings, 1978; Hoogland & Sherman, 1976; Shields, 1984). 
Other examples of mutually beneficial cooperative behavior include alliance formation, in which two or more individuals combine efforts against a third (reviewed in Dugatkin, 2002); mutual grooming (Dugatkin, 1997); and even cooperative mate acquisition (DuVal, 2007). Cooperative interactions that are not mutually beneficial are more difficult to explain within evolutionary theory. If one actor incurs a cost and a second actor acquires a benefit, the question arises as to why the first actor takes part in the costly act. If the individuals are related, then kin selection is the mechanism most commonly referenced to account for this behavior. It is assumed that costs incurred by the actor are offset by the indirect fitness benefits obtained through increasing the survival of relatives with shared genes (W. D. Hamilton, 1963). Well-documented examples of kin selection include cooperative breeding or helping at the nest, whereby reproductively mature individuals remain in
their natal group to assist in the rearing of younger siblings (behavior in birds reviewed in Brown, 1987; behavior in mammals reviewed in Solomon & French, 1997). If the individuals are unrelated, then reciprocal altruism (Trivers, 1971) is often credited as the mechanism for maintaining the apparently costly behavior. Under reciprocal altruism, one individual incurs a cost and provides a benefit to a second individual at present, and at a later time the benefit is repaid by the second individual. In his seminal paper, Trivers listed the following factors that should increase the likelihood of reciprocal altruism occurring in a species: (a) long life span, (b) low dispersal rate, (c) high degree of mutual dependence, (d) extensive parental care, (e) lack of strong dominance hierarchies, and (f) tendency to aid in combat. Examples of reciprocal altruism in the wild include reciprocal sharing of blood meals by vampire bats (Desmodus rotundus; Wilkinson, 1984), predator inspection by sticklebacks (Gasterosteus aculeatus; Milinski, 1987), and reciprocal grooming in impalas (Aepyceros melampus; Hart & Hart, 1992). However, many have argued that empirical evidence for reciprocity is lacking and that, although the idea of reciprocity is theoretically appealing, its occurrence is extremely rare outside of humans (Hammerstein, 2003; Stevens & Hauser, 2004). Schuster and colleagues (Schuster, 2002; Schuster & Perelberg, 2004) have argued that coordinating behavior with a conspecific, often one involved in cooperative behavior, may be intrinsically rewarding. They posited from experiments on rats that economic analyses of the costs and benefits of cooperative interactions would not provide enough information to an outside observer to determine whether the cooperative interaction was beneficial to an actor because intrinsic rewards also play a role, as they do in other social interactions. 
This hypothesis remains to be tested on additional species and in varying contexts, but it provides a novel way of thinking about when cooperative behavior may emerge across taxa, particularly where economic reasoning falls short of explaining observed cooperative behavior. Investigations of cooperation historically focused on the selective pressures that could lead to its emergence (Alexander, 1974; Axelrod & Hamilton, 1981; Brown, 1983; I. M. Hamilton, 1963; W. D. Hamilton, 1964; Mayr, 1961; Trivers, 1971). However, interest in the proximate mechanisms of cooperation has grown (Brosnan & de Waal, 2002), and these mechanisms may explain why cooperative behavior emerges in some taxa but not others. Recent studies have included investigations into the actors’ understanding of the partner’s role in the cooperative act, the effects of various spatial and temporal reward distributions on cooperative performance, the impact of the relationship between actors on their ability to cooperate, the social system of the
species presented with an opportunity to cooperate, and the cognitive skills required to cooperate. Understanding of the partner’s role in a cooperative act varies across species. Boesch and Boesch (1989) were the first to call attention to and distinguish between cooperation in which the actors take into account their partner’s behavior and cooperation in which individuals are mutually attracted to the same resource and act independently of one another. The authors conceptually organized the hunting behavior of wild chimpanzees into four categories of increasing complexity based on the degree to which actors integrated their own behavior with that of their partner. This categorization scheme has been subsequently used to understand partner interactions in many cooperative contexts (Chalmeau, Lardeux, Brandibas, & Gallo, 1997; Chalmeau, Visalberghi, & Gallo, 1997; Cronin et al., 2005; Mendres & de Waal, 2000). Great apes (including humans) appear to understand the role their partner plays in a cooperative act and adjust their behaviors accordingly (Brownell, Ramani, & Zerwas, 2006; Chalmeau & Gallo, 1996a, 1996b; Chalmeau, Lardeux, et al., 1997). Variables used to evaluate understanding of the partner typically include measures of attempts to solve the apparatus in the absence of the partner and glances exchanged between actors. Investigations of whether monkeys understand the role of their partner have produced mixed results both within and across species. Some studies with tufted capuchin monkeys demonstrated that capuchins did not take into account the role of their partner when confronted with a cooperative task. The subjects solved the task but did so by chance alone, as determined by the high number of uncoordinated attempts by each actor (Chalmeau, Visalberghi, et al., 1997; Visalberghi, 1997; Visalberghi, Pellegrini Quarantotti, & Tranchida, 2000).
Others have argued that capuchin monkeys can solve a cooperative task and take into account their partner’s role when the task design is intuitive, that is, when the subjects are able to see how the apparatus works and presumably understand the effects their actions have on the apparatus (Mendres & de Waal, 2000). We investigated the extent to which pair-bonded cotton-top tamarins understood the role of their partner in a cooperative problem-solving task and found that tamarins adjusted their behavior based on the presence or absence of their partner (Cronin et al., 2005), providing evidence that cotton-top tamarins are capable of understanding their partner’s role in a cooperative task. Whether an individual will attend to a partner’s behavior will likely vary with the amount of behavioral coordination and attentiveness to social cues typically expressed by that species. Intra- and interspecific variation in cooperation may be affected by the relationship between actors, as is the case
for social learning. As argued by van Schaik and Kappeler (2006), individuals bonded over an extended length of time likely do not evaluate the immediate costs and benefits of their behavior but rather evaluate the long-term benefits and costs exchanged throughout the relationship. Dominance asymmetries may also affect cooperative success, either in the form of coercion by dominants to solve the task or avoidance of the task by subordinates (Chalmeau, 1994; Chalmeau & Gallo, 1996b; Chalmeau, Lardeux, et al., 1997; Tebbich, Taborsky, & Winkler, 1996). Melis, Hare, and Tomasello (2006) have found that tolerance of cofeeding in chimpanzee dyads is predictive of their success on a cooperative task. In addition, the social characteristics of a species (e.g., their characteristic degree of tolerance for nearby conspecifics, behavioral coordination, and mutual dependence) may influence the likelihood of successful cooperation (Snowdon & Cronin, 2007). Trivers (1971) noted that in species with strong dominance hierarchies, the likelihood of reciprocal altruism is reduced. Because of high levels of intragroup competition in Guinea baboons (Papio papio), Japanese macaques (Macaca fuscata), and rhesus macaques (M. mulatta), these species did not coordinate efforts to move heavy food-baited stones (Burton, 1977; Fady, 1972; Petit, Desportes, & Thierry, 1992), whereas Tonkean macaques (M. tonkeana), which are characterized by less strict dominance hierarchies and greater social tolerance, were more often successful at coordinating their actions to displace the baited stone (Petit et al., 1992). 
Consistent with the idea that social context can predict performance on cooperative tasks are the observations that chimpanzees, characterized by a strict dominance hierarchy and low social tolerance, performed better on competitive tasks than cooperative tasks (Hare & Tomasello, 2004) and that the socially tolerant bonobos outperformed chimpanzees on cooperative tasks (Hare, Melis, Woods, Hastings, & Wrangham, 2007). Attempts to investigate the evolutionary origins of the human tendency to act in the best interest of others (i.e., to act prosocially) have led to interesting results. Initial investigations into the evolutionary origins of prosociality indicated that humans’ closest living relatives, chimpanzees, overwhelmingly refused to donate food to a social partner, even when the potential donor could do so with very little effort and could not obtain the food for itself (Jensen et al., 2006; Silk et al., 2005; Vonk et al., 2008). One interpretation that followed from these findings was that the human tendency to act prosocially must have emerged recently in our evolutionary history because the trait was not present in the common ancestor of chimpanzees and humans. However, as we cautioned earlier in this chapter, convergent as well as divergent evolution should be considered. Some evidence has emerged to indicate that prosocial tendencies may be evident elsewhere in the primate order,
more specifically in a species that exhibits a social system similar to that of humans: the cooperatively breeding common marmoset. Burkart and colleagues (2007) have shown that common marmosets donated food to conspecifics in a task nearly identical to that used with chimpanzees. The authors interpreted their findings as evidence that a cooperatively breeding social system, one shared by humans and cooperatively breeding marmosets and tamarins, is key to the emergence of prosocial tendencies. Additional research is needed to determine whether the social systems of the species are the most important factor in these findings, as marmosets and chimpanzees differ in many other ways as well. However, these initial results highlight the importance of considering social factors in addition to phylogeny. Discussions of cognitive requirements for cooperation are most often made in relation to the ability to understand the role of the partner, as discussed previously, or the ability to engage in reciprocal altruism. Some have argued that the paucity of empirical evidence for reciprocal altruism is due to its steep cognitive demands (Hammerstein, 2003; Stevens & Hauser, 2004). Cognitive skills hypothesized to be necessary for reciprocal altruism include numerical quantification, time estimation, delay of gratification, detection and punishment of cheaters, analysis and recall of reputation, and inhibitory control (Stevens & Hauser, 2004). To date, these cognitive requirements have been discussed only on theoretical grounds and have not been examined empirically. Others assert that reciprocal altruism does not require complex cognition. Two forms of reciprocal altruism that do not require advanced cognition have been put forth. The first is generalized reciprocity, in which individuals decide whether to cooperate based on prior experiences, irrespective of the identity of the current partner (Pfeiffer, Rutte, Killingback, Taborsky, & Bonhoeffer, 2005). 
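The difference between generalized reciprocity and reciprocity that depends on individual reputations can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class names, the `decide` and `observe` methods, and the rule of copying the most recent experience are hypothetical simplifications introduced here, not the model of Pfeiffer et al. (2005). It shows that the generalized rule needs a single remembered outcome, whereas a reputation-based rule needs one remembered outcome per partner.

```python
class GeneralizedReciprocator:
    """Hypothetical agent: helps if its most recent partner, whoever it was, helped."""

    def __init__(self, initially_cooperative=True):
        # A single remembered outcome; no partner identities are stored.
        self.last_experience = initially_cooperative

    def decide(self, partner_id):
        # The identity of the current partner is ignored.
        return self.last_experience

    def observe(self, partner_id, partner_cooperated):
        self.last_experience = partner_cooperated


class ReputationReciprocator:
    """Hypothetical contrast: remembers what each individual partner did last."""

    def __init__(self, initially_cooperative=True):
        self.default = initially_cooperative
        # One remembered outcome per partner (requires individual recognition).
        self.memory = {}

    def decide(self, partner_id):
        return self.memory.get(partner_id, self.default)

    def observe(self, partner_id, partner_cooperated):
        self.memory[partner_id] = partner_cooperated
```

Under these assumed rules, an animal that was just refused help by partner A would also withhold help from an unfamiliar partner B if it follows the generalized rule, but would still help B if it tracks partners individually.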
Generalized reciprocity does not require analysis and recall of individual reputations. This concept has been well exemplified in recent studies with Norway rats (Rattus norvegicus; Rutte & Taborsky, 2007, 2008). The second form of reciprocal altruism is symmetry-based or attitudinal reciprocity. Symmetry-based reciprocity occurs when closely bonded individuals help one another without stipulating returns. Because of the symmetrical and long-lasting characteristics of the relationship, the benefits generally balance out over a lifetime without the need for any purposeful “scorekeeping” (de Waal & Luttrell, 1988). We have found that unrelated, pair-bonded cotton-top tamarins not only cooperate for mutual rewards but also cooperate when rewards are reciprocally distributed between actors and when rewards are repeatedly received by a single individual in the dyad. The tamarins were sensitive to the reward scenarios, cooperating most consistently for mutual rewards and least consistently when the rewards repeatedly went to the same individual, but cooperation
persisted nonetheless (Cronin et al., 2005; Cronin & Snowdon, 2008). The tamarins’ cooperative performance in this study may have been due to attitudinal reciprocity as described by de Waal and Luttrell (1988); that is, the tamarins were cooperative with their long-term mates without keeping track of the exact costs and benefits incurred. Further studies of cooperation in tamarin dyads that differ in their social relationships are needed to test this interpretation. As with the other phenomena discussed here, cooperative behavior emerges throughout the animal kingdom with little regard for phylogeny. Investigations into the social organizations of species and the social relationships within species have proved the most fruitful for explaining and predicting species and individual differences. Unfortunately, little is known about the neural processes that underlie cooperative interactions in nonhuman animals. However, scientists can gain some insight and begin to make predictions about neural involvement from recent studies performed on the most cooperative species: humans.

Cooperation and the Brain

Studies emerging from the nascent field of neuroeconomics have begun to elucidate some of the regions of the brain involved in human cooperative behavior (see also McCabe, this volume). These studies have primarily used fMRI data collected while humans engage in cooperative games. However, a single cooperative interaction necessarily includes multiple sequential stages, including the stage in which an individual decides whether to cooperate and the stage in which he or she experiences the outcome of the
cooperative interaction. Each stage is likely to recruit different brain regions, and fMRI investigations have aimed to isolate different stages of cooperative interactions. A recent fMRI study utilized the prisoner’s dilemma to evaluate neural activation during the initial stage of cooperation, when participants decided whether to cooperate with a confederate. The prisoner’s dilemma is a widely used game in which two individuals are given the simultaneous choice either to cooperate with the other individual or to defect. The payoff to the individual depends on his or her own choice and the choice of the partner, which is revealed once both individuals have made their choices. Regardless of the other player’s choice, the choice to defect yields a higher payoff than the choice to cooperate. But if both players defect, both do worse than if both had cooperated (see Figure 3.6A). Rilling et al. (2007) contrasted activations when the participant decided to cooperate with those when the participant decided to defect. Results indicated that the choice to defect was associated with stronger activation in the rostral anterior cingulate cortex (rACC) and the dorsolateral prefrontal cortex (DLPFC). The activation of the rACC and the DLPFC is interesting because the DLPFC is often associated with executive control and goal maintenance and the rACC with the processing of aversive experiences and the detection of cognitive conflict. This pattern of activation seems to indicate that defection is not a simple, favorable choice by participants. The choice to cooperate was correlated with activation in the orbitofrontal cortex (OFC; see Figure 8.12 in Chapter 8). The OFC has been implicated in the processing of rewarding and punishing factors and in emotional decision making
[Figure 3.6A: prisoner’s dilemma payoff matrix. Each cell lists the top value (Player A) / the bottom value (Player B).]
                       Player B: Cooperate    Player B: Defect
Player A: Cooperate    R = 3 / R = 3          S = 0 / T = 5
Player A: Defect       T = 5 / S = 0          P = 1 / P = 1
[Figure 3.6B: trust game decision tree. P1 decision: end the game with P1: $45, P2: $45, or pass the move to Player 2. P2 decision: P1: $180, P2: $225, or P1: $0, P2: $405.]
Figure 3.6 A. Prisoner’s dilemma matrix. Payoff awarded to Player A is the top value within each cell; payoff awarded to Player B is the bottom value within each cell. Different payoff values can be used; however, this rule must be followed: T > R > P > S. B. Trust game diagram. Each node represents a point at which a player must make a decision. At the first node, Player 1 either decides to split the benefits equally, or to allow Player 2 a move. If Player 1 decides to allow Player 2 a decision, the pot increases and both the potential gain and loss increase for Player 1. Player 2 then decides between an option that allows both players rewards but provides him/herself with a greater reward, or an option that provides him/herself with all the rewards. Different payoff values can be used. (Diagram modified from McCabe et al., 2001.)
(Bechara, Damasio, & Damasio, 2000; Ichihara-Takeda & Funahashi, 2006). Furthermore, Rilling et al. (2007) found a negative correlation between OFC and DLPFC activation that they interpreted as interplay between emotional and cognitive control leading up to the execution of cooperation or defection. Additional insight into the brain regions recruited during the decision to cooperate with another individual comes from fMRI investigations of humans engaged in a trust game that required individuals to decide whether to invest money in a second individual who could increase or decrease the payoff to the first individual (see Figure 3.6B). The investigation revealed that those participants who consistently attempted cooperation exhibited more specific prefrontal cortex (PFC) activation when playing the game with a human as compared to a computer employing a fixed, known probabilistic strategy (McCabe, Houser, Ryan, Smith, & Trouard, 2001). However, differential PFC activation between social and nonsocial versions of the game was not found for individuals who rarely cooperated. Using another version of the trust game, King-Casas et al. (2005) reported that the magnitude of neuronal activity in the dorsal striatum correlated with the intention to trust the other player, and the peak of the response shifted earlier in the decision process as player reputations developed in this reciprocal game. Rilling et al. (2002) found different patterns of neural activation depending on whether the playing partner was identified as a computer or a human. Greater activation was observed in the anteroventral striatum, the rACC, and the OFC during social cooperation than during cooperation with a computer. The OFC and ventral striatum are recruited while one is anticipating an economic reward (e.g., Padoa-Schioppa & Assad, 2006; Roesch & Olson, 2004; Schultz, Dayan, & Montague, 1997); however, Rilling et al. (2002) controlled for monetary gain in the social versus nonsocial contrast.
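The payoff logic of the two games in Figure 3.6 can be checked with a short sketch. The Python below uses the payoff values shown in the figure; the function names are hypothetical, and the assumption that players simply maximize their own monetary payoff is an illustrative simplification, not a claim about how participants in the cited studies decided. It checks that defection is the better individual choice in the prisoner’s dilemma whatever the partner does, even though mutual cooperation pays more than mutual defection, and that in the trust game a purely payoff-maximizing Player 2 would keep the whole pot, which is what makes trusting a risk for Player 1.

```python
from typing import Tuple

# Prisoner's dilemma payoffs from Figure 3.6A (must satisfy T > R > P > S).
T, R, P, S = 5, 3, 1, 0


def pd_payoffs(a_cooperates: bool, b_cooperates: bool) -> Tuple[int, int]:
    """Return (Player A payoff, Player B payoff) for one round."""
    if a_cooperates and b_cooperates:
        return R, R
    if a_cooperates:          # A cooperates, B defects
        return S, T
    if b_cooperates:          # A defects, B cooperates
        return T, S
    return P, P               # both defect


def defection_dominates() -> bool:
    """True if A earns more by defecting whatever B chooses."""
    return all(
        pd_payoffs(False, b)[0] > pd_payoffs(True, b)[0] for b in (True, False)
    )


# Trust game payoffs in dollars from Figure 3.6B, as (Player 1, Player 2).
SPLIT_EQUALLY = (45, 45)   # Player 1 ends the game at the first node
SHARE = (180, 225)         # Player 2 rewards both players
KEEP_ALL = (0, 405)        # Player 2 takes everything


def trust_game(p1_trusts: bool, p2_shares: bool) -> Tuple[int, int]:
    """Return (Player 1 payoff, Player 2 payoff) for one play of the game."""
    if not p1_trusts:
        return SPLIT_EQUALLY
    return SHARE if p2_shares else KEEP_ALL


def selfish_p2_shares() -> bool:
    """True if sharing maximizes Player 2's own payoff (hypothetical rule)."""
    return SHARE[1] > KEEP_ALL[1]


if __name__ == "__main__":
    print("Defection dominates:", defection_dominates())
    print("Mutual cooperation beats mutual defection:",
          pd_payoffs(True, True)[0] > pd_payoffs(False, False)[0])
    print("Payoff-maximizing Player 2 shares:", selfish_p2_shares())
    print("Both gain from trust plus sharing:",
          all(s > e for s, e in zip(SHARE, SPLIT_EQUALLY)))
```

The sketch treats each game as a single play. The studies described here involved repeated interactions, in which reputation and expectations about the partner can change what a player chooses.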
Some investigations have specifically examined the differences in neural activation following a potentially cooperative interaction. Using the prisoner ’s dilemma and fMRI, Rilling, Sanfey, Aronson, Nystrom, and Cohen (2004) found that participants displayed increased neural activity in the ventral striatum in response to reciprocated interactions and decreased neural activity in the ventromedial PFC in response to unreciprocated interactions. The authors interpreted these results to indicate that the mesolimbic dopamine system processes errors in predictions about whether a social partner will act reciprocally. In fact, many of the regions recruited during cooperative interactions are the same regions classically involved in processing reward. The neural circuitry often involved in reward includes the caudate and particularly the ventral striatum, amygdala, and the medial and orbital PFCs
(reviewed in Cardinal, Parkinson, Hall, & Everitt, 2002; see also Volume 2, Chapter 40). It is interesting that even when the effects of tangible rewards are controlled, social cooperative interactions continue to engage some of the reward circuitry, suggesting that cooperating with another individual may be intrinsically rewarding. Similarly, Harbaugh, Mayr, and Burghart (2007) found that voluntary monetary donations also increase the neural activity of reward regions. It seems that data obtained from neuroscientific investigations as well as behavioral observations (e.g., Schuster, 2002; see above) converge to support the hypothesis that acting cooperatively with conspecifics is rewarding, and because these regions are rich in dopamine, it is likely that dopamine is involved in these cooperative interactions. The neuropeptide oxytocin may also be involved in cooperative social interactions. Oxytocin is a nonapeptide produced in the supraoptic and paraventricular nuclei of the hypothalamus. Oxytocin is released into the periphery from magnocellular neurons of the supraoptic nuclei, which project to the posterior pituitary. In addition, oxytocin is released centrally from parvocellular neurons of the paraventricular nuclei (Uvnas-Moberg, 1998). Oxytocin has been implicated repeatedly in the coordination of positive social interactions in a wide range of species (reviewed in Uvnas-Moberg, 1998). In addition to having a well-known role in the onset of uterine contractions, milk letdown, and mother–infant bond formation (Gimpl & Fahrenholz, 2001; Levy, Kendrick, Goode, Guevara-Guzman, & Keverne, 1995; Nelson & Panksepp, 1998), oxytocin facilitates the formation and maintenance of important relationships outside the mother–infant context. Oxytocin is integral to the formation of pair bonds in monogamous voles (Carter, 1998; Carter, DeVries, & Getz, 1995; Young, 1999; see Box 3.1 and below).
Oxytocin correlates with the expression of trust in humans (Zak, Kurzban, & Matzner, 2005) and increases the amount of generosity (donation) in humans (Zak, Stanton, & Ahmadi, 2007). Furthermore, treatment with oxytocin increases one’s willingness to accept risk in interpersonal interactions (Kosfeld, Heinrichs, Zak, Fischbacher, & Fehr, 2005). Because of the involvement of oxytocin in positive social interactions, some have begun to speculate that oxytocin may play a role in prosocial behaviors such as cooperation, reciprocity, and donation (i.e., Rutte & Taborsky, 2007). The term cooperation is used in different ways by economists, psychologists, and ethologists, and, as we have shown, investigations of cooperative behavior have spanned studies of individuals acting altruistically, engaging in reciprocity, taking risks for kin, providing benefits to mates, and engaging in economic games. The common link in all of these forms of cooperation is that they provide a benefit of some sort for another individual, and the most
BOX 3.1 OXYTOCIN AND VASOPRESSIN
Oxytocin (OT) and arginine vasopressin (AVP, also known as antidiuretic hormone) are closely related in structure but have different physiological roles. Both are peptides made of nine amino acids folded into a ring, and the two hormones differ in only two amino acids. These hormones are released into the blood from the posterior pituitary gland at the base of the brain, but they are also released from neural tissue in the hypothalamus and are active in several areas of the brain. Until 20 years ago, the best known role for these hormones was in the regulation of peripheral physiological processes. OT is involved in uterine contractions at birth (and indeed a synthetic version of OT is frequently used to induce labor) and in the milk letdown reflex in nursing. AVP is involved in the contraction and relaxation of smooth muscle (e.g., blood vessels) and regulates transport of water and sodium across cell membranes, especially in the kidney. In recent years, other behavioral functions of these hormones have become widely known. OT is involved in the formation of pair bonds in monogamous female rodents (Carter, 1998), and AVP is involved in pair bond formation in monogamous male rodents. AVP is also involved in the aggressive behavior shown by territorial monogamous male rodents (Marler et al., 2003). AVP also plays a role in paternal behavior in monogamous rodents (Marler et al., 2003), and OT plays an important role in regulating female sexual behavior and maternal behavior. Both hormones are anxiolytic (i.e., they reduce anxiety), and release of OT during deep massage, stroking, and sexual intercourse has been hypothesized to play an important role in social bonding in humans. Indeed, Gonzaga, Turner, Keltner, Campos, and Altemus (2006) found that OT levels increased during displays of romantic love, but not sexual arousal, in college women, suggesting a role of OT in human pair bonding and social reinforcement.
information known so far about the brain has come from humans engaged in neuroeconomic games. However, these patterns of neuronal activation provide just a starting point for investigations of other forms of cooperative behavior. Much more information is needed to determine whether species that typically cooperate with one another, such as cooperatively hunting hyenas, have regional specializations or patterns of neuronal activation that differ from those of species that do not typically cooperate with one another. If a comparative neuroscience approach were to
be applied to cooperative behavior, one might speculate that in species in which it is beneficial to act cooperatively with conspecifics, cooperative behavior will have become associated with the brain’s reward system over the course of their evolutionary history. But hypotheses of this sort remain to be tested.
PAIR BONDS
In many mammalian species, close affiliative relationships between males and females outside of mating are rare. The closest relationships are found in species with biparental or cooperative care such as prairie voles, titi monkeys, tamarins, and marmosets. In mammals, females incur the costs of gestation and lactation, and males can never be certain of paternity of infants. Under such conditions, it has been assumed that males will be more successful attempting to fertilize as many females as possible rather than investing in parental care. However, in species with biparental or cooperative care, fathers often play a critical role in infant survival, and thus the reproductive success of both parents depends on joint infant care. Biparental, socially monogamous species are expected to have closer, more affiliative social relationships than polygamous species, and it is, in fact, under these conditions that close relationships between mates are found. The formation of the relationship has been well studied. Williams, Catania, and Carter (1992) found that prairie vole females cohabiting with a male for 24 hours with or without mating developed a strong bond. Cohabitation with mating for less than 24 hours, but not cohabitation alone, also led to pair formation. Savage, Ziegler, and Snowdon (1988) found that newly paired cotton-top tamarins spent more time in contact, in grooming, and engaged in sexual activity than did established pairs, with males initiating affiliation more often than females. Widowski, Porter, Ziegler, and Snowdon (1992) found that exposure to a novel cotton-top tamarin male with no direct physical contact led to ovulation in reproductively suppressed female cotton-top tamarins. Schaffner, Shepherd, Santos, and French (1995) also found that male black tufted-ear marmosets (Callithrix kuhli) initiated more proximity behavior in the first 40 days after pairing and that sexual behavior decreased over time.
Silva and Sousa (1997) studied variation in pair formation in sexually naive common marmosets and found that those that became pregnant within the first 10 weeks had more affiliative behavior, especially grooming, and greater coordination of behavior than those pairs that did not conceive immediately. This is the only study on the reproductive consequences of within-species variation in pair formation.
Comparative Cognition and Neuroscience
Determination that a pair bond exists requires some experimental documentation of attachment: that an animal is distressed upon separation from its mate and displays greater affiliation than baseline upon reunion, that an individual will behave aggressively toward intruders, or that an individual will preferentially seek contact with its mate when given a choice between conspecifics. Mendoza and Mason (1986b) found greater disturbance, more aggression to intruders, and higher cortisol levels in the monogamous titi monkey (Callicebus moloch) compared to the polygynous squirrel monkey (Saimiri sciureus). When separated and offered a choice between mate and infant, titi monkeys chose their mates preferentially. Infants that were separated from both parents chose fathers over mothers (Mendoza & Mason, 1986a). Subsequently, when paired adult titi monkeys were separated for 30 min or 5 days and tested with either their mate or an opposite-sex stranger, male and female titi monkeys showed equal affiliation upon reunion with the mate. In contrast, they showed high levels of arousal when placed with an opposite-sex stranger. These results suggest strong and lasting relationships (Fernandez-Duque, Mason, & Mendoza, 1997). Cotton-top tamarins also exhibit distress when separated from mates for short time periods, with increased rates of long calling (which is used for within-group cohesion and by lost animals) during the period of separation and increased affiliative and sexual behavior when reunited after separation (Snowdon & Ziegler, 2007). Males are more vocal and appear more disturbed by separation than females. They also display levels of aggression toward intruders of both sexes, with males displaying equal amounts whether the mate is present or absent and females displaying higher levels only in the presence of the mate. Males also groom females significantly more often than females groom males (Snowdon & Ziegler, 2007).
All of these results suggest that males are more responsible for maintaining relationships than females. Partners in pair-bonded species can also buffer against the effects of stress. T. E. Smith and French (1997) tested male and female tufted-ear marmosets in novel environments and found that levels of cortisol were lower when pairs were together in a novel environment than when they were alone. Rukstalis and French (2005) played back calls of the mate to isolated marmosets and found that the calls alone reduced cortisol levels as well as stress-related behavior. In summary, lasting adult heterosexual affiliative relationships are found in a few species that are socially monogamous or cooperatively breed. Several studies have looked at behavioral changes during pair formation and have evaluated attachment through studies involving challenges by unfamiliar animals of same and opposite sex and studies involving separation and reunion. The individual
recognition of a specific individual as a mate and the differential response to the mate versus other potential mates are also key components of a pair bond. What are the neural and hormonal mechanisms that maintain these relationships?
Role of Oxytocin and Vasopressin
Oxytocin is a neuropeptide that has been implicated in affiliative relationships. In the past 15 years, extensive research, primarily on monogamous voles, has indicated the importance of oxytocin in formation of affiliative relationships, especially in females (for reviews, see the following: mother–infant relationships, Nelson & Panksepp, 1998; and heterosexual adult relationships, Carter, 1998; Carter et al., 1995; Insel, 2003). Oxytocin, but not the closely related nonapeptide vasopressin, appears critical for female prairie voles in relationship formation (Insel & Hulihan, 1995). However, vasopressin levels increase in male prairie voles after pair formation, suggesting a sex difference in which hormones are involved in a pair bond for these rodents (Winslow, Hastings, Carter, Harbaugh, & Insel, 1993). In non-pair-bonding species of voles, oxytocin is not effective for inducing a pair bond and has a different distribution of receptors in the brain (reviewed by Young, 1999). Nonetheless, oxytocin infused chronically into male rats (a polygamous species) led to increased social interactions (Witt, Winslow, & Insel, 1992). Pedersen and Boccia (2002) found that oxytocin was critical for initiation and maintenance of sexual behavior in female rats. Oxytocin injected into the medial preoptic area of male rats also facilitated social recognition, whereas vasopressin did not (Popik & van Ree, 1991). However, Dantzer, Koob, Bluthé, and Le Moal (1988) reported that vasopressin in the septal region facilitated social memory in rats.
Oxytocin knockout mice fail to recognize familiar conspecifics, and oxytocin administered to the medial amygdala prior to initial exposure facilitated social recognition (Bielsky & Young, 2004; Ferguson, Aldag, Insel, & Young, 2001). Rosenblum et al. (2002) compared highly affiliative bonnet macaques (Macaca radiata) with socially distant pigtail macaques (M. nemestrina) and found that the more affiliative bonnet macaques had higher cerebrospinal fluid oxytocin levels and lower corticotrophin-releasing hormone levels than the pigtail macaques. Winslow, Noble, Lyons, Sterk, and Insel (2003) reported that mother-reared male rhesus macaques had elevated cerebrospinal fluid oxytocin levels compared with nursery-reared macaques, and there was a significant correlation between oxytocin levels and affiliative behavior. In contrast, vasopressin levels did not differ with rearing condition but correlated positively with fearful behavior in rhesus macaques.
Interest in oxytocin in humans has increased recently because of the role of oxytocin in both mother–infant attachment and adult affiliative behavior. Wismer Fries, Ziegler, Kurian, Jacoris, and Pollak (2005) found that children had elevated oxytocin levels when performing a task that included contact by the mother but not contact by an unfamiliar female. In contrast, children adopted from Eastern European orphanages several years previously showed low oxytocin responses to both adoptive mothers and an unfamiliar female, indicating that early experience affects the oxytocin system. Oxytocin administered intranasally increases trust among humans (Kosfeld et al., 2005; Zak et al., 2005). Infusions of oxytocin into people with autism and with Asperger syndrome reduced repetitive behavior (Hollander et al., 2003) and increased retention of social cognition (Hollander et al., 2006). These studies support the role of oxytocin in producing calm behavior and facilitating social memories in humans. Turner, Altemus, Enos, Cooper, and McGuinness (1999) found that women asked to recount a negative experience of loss or abandonment showed decreased levels of serum oxytocin correlated with the degree of negative emotion expressed. Using data from Turner et al. (1999), Gonzaga et al. (2006) reported that women engaging in affiliation signals with a romantic partner had a significant positive correlation between the amount of positive signaling and serum oxytocin. Grewen, Girdler, Amico, and Light (2005) reported higher plasma oxytocin in both men and women in relationships with strong partner support. Thus, changes in oxytocin appear to track affiliative and positive emotional states in both sexes and across many species. In humans, peripheral measures of oxytocin levels appear to correlate with differences in affiliation, and peripheral infusions of oxytocin can increase affiliative behavior and social memory.
Most comparative research on pair bonds has involved evaluating species differences; however, within-species individual differences in relationship quality and hormonal correlates are found in nonhuman primates as well. We have observed a fivefold variation in the amount of affiliative behavior between pairs of cotton-top tamarins and have also observed a fivefold variation in urinary oxytocin levels in both sexes that correlates with the amount of affiliative behavior expressed. Variation in sexual behavior explains most of the variance in male oxytocin levels, whereas variation in contact and grooming explains most of the variance in female oxytocin levels (Snowdon et al., in preparation). In summary, there is clear evidence for the importance of oxytocin for pair bonding in females of monogamous vole species and for the importance of vasopressin in male voles. Oxytocin can supersede the effects of mating
in pair bond formation. Oxytocin affects sexual behavior in female rodents and macaques and also appears to be important for social recognition in male rodents. In rodents, tamarins, and human primates, peripheral measures of oxytocin appear to be correlated with affiliation, attachment, trust, and positive emotions, and peripheral administration of oxytocin can alter affiliative behavior.
Brain Mechanisms
Comparative work between monogamous and polygynous voles has shown that monogamous female voles but not polygamous voles have increased oxytocin receptor density in the nucleus accumbens and caudate putamen and greater vasopressin receptor density in the ventral pallidum (Table 3.2; Young & Wang, 2004). Little is known about oxytocin distribution in monogamous, pair-bonded primate species, but Wang, Moody, Newman, and Insel (1997) found that there were no sex differences in the distribution of immunoreactive oxytocin neurons and fibers in the common marmoset. Oxytocin immunoreactive neurons were found in the paraventricular and supraoptic nuclei of the hypothalamus, the bed nucleus of the stria terminalis, and the medial amygdala. Vasopressin cells were found in the paraventricular, supraoptic, and suprachiasmatic nuclei and in the lateral area of the hypothalamus. The only sex difference was that males had a greater density of vasopressin reactive cells in the bed nucleus of the stria terminalis than females. In a biparental species, in which both parents are essential for infant care and both sexes contribute to the formation and maintenance of a pair bond, it is reasonable to find no sex differences in distribution of oxytocin-reactive neurons in the brain. In further work on common marmosets, Wang, Toloczko, et al. (1997) found vasopressin receptor binding in the nucleus accumbens, diagonal band, lateral septum, bed nucleus of the stria terminalis, amygdala, and anterodorsal and ventromedial hypothalamus, in addition to areas with immunoreactivity.
Marmosets differed from voles in important ways: No vasopressin-producing cells were found in the amygdala. There was no plexus of immunoreactive fibers in the lateral septum. There was much greater visualization of vasopressin immunoreactive cells in the bed nucleus of the stria terminalis. Taken together, the results on marmosets suggest relatively little sexual dimorphism, an extensive overlap of oxytocin and vasopressin immunoreactive cells, and a different distribution than that found in rodents. So far, pair formation and maintenance has been treated as a unitary, species-specific trait, and yet there may be considerable individual variation within species, as we have noted. Recently, Hammock and Young (2005) described variation in social engagement and affiliative
behavior in monogamous male prairie voles and correlated this variation with a polymorphism in the promoter region of the vasopressin 1a receptor. Thus, variation in the promoter region of the receptor genes, and consequently the expression of vasopressin 1a receptors, may account for behavioral variation within a species. Similar studies remain to be done with respect to oxytocin receptors. The variation in receptor sensitivity to estrogen described by Champagne et al. (2001) as a function of early maternal licking and grooming suggests the possibility for an interaction between early experience and gene expression.

TABLE 3.2 Oxytocin and arginine vasopressin receptor distribution as a function of mating system in voles and monogamous common marmosets.

Brain Area                          Monogamous vs. Polygamous Vole (a, b)   Monogamous Primate (c)
Oxytocin
  Nucleus accumbens                 Monogamous                              NA
  Prelimbic cortex                  Monogamous                              NA
  Bed nucleus of stria terminalis   Monogamous                              NA
  Midline thalamus                  Monogamous                              NA
  Ventral reuniens                  Monogamous                              NA
  Lateral amygdala                  Monogamous                              NA
  Central amygdala                  Both                                    NA
  Lateral septum                    Polygamous                              NA
  Ventromedial hypothalamus         Polygamous                              NA
Vasopressin
  Diagonal band                     Monogamous                              Yes
  Laterodorsal thalamus             Monogamous                              No
  Central amygdala                  Monogamous                              Yes
  Basolateral amygdala              Monogamous                              Yes
  Bed nucleus stria terminalis      Monogamous                              Yes
  Nucleus accumbens                 Both                                    Yes
  Accessory olfactory bulb          Both                                    No
  Superior colliculus               Both                                    No
  Lateral septum                    Polygamous                              Yes
  Periventricular hypothalamus      NA                                      Yes
  Ventromedial hypothalamus         NA                                      Yes
  Suprachiasmatic hypothalamus      NA                                      Yes

(a) Oxytocin data from “Oxytocin Receptor Distribution Reflects Social Organization in Monogamous and Polygamous Voles,” by T. R. Insel and L. E. Shapiro, 1992, Proceedings of the National Academy of Sciences, USA, 89, pp. 5981–5985.
(b) Vasopressin data from Insel, Wang & Ferris (1994).
(c) Marmoset vasopressin data from “Vasopressin in the Forebrain of Common Marmosets (Callithrix jacchus): Studies with in situ Hybridization, Immunocytochemistry and Receptor Autoradiography,” by Z. Wang, D. Toloczko, L. J. Young, K. Moody, J. D. Newman & T. R. Insel, 1997, Brain Research, 768, pp. 147–156.
Note. NA = not reported or known.

SUMMARY
We have demonstrated that both the proximate and ultimate mechanisms contributing to inter- and intraspecific variation across a wide range of cognitive phenomena can be best understood through a comparative lens that integrates not only phylogeny and brain size but also social and ecological factors. In our discussions of social learning, cooperation, spatial memory, and pair bonding, we have emphasized strong theoretical predictions about socioecological influences on the expression of these behaviors within and across taxa, and, when data have been available, we have highlighted underlying neuronal and hormonal differences that may relate to the observed variation. One major implication of this comparative perspective is that no single species will likely serve as a suitable model for understanding the myriad of interesting human cognitive abilities. For example, there is strong momentum to use various species of macaques as model species for humans in neuroscience, but we should keep in mind the potential implications of the differing social and ecological pressures that macaque and human lineages have faced since the divergence of our ancestry nearly 40 million years ago. Awareness of the cognitive domains in which macaques and humans would be expected from a socioecological perspective to differ may sharpen interpretations of findings from macaque studies and generate interesting hypotheses about interspecific variation if we were to look beyond humans and macaques. To better understand the selective pressures that have contributed to the wide range of expression of cognitive abilities across taxa, we must embrace data collected on primates and nonprimates alike and rely on multifaceted hypotheses that incorporate social and ecological pressures as well as phylogeny and brain size. Interdisciplinary collaboration is integral to accomplishing this feat but should yield a more comprehensive and generalizable understanding of both human and nonhuman cognition.
REFERENCES
Aiello, L., & Dean, C. (1990). An introduction to human evolutionary anatomy. London: Academic Press.
Aisner, R., & Terkel, J. (1992). Ontogeny of pine cone opening behaviour in the black rat Rattus rattus. Animal Behaviour, 44, 327–336.
Alexander, R. D. (1974). The evolution of social behavior. Annual Review of Ecology and Systematics, 5, 325–383.
Alger, S. J., & Riters, L. V. (2006). Lesions to the medial preoptic nucleus differentially affect singing and nest box-directed behaviors within and outside of the breeding season in European starlings (Sturnus vulgaris). Behavioral Neuroscience, 120, 1326–1336.
Axelrod, R., & Hamilton, W. D. (1981, March 27). The evolution of cooperation. Science, 211, 1390–1396.
Call, J., & Tomasello, M. (1996). The effects of humans on the cognitive development of apes. In A. E. Russon, K. S. Bard, & S. T. Parker (Eds.), Reaching into thought (pp. 371–403). Cambridge, England: Cambridge University Press.
Barkley, C. L., & Jacobs, L. F. (2007). Sex and species differences in spatial memory in food-storing kangaroo rats. Animal Behaviour, 73, 321–329.
Cambefort, J. P. (1981). A comparative study of culturally transmitted patterns of feeding habits in the chacma baboon (Papio ursinus) and the vervet monkey (Cercopithecus aethiops). Folia Primatologica, 36, 243–263.
Barrett, L., Henzi, P., & Rendall, D. (2007). Social brains, simple minds: Does social complexity really require cognitive complexity? Philosophical Transactions of the Royal Society. Series B., 362, 561–575.
Campbell, M. W., & Snowdon, C. T. (2007). Vocal response of captive-reared Saguinus oedipus during mobbing. International Journal of Primatology, 28, 257–270.
Bartecki, U., & Heymann, E. W. (1987). Field observation of snake-mobbing in a group of saddleback tamarins Saguinus fuscicollis nigrifrons. Folia Primatologica, 48, 199–202.
Campbell, M. W., & Snowdon, C. T. (in press). Can captive-reared cotton-top tamarins, Saguinus oedipus, learn to mob a predator? International Journal of Primatology.
Bechara, A., Damasio, H., & Damasio, A. R. (2000). Emotion, decision making and the orbitofrontal cortex. Cerebral Cortex, 10, 295–307.
Cardinal, R. N., Parkinson, J. A., Hall, J., & Everitt, B. J. (2002). Emotion and motivation: The role of the amygdala, ventral striatum, and prefrontal cortex. Neuroscience and Biobehavioral Reviews, 26, 321–352.
Balda, R. P., & Kamil, A. C. (1992). Long-term spatial memory in Clark’s nutcracker, Nucifraga columbiana. Animal Behaviour, 44, 761–769.
Bednekoff, P. A., Balda, R. P., Kamil, A. C., & Hile, A. G. (1997). Long-term spatial memory in four seed-caching corvid species. Animal Behaviour, 53, 335–341.
Bester-Meredith, J. K., & Marler, C. A. (2001). Vasopressin and aggression in cross-fostered California mice (Peromyscus californicus) and white-footed mice (Peromyscus leucopus). Hormones and Behavior, 40, 51–64.
Bielsky, I. F., & Young, L. J. (2004). Oxytocin, vasopressin and social recognition in mammals. Peptides, 25, 1565–1574.
Boesch, C., & Boesch, H. (1989). Hunting behavior of wild chimpanzees in the Tai national park. American Journal of Physical Anthropology, 78, 547–573.
Bond, A. B., Kamil, A. C., & Balda, R. P. (2007). Serial reversal learning and the evolution of behavioral flexibility in three species of North American corvids (Gymnorhinus cyanocephalus, Nucifraga columbiana, Aphelocoma californica). Journal of Comparative Psychology, 121, 372–379.
Box, H. O. (1984). Primate behavior and socioecology. London: Chapman & Hall.
Boyd, R., & Richerson, P. J. (1988). An evolutionary model of social learning: The effects of spatial and temporal variation. In T. R. Zentall & B. G. Galef (Eds.), Social learning: Psychological and biological perspectives (pp. 29–48). Hillsdale, NJ: Erlbaum.
Brodin, A. (2005). Hippocampal volume does not correlate with food-hoarding rates in the black-capped chickadee (Poecile atricapillus) and willow tit (Parus montanus). Auk, 122, 819–828.
Byrne, R. W. (1999). Imitation without intentionality: Using string parsing to copy the organization of behaviour. Animal Cognition, 2, 63–72.
Carter, C. S. (1998). Neuroendocrine perspectives on social attachment and love. Psychoneuroendocrinology, 23, 779–818.
Carter, C. S., DeVries, A. C., & Getz, L. L. (1995). Physiological substrates of mammalian monogamy: The prairie vole model. Neuroscience and Biobehavioral Reviews, 19, 303–314.
Chalmeau, R. (1994). Do chimpanzees cooperate in a learning task? Primates, 35, 385–392.
Chalmeau, R., & Gallo, A. (1996a). Cooperation in primates: Critical analysis of behavioural criteria. Behavioural Processes, 35, 101–111.
Chalmeau, R., & Gallo, A. (1996b). What chimpanzees (Pan troglodytes) learn in a cooperative task. Primates, 37, 39–47.
Chalmeau, R., Lardeux, K., Brandibas, P., & Gallo, A. (1997). Cooperative problem solving by orangutans (Pongo pygmaeus). International Journal of Primatology, 18, 23–32.
Chalmeau, R., Visalberghi, E., & Gallo, A. (1997). Capuchin monkeys, Cebus apella, fail to understand a cooperative task. Animal Behaviour, 54, 1215–1225.
Champagne, F., Diorio, J., Sharma, S., & Meaney, M. J. (2001). Naturally occurring variations in maternal behavior in the rat are associated with differences in estrogen-inducible central oxytocin receptors. Proceedings of the National Academy of Sciences, USA, 98, 12736–12741.
Clayton, N. S. (1998). Memory and the hippocampus in food-storing birds: A comparative approach. Neuropharmacology, 37, 441–452.
Brosnan, S. F., & de Waal, F. B. M. (2002). A proximate perspective on reciprocal altruism. Human Nature, 13, 129–152.
Clayton, N. S., Yu, K. S., & Dickinson, A. (2001). Scrub jays (Aphelocoma coerulescens) form integrated memories of the multiple features of caching episodes. Journal of Experimental Psychology: Animal Behavior Processes, 27, 17–29.
Brown, J. L. (1983). Cooperation: A biologist’s dilemma. Advances in the Study of Behavior, 13, 1–37.
Coussi-Korbel, S., & Fragaszy, D. M. (1995). On the relation between social dynamics and social learning. Animal Behaviour, 50, 1441–1453.
Brown, J. L. (1987). Helping and communal breeding in birds: Ecology and evolution. Princeton, NJ: Princeton University Press.
Creel, S., & Creel, N. M. (1995). Communal hunting and pack size in African wild dogs, Lycaon pictus. Animal Behaviour, 50, 1325–1339.
Brownell, C. A., Ramani, G. B., & Zerwas, S. (2006). Becoming a social partner with peers: Cooperation and social understanding in one- and two-year-olds. Child Development, 77, 803–821.
Cronin, K. A., Kurian, A. V., & Snowdon, C. T. (2005). Cooperative problem solving in a cooperatively breeding primate (Saguinus oedipus). Animal Behaviour, 69, 133–142.
Burkart, J. M., Fehr, E., Efferson, C., & van Schaik, C. P. (2007). Other-regarding preferences in a non-human primate: Common marmosets provision food altruistically. Proceedings of the National Academy of Sciences, USA, 104, 19762–19766.
Cronin, K. A., & Snowdon, C. T. (2008). The effects of unequal reward distributions on cooperative performance by cottontop tamarins, Saguinus oedipus. Animal Behaviour, 75, 245–257.
Burton, J. J. (1977). Absence of spontaneous cooperative behavior in a troop of Macaca fuscata confronted with baited stones. Primates, 18, 359–366.
Curio, E. (1978). The adaptive significance of avian mobbing: I. Teleonomic hypotheses and predictions. Zeitschrift für Tierpsychologie, 48, 175–183.
Byrne, R. W. (1994). The evolution of intelligence. In P. J. B. Slater & T. R. Halliday (Eds.), Behavior and evolution (pp. 223–265). Cambridge, England: Cambridge University Press.
Dantzer, R., Koob, G. F., Bluthé, R.-M., & Le Moal, M. (1988). Septal vasopressin modulates social memory in rats. Brain Research, 457, 143–147.
de Waal, F. B. M., & Luttrell, L. M. (1988). Mechanisms of social reciprocity in three primate species: Symmetrical relationship characteristics or cognition? Ethology and Sociobiology, 9, 101–118.
Gonzaga, G. C., Turner, R. A., Keltner, D., Campos, B., & Altemus, M. (2006). Romantic love and sexual desire in close relationships. Emotion, 6, 163–179.
di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.
Goodall, J. (1986). The chimpanzees of Gombe: Patterns of behavior. Cambridge, MA: Belknap Press.
Dolman, C. S., Templeton, J., & Lefebvre, L. (1996). Mode of foraging competition is related to tutor preference, Zenaida aurita. Journal of Comparative Psychology, 110, 45–54.
Dominey, W. J. (1983). Mobbing in colonially nesting fishes, especially the bluegill, Lepomis macrochirus. Copeia, 1086–1088.
Drea, C. M., & Frank, L. G. (2003). The social complexity of spotted hyenas. In F. B. M. de Waal & P. L. Tyack (Eds.), Animal social complexity: Intelligence, culture, and individualized societies (pp. 121–148). Cambridge, MA: Harvard University Press.
Drea, C. M., & Wallen, K. (1999). Low-status monkeys “play dumb” when learning in mixed social groups. Proceedings of the National Academy of Sciences, USA, 96, 12965–12969.
Dugatkin, L. A. (1997). Cooperation among animals: An evolutionary perspective. Oxford, England: Oxford University Press.
Dugatkin, L. A. (2002). Animal cooperation among unrelated individuals. Naturwissenschaften, 89, 533–541.
Dunbar, R. I. M. (1991). Functional significance of grooming in primates. Folia Primatologica, 57, 121–131.
Dunbar, R. I. M. (2003). The social brain: Mind, language, and society in evolutionary perspective. Annual Review of Anthropology, 32, 163–181.
DuVal, E. H. (2007). Social organization and variation in cooperative alliances among male lance-tailed manakins. Animal Behaviour, 73, 391–401.
Fady, J. C. (1972). Absence of instrumental-type cooperation in feral Papio papio. Behaviour, 43, 157–164.
Ferguson, J. N., Aldag, J. M., Insel, T. R., & Young, L. J. (2001). Oxytocin in the medial amygdala is essential for social recognition in the mouse. Journal of Neuroscience, 21, 8278–8285.
Fernandez-Duque, E., Mason, W. A., & Mendoza, S. P. (1997). Effects of duration of separation on responses to mates and strangers in the monogamous titi monkey (Callicebus moloch). American Journal of Primatology, 43, 225–237.
Francis, D., Diorio, J., Liu, D., & Meaney, M. J. (1999, November 5). Nongenomic transmission across generations of maternal behavior and stress responses in the rat. Science, 286, 1155–1158.
Frazier, C. R. M., Trainor, B. C., Cravens, C. J., Whitney, T. K., & Marler, C. A. (2006). Paternal behavior influences development of aggression and vasopressin expression in male California mouse offspring. Hormones and Behavior, 50, 699–707.
Friant, S. C., Campbell, M. A., & Snowdon, C. T. (2008). Captive-born cotton-top tamarins (Saguinus oedipus) respond similarly to vocalizations of predators and non-predators. American Journal of Primatology, 70, 707–710.
Galef, B. G., Jr., & Allen, C. (1995). A new model system for studying behavioural traditions in animals. Animal Behaviour, 50, 705–715.
Galef, B. G., Jr., & Laland, K. N. (2005). Social learning in animals: Empirical studies and theoretical models. BioScience, 55, 489–499.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119, 593–609.
Gaulin, S. J. C., & Fitzgerald, R. W. (1989). Sexual selection for spatial learning ability. Animal Behaviour, 37, 322–331.
Grewen, K. M., Girdler, S. S., Amico, J., & Light, K. C. (2005). Effects of partner support on resting oxytocin, cortisol, norepinephrine, and blood pressure before and after warm partner contact. Psychosomatic Medicine, 67, 531–538. Gubernick, D. J., & Tefari, T. (2000). Adaptive significance of male parental care in a monogamous mammal. Proceedings of the Royal Society of London. Series B, 267, 147–150. Hamilton, W. D. (1963). The evolution of altruistic behavior. American Naturalist, 97, 354–356. Hamilton, W. D. (1964). The genetical evolution of social behavior. Journal of Theoretical Biology, 7, 1–52. Hammerstein, P. (2003). Why is reciprocity so rare in social animals? A protestant appeal. In P. Hammerstein (Ed.), The genetic and cultural evolution of cooperation (pp. 83–93). Cambridge, MA: MIT Press. Hammock, E. A. D., & Young, L. J. (2005, June 10). Microsatellite instability generates diversity in brain and sociobehavioral traits. Science, 308, 1630–1634. Hampton, R. R., & Shettleworth, S. J. (1996). Hippocampus and memory in food-storing and in a nonstoring bird species. Behavioral Neuroscience, 110, 946–964. Harbaugh, W. T., Mayr, U., & Burghart, D. R. (2007, June 15). Neural responses to taxation and voluntary giving reveal motives for charitable donations. Science, 316, 1622–1625. Hare, B., Melis, A. P., Woods, V., Hastings, S., & Wrangham, R. (2007). Tolerance allows bonobos to outperform chimpanzees on a cooperative task. Current Biology, 17, 619–623. Hare, B., & Tomasello, M. (2004). Chimpanzees are more skilful in competitive than in cooperative cognitive tasks. Animal Behaviour, 68, 571–581. Hart, B. L., & Hart, L. A. (1992). Reciprocal allogrooming in impala, Aepyceros melampus. Animal Behaviour, 44, 1073–1083. Hasler, A. D. (1966). Underwater guideposts: The homing of salmon. Madison: University of Wisconsin Press. Hauser, M. D., Chen, K. M., Chen, F., & Chuang, E. (2003). 
Give unto others: Genetically unrelated cotton-top tamarin monkeys preferentially give food to those who altruistically give food back. Proceedings of the Royal Society of London. Series B, 270, 2363–2370. Hauser, M. D., Newport, E. L., & Aslin, R. N. (2001). Statistical learning of the speech stream in a non-human primate: Statistical learning in cotton-top tamarins. Cognition, 78, B53–B64. Hayes, S. L., & Snowdon, C. T. (1990). Predator recognition in cotton-top tamarins (Saguinus oedipus). American Journal of Primatology, 20, 283–291. Hennessy, D. F., & Owings, D. H. (1978). Snake species discrimination and the role of olfactory cues in the snake-directed behavior of the California ground squirrel. Behaviour, 65, 115–124. Hodos, W., & Campbell, C. B. G. (1969). Scala naturae: Why there is no theory in comparative psychology. Psychological Review, 76, 337–350. Hollander, E., Bartz, J., Chaplin, W., Phillips, A., Sumner, J., Soorya, L., et al. (2006). Oxytocin increases retention of social cognition. Biological Psychiatry, 61, 498–503.
Gilby, I. C., Eberly, L. E., & Wrangham, W. R. (2008). Economic profitability of social predation among wild chimpanzees: Individual variation promotes cooperation. Animal Behaviour, 75, 351–360.
Hollander, E., Novatny, S., Hanratty, N., Yaffe, R., deCaria, C., Aronowitz, B. R., et al. (2003). Oxytocin infusion reduces repetitive behaviors in adults with autism and Asperger ’s disorders. Neuropsychopharmacology, 28, 193–198.
Gimpl, G., & Fahrenholz, F. (2001). The oxytocin receptor system: Structure, function, and regulation. Physiological Reviews, 81, 629–683.
Hoogland, J. L., & Sherman, P. W. (1976). Advantages and disadvantages of bank swallow coloniality. Ecological Monographs, 46, 33–58.
c03.indd Sec6:52
8/17/09 1:58:36 PM
References 53 Horner, V., & Whiten, A. (2005). Causal knowledge and imitation/emulation switching in chimpanzees (Pan troglodytes) and children (Homo sapiens). Animal Cognition, 8, 164–181.
pigeons. In T. R. Zentall & B. G. Galef, Jr. (Eds.), Social learning: Psychological and biological perspectives (pp. 141–164). Hillsdale, NJ: Erlbaum.
Humle, T., & Snowdon, C. T. (2008). Socially biased learning in the acquisition of a complex foraging task in juvenile cottontop tamarins, Saguinus oedipus. Animal Behaviour, 75, 267–277.
Levy, F., Kendrick, K. M., Goode, J. A., Guevara-Guzman, R., & Keverne, E. B. (1995). Oxytocin and vasopressin release in the olfactory bulb of parturient ewes: Changes with maternal experience and effects on acetylcholine, gamma-aminobutyric acid, glutamate and noradrenaline release. Brain Research, 669, 197–206.
Ichihara-Takeda, S., & Funahashi, S. (2006). Reward-period activity in primate dorsolateral prefrontal and orbitofrontal neurons is affected by reward schedules. Journal of Cognitive Neuroscience, 18, 212–226. Inoue-Nakamura, N., & Matsuzawa, T. (1997). Development of stone tool use by wild chimpanzees (Pan troglodytes). Journal of Comparative Psychology, 111, 159–173. Insel, T. R. (2003). Is social attachment an addictive disorder? Physiology and Behavior, 79, 351–357. Insel, T. R., & Hulihan, T. J. (1995). A gender-specific mechanism for pair bonding: Oxytocin and partner preference formation in monogamous voles. Behavioral Neuroscience, 109, 782–789.
Liu, D., Diorio, J., Tannenbaum, B., Caldji, C., Francis, D., Freedman, A., et al. (1997, September 12). Maternal care hippocampal glucocorticoid receptors and hypothalamic-pituitary-adrenal responses to stress. Science, 277, 1659–1662. Lonsdorf, A. V., Eberly, L. E., & Pusey, A. E. (2004, April 15). Sex differences in learning in chimpanzees. Nature, 428, 715–716.
Insel, T. R., & Shapiro, L. E. (1992). Oxytocin receptor distribution reflects social organization in monogamous and polygamous voles. Proceedings of the National Academy of Sciences, USA, 89, 5981–5985.
Lupfer, G., Frieman, J., & Coonfield, D. (2003). Social transmission of flavor preferences in two species of hamsters (Mesocricetus auratus and Phodopus campbelli). Journal of Comparative Psychology, 117, 449–455.
Insel, T. R., Wang, Z.-X., & Ferris, C. F. (1994). Patterns of brain vasopressin receptor distribution associated with social organization in microtine rodents. Journal of Neuroscience, 14, 5381–5392.
Marler, C. A., Bester-Meredith, J. K., & Trainor, B. C. (2003). Paternal behavior and aggression: Endocrine mechanisms and non-genomic transmission of behavior. Advances in the Study of Behavior, 32, 263–323.
Jacobs, L. F., Gaulin, S. J. C., Sherry, D. G., & Hoffman, G. E. (1990). Evolution of spatial cognition: Sex specific patterns of spatial behavior predict hippocampal size. Proceedings of the National Academy of Sciences, USA, 87, 6349–6352.
Matsuzawa, T. (Ed.). (2001). Primate origins of human cognition and behavior. Tokyo: Springer-Verlag.
Jensen, K., Hare, B., Call, J., & Tomasello, M. (2006). What’s in it for me? Self-regard precludes altruism and spite in chimpanzees. Proceedings of the Royal Society of London. Series B, 273, 1013–1021. Johnston, T. D. (1982). The selective costs and benefits of learning: An evolutionary analysis. Advances in the Study of Behavior, 12, 65–106. Jouventin, P., Pasteur, G., & Cambefort, J. P. (1976). Observational learning of baboons and avoidance of mimics: Exploratory tests. Evolution, 31, 214–218.
Mayr, E. (1961, November 10). Cause and effect in biology. Science, 134, 1501–1506. McCabe, K., Houser, D., Ryan, L., Smith, V., & Trouard, T. (2001). A functional imaging study of cooperation in two-person reciprocal exchange. Proceedings of the National Academy of Sciences, USA, 98, 11832–11835. Melis, A. P., Hare, B., & Tomasello, M. (2006). Engineering cooperation in chimpanzees: Tolerance constraints on cooperation. Animal Behaviour, 72, 275–286.
Kawai, M. (1965). Newly acquired pre-cultural behavior of the natural troop of Japanese monkeys on Koshima Islet. Primates, 6, 1–30.
Mendoza, S. P., & Mason, W. A. (1986a). Contrasting response to intruders and involuntary separation by monogamous and polygamous New World monkeys. Physiology and Behavior, 38, 795–801.
Keysers, C., Kohler, E., Umiltà, M. A., Nanetti, L., Fogassi, L., & Gallese, V. (2003). Audiovisual mirror neurons and action recognition. Experimental Brain Research, 153, 628–636.
Mendoza, S. P., & Mason, W. A. (1986b). Parental division of labor and differentiation of attachments in a monogamous primate (Callicebus moloch). Animal Behaviour, 34, 1336–1347.
King-Casas, B., Tomlin, D., Anen, C., Camerer, C. F., Quartz, S. R., & Montague, P. R. (2005, April 1). Getting to know you: Reputation and trust in a two-person economic exchange. Science, 308, 78–83.
Mendres, K. A., & de Waal, F. B. M. (2000). Capuchins do cooperate: The advantage of an intuitive task. Animal Behaviour, 60, 523–529.
Klopfer, P. H. (1959). Social interactions in discrimination learning with special reference to feeding behavior in birds. Behaviour, 14, 282–299. Klopfer, P. H. (1961). Observational learning in birds: The establishment of behavioral modes. Behaviour, 17, 71–80. Kohler, E., Keysers, C., Umiltà, M.A., Fogassi, L., Gallese, V., & Rizzolatti, G. (2002, August 2). Hearing sounds, understanding actions: Action representation in mirror neurons. Science, 297, 846. Kosfeld, M., Heinrichs, M., Zak, P. J., Fischbacher, U., & Fehr, E. (2005, June 2). Oxytocin increases trust in humans. Nature, 435, 673–676. Laland, K. N. (2004). Social learning strategies. Learning and Behavior, 32, 4–14. Lazaro-Perea, C., de Fatima Arruda, M., & Snowdon, C. T. (2004). Grooming as a reward? Social function of grooming between females in cooperatively breeding marmosets. Animal Behaviour, 67, 627–636.
c03.indd Sec6:53
Liu, D., Diorio, J., Day, J. C., Francis, D. D., & Meaney, M. J. (2000). Maternal care hippocampal synaptogenesis and cognitive development in rats. Nature Neuroscience, 3, 799–806.
Miklosi, A. (1999). The ethological analysis of imitation. Biological Reviews, 74, 347–374. Milinski, M. (1987, January 29). Tit for tat in sticklebacks and the evolution of cooperation. Nature, 325, 433–435. Nelson, E. E., & Panksepp, J. (1998). Brain substrates of infant-mother attachment: Contributions of opioids, oxytocin, and norepinephrine. Neuroscience and Biobehavioral Reviews, 22, 437–452. Nicol, C. J., & Pope, S. J. (1994). Social learning in small flocks of laying hens. Animal Behaviour, 47, 1289–1296. Nottebohm, F. (1981, December 18). A brain for all seasons: Cyclical anatomic changes in song control nuclei of the canary brain. Science, 214, 1368–1370. Nowak, R. M. (1999). Walker ’s primates of the world. Baltimore: Johns Hopkins University Press.
Leavens, D. A., Hopkins, W. D., & Bard, K. A. (2005). Understanding the point of chimpanzee pointing: Epigenesis and ecological validity. Current Directions in Psychological Science, 14, 185–189.
Owren, M. J., & Rendall, D. (1997). An affect-conditioning model of nonhuman primate vocal signaling. In M. D. Beecher, D. H. Owings, & N. S. Thompson (Eds.), Perspectives in ethology (Vol. 12, pp. 399– 346). New York: Plenum Press.
Lefebvre, L., & Palameta, B. (1988). Mechanisms, ecology, and population diffusion of socially learned, food-finding behavior in feral
Packer, C., & Ruttan, L. (1988). The evolution of cooperative hunting. American Naturalist, 132, 159–198.
8/17/09 1:58:36 PM
54
Comparative Cognition and Neuroscience
Padoa-Schioppa, C., & Assad, J. A. (2006, May 11). Neurons in the orbitofrontal cortex encode economic value. Nature, 441, 223–226. Pedersen, C. A., & Boccia, M. L. (2002). Oxytocin maintains as well as initiates female sexual behavior: Effects of a highly selective oxytocin antagonist. Hormones and Behavior, 41, 170–177. Petit, O., Desportes, C., & Thierry, B. (1992). Differential probability of “coproduction” in two species of macaque (Macaca tonkeana, M. mulatta). Ethology, 90, 107–120. Pfeiffer, T., Rutte, C., Killingback, T., Taborsky, M., & Bonhoeffer, S. (2005). Evolution of cooperation by generalized reciprocity. Proceedings of the Royal Society of London. Series B, 272, 1115–1120. Popik, P., & van Ree, J. M. (1991). Oxytocin but not vasopressin facilitates social recognition following injection into the medial preoptic area of the rat brain. European Neuropsychopharmacology, 1, 555–560. Pravosudov, V. V., & Clayton, N. S. (2002). A test of the adaptive specialization hypothesis: Population differences in caching memory and the hippocampus in black-capped chickadee (Poecile atricapilla). Behavioral Neuroscience, 116, 515–522. Pravosudov, V. V., & de Kort, S. R. (2006). Is the Western scrub-jay (Aphelocoma californica) really an underdog among food-caching corvids when it comes to hippocampal volume and food caching propensity? Brain, Behavior and Evolution, 67, 1–9. Premack, D. G., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1, 515–526. Ribble, D. O. (1991). The monogamous mating system of Peromyscus californicus as revealed by DNA fingerprinting. Behavioral Ecology and Sociobiology, 29, 161–166. Rilling, J. K., Glenn, A. L., Jairam, M. R., Pagnoni, G., Goldsmith, D. R., Elfenbein, H. A., et al. (2007). Neural correlates of social cooperation and non-cooperation as a function of psychopathy. Biological Psychiatry, 61, 1260–1271. Rilling, J. K., Gutman, D. A., Zeh, T. R., Pagnoni, G., Berns, G. 
S., & Kilts, C. D. (2002). A neural basis for social cooperation. Neuron, 35, 395–405. Rilling, J. K., Sanfey, A. G., Aronson, J. A., Nystrom, L. E., & Cohen, J. D. (2004). Opposing bold responses to reciprocated and unreciprocated altruism in putative reward pathways. NeuroReport, 15, 2539–2543. Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192. Rizzolatti, G., Fadiga, L., Gallese, V., & Fogassi, L. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131–141. Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding of imitation and action. Nature Reviews Neuroscience, 2, 661–670. Roesch, M. R., & Olson, C. R. (2004, April 9). Neuronal activity related to reward value and motivation in primate frontal cortex. Science, 304, 307–310. Rosenblum, L. A., Smith, E. L. P., Altemus, M., Scharf, B. A., Owens, M. J., Nemeroff, C. B., et al. (2002). Differing concentrations of corticotrophin releasing factor and oxytocin in the cerebrospinal fluid of bonnet and pigtail macaques. Psychoneuroendocrinology, 27, 651–660. Rukstalis, M., & French, J. A. (2005). Vocal buffering of the stress response: Exposure to conspecific vocalizations moderates urinary cortisol excretion in isolated marmosets. Hormones and Behavior, 47, 1–7. Rutte, C., & Taborsky, M. (2007). Generalized reciprocity in rats. Public Library of Science, 5, 1421–1425. Rutte, C., & Taborsky, M. (2008). The influence of social experience on cooperative behaviour of rats (Rattus norvegicus): Direct vs. generalised reciprocity. Behavioral Ecology and Sociobiology, 62, 499–505. Rypstra, A. L., & Tirey, R. S. (1991). Prey size, prey perishability and group foraging in a social spider. Oecologia, 86, 25–30.
c03.indd Sec6:54
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996, December 13). Statistical leaning in 8-month-old infants. Science, 274, 1926–1928. Savage, A., Ziegler, T. E., & Snowdon, C. T. (1988). Sociosexual development, pair bond formation and mechanisms of fertility suppression in female cotton-top tamarins (Saguinus oedipus oedipus). American Journal of Primatology, 14, 345–359. Schaffner, C. M., Shepherd, R. E., Santos, C. V., & French, J. A. (1995). Development of heterosexual relationship in Wied’s black tufted-ear marmosets (Callithrix kuhli). American Journal of Primatology, 36, 185–200. Schaller, G. B. (1972). The Serengeti lion. Chicago: University of Chicago Press. Scheel, D., & Packer, C. (1991). Group hunting behaviour of lions: A search for cooperation. Animal Behaviour, 41, 697–709. Schultz, W., Dayan, P., & Montague, P. R. (1997, March 14). A neural substrate of prediction and reward. Science, 275, 1593–1599. Schuster, R. (2002). Cooperative coordination as a social behavior. Human Nature, 13, 47–83. Schuster, R., & Perelberg, A. (2004). Why cooperate? An economic perspective is not enough. Behavioural Processes, 66, 261–277. Schwab, C., Bugnyar, T., Schloegl, C., & Kotrschal, K. (2008). Enhanced social learning between siblings in common ravens, Corvus corax. Animal Behaviour, 75, 501–508. Sherry, D. F., Jacobs, L. F., & Gaulin, S. J. C. (1992). Spatial memory and adaptive specialization of the hippocampus. Trends in Neuroscience, 15, 298–303. Shields, W. M. (1984). Barn swallow mobbing: Self defence, collateral defence, group defence or paternal care? Animal Behaviour, 32, 132–148. Silk, J. B., Brosnan, S. F., Vonk, J., Henrich, J., Povinelli, D. J., Richardson, A. S., et al. (2005, October 27). Chimpanzees are indifferent to the welfare of unrelated group members. Nature, 437, 1357–1359. Silva, H. P. A., & Sousa, M. B. C. (1997). 
The pair-bond formation and its role in the stimulation of reproductive function in female common marmosets (Callithrix jacchus). International Journal of Primatology, 18, 387–400. Smith, G. T., Brenowitz, E. A., Beecher, M. D., & Wingfield, J. C. (1997). Seasonal changes in testosterone, neural attributes of song control nuclei, and song structure in wild songbird. Journal of Neuroscience, 17, 6001–6010. Smith, T. E., & French, J. A. (1997). Social and reproductive conditions modulate urinary cortisol excretion in black tufted-ear marmosets (Callithrix kuhli). American Journal of Primatology, 42, 253–267. Snowdon, C. T., & Boe, C. Y. (2003). Social communication about unpalatable foods in tamarins (Saguinus oedipus). Journal of Comparative Psychology, 117, 142–148. Snowdon, C. T., & Cronin, K. A. (2007). Cooperative breeders do cooperate. Behavioural Processes, 76, 138–141. Snowdon, C. T., & Ziegler, T. E. (2007). Growing up cooperatively. Journal of Developmental Processes, 2, 40–66. Solomon, N. G., & French, J. A. (1997). Cooperative breeding in mammals. Cambridge, England: Cambridge University Press. Spence, K. W. (1937). Experimental studies of learning and the higher mental processes in infra-human primates. Psychological Bulletin, 34, 806–850. Stander, P. E. (1992). Cooperative hunting in lions: The role of the individual. Behavioral Ecology and Sociobiology, 29, 445–454. Stevens, J. R., & Hauser, M. D. (2004). Why be nice? Psychological constraints on the evolution of cooperation. Trends in Cognitive Sciences, 8, 60–65. Strier, K. B. (1997). Subtle cues of social relationships in male muriqui monkeys (Brachyteles arachnoides). In W. G. Kinzey (Ed.), New world primates: Evolution ecology and behavior (pp. 109–118). New York: Aldine de Gruyter.
8/17/09 1:58:37 PM
References 55 Tebbich, S., Taborsky, M., & Winkler, H. (1996). Social manipulation causes cooperation in keas. Animal Behaviour, 52, 1–10. Templeton, J. J., Kamil, A. C., & Balda, R. P. (1999). Sociality and social learning in two species of corvids: The pinyon jay (Gymnorhinus cyanocephalus) and the Clark’s nutcracker (Nucifraga columbiana). Journal of Comparative Psychology, 113, 450–455. Thompson, R. K. R., Oden, D. L., & Boysen, S. T. (1997). Languagenaive chimpanzees (Pan troglodytes) judge relations between relations in a conceptual matching-to-sample task. Journal of Experimental Psychology: Animal Behavior Processes, 23, 31–43.
Wang, Z., Toloczko, D., Young, L. J., Moody, K., Newman, J. D., & Insel, T. R. (1997). Vasopressin in the forebrain of common marmosets (Callithrix jacchus): Studies with in situ hybridization, immunocytochemistry and receptor autoradiography. Brain Research, 768, 147–156. Werdenich, D., & Huber, L. (2002). Social factors determine cooperation in marmosets. Animal Behaviour, 64, 771–781.
Thornton, A., & McAuliffe, K. (2006, July 14). Teaching in wild meerkats. Science, 313, 227–229.
West, S. A., Griffin, A. S., & Gardner, A. (2007). Social semantics: Altruism, cooperation, mutualism, strong reciprocity and group selection. Journal of Evolutionary Biology, 20, 415–432.
Tooby, J., & Cosmides, L. (1990). The past explains the present: Emotional adaptations and the structure of ancient environments. Ethology and Sociobiology, 11, 375–424.
Whiten, A., & Ham, R. (1992). On the nature and evolution of imitation in the animal kingdom: Reappraisal of a century of research. Advances in the Study of Behavior, 21, 239–283.
Trivers, R. L. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology, 46, 35–57.
Whiten, A., Horner, V., & Litchfield, C. A. (2004). How do apes ape? Learning and Behavior, 32, 36–52.
Turner, R. A., Altemus, M., Enos, T., Cooper, B., & McGuinness, T. (1999). Preliminary research on plasma oxytocin in normally-cycling women: Investigating emotion and interpersonal distress. Psychiatry, 62, 97–113.
Widowski, T. M., Porter, T. A., Ziegler, T. E., & Snowdon, C. T. (1992). The stimulatory effect of males on the initiation, but not the maintenance, of ovarian cycling in cotton-top tamarins (Saguinus oedipus). American Journal of Primatology, 26, 97–108.
Umiltà, M. A., Kohler, E., Gallese, V., Fogassi, L., Fadiga, L., Keysers, C., et al. (2001). I know what you are doing: A neurophysiological study. Neuron, 31, 155–165.
Wilkinson, G. S. (1984, March 8). Reciprocal food sharing in the vampire bat. Nature, 308, 181–184.
Uvnas-Moberg, K. (1998). Oxytocin may mediate the benefits of positive social interactions and emotions. Psychoneuroendocrinology, 23, 819–835. van Schaik, C. P., & Kappeler, P. M. (2006). Cooperation in primates and humans: Closing the gap. In P. M. Kappeler & C. P. van Schaik (Eds.), Cooperation in primates and humans: Mechanisms and evolution (pp. 3–21). Berlin, Germany: Springer-Verlag. Visalberghi, E. (1997). Success and understanding in cognitive tasks: A comparison between (Cebus apella) and (Pan troglodytes). International Journal of Primatology, 18, 811–830. Visalberghi, E., Pellegrini Quarantotti, B., & Tranchida, F. (2000). Solving a cooperation task without taking into account the partner ’s behavior: The case of capuchin monkeys (Cebus apella). Journal of Comparative Psychology, 114, 297–301. Von Frisch, K. (1950). Bees: Their vision, chemical senses and language. Ithaca, NY: Cornell University Press. Vonk, J., Brosnan, S. F., Silk, J. B., Henrich, J., Richardson, A. S., Lambeth, S. P., et al. (2008). Chimpanzees do not take advantage of very low cost opportunities to deliver food to unrelated group members. Animal Behaviour, 75, 1757–1770. Vonk, J., & Povinelli, D. J. (2006). Similarity and difference in the conceptual system of primates: The unobservability hypothesis. In E. Wasserman & T. Zentall (Eds.), Comparative cognition: Experimental explorations of animal intelligence (pp. 363–387). Oxford, England: Oxford University Press.
c03.indd Sec6:55
Wang, Z., Moody, K., Newman, J. D., & Insel, T. R. (1997). Vasopressin and oxytocin immunoreactive neurons and fibers in the forebrain of male and female common marmosets (Callithrix jacchus). Synapse, 27, 14–25.
Williams, J. R., Catania, K. C., & Carter, C. S. (1992). Development of partner preferences in female prairie voles (Microtus ochrogaster): The role of social and sexual experience. Hormones and Behavior, 26, 339–349. Winslow, J. T., Hastings, N., Carter, C. S., Harbaugh, C. R., & Insel, T. R. (1993, October 7). A role for central vasopressin in pairbonding in monogamous prairie voles. Nature, 365, 545–548. Winslow, J. T., Noble, P. L., Lyons, C. K., Sterk, S. M., & Insel, T. S. (2003). Rearing effects on cerebrospinal fluid oxytocin concentration and social buffering in rhesus monkeys. Neuropsychopharamacology, 28, 910–918. Wismer Fries, A. B., Ziegler, T. E., Kurian, J. R., Jacoris, S., & Pollak, S. D. (2005). Early experience in humans is associated with changes in neuropeptides critical for regulating social behavior. Proceedings of the National Academy of Sciences, USA, 102, 17237–17240. Witt, D. M., Winslow, J. T., & Insel, T. R. (1992). Enhanced social interactions in rats following chronic, centrally infused oxytocin. Pharmacology, Biochemistry, and Behavior, 43, 855–861. Young, L. J. (1999). Oxytocin and vasopressin receptors and speciestypical social behaviors. Hormones and Behavior, 36, 212–221. Young, L. J., & Wang, Z. (2004). The neurobiology of pair bonding. Nature Neuroscience, 7, 1048–1054. Zak, P. J., Kurzban, R., & Matzner, W. T. (2005). Oxytocin is associated with human trustworthiness. Hormones and Behaviour, 48, 522–527. Zak, P. J., Stanton, A. A., & Ahmadi, S. (2007). Oxytocin increases generosity in humans. Public Library of Science, 2(11), E1128.
8/17/09 1:58:37 PM
Chapter 4
Biological Rhythms
LANCE J. KRIEGSFELD AND RANDY J. NELSON
“a rose is not necessarily and unqualifiedly a rose . . . it is a very different biochemical system at noon and at midnight.” —Colin Pittendrigh, 1965
When it comes to success, the expression “timing is everything” is an oft-touted mantra. Traditionally, however, timing was not considered important in the study of the mechanisms underlying behavior. Yet, behavioral constructs such as learning, memory, sensation, perception, attention, and motivation vary markedly according to the time of day or season of the year. Motivation can be defined generally as why individuals do what they do. From a neurobiological perspective, an animal eats because its hunger circuits are activated by specific neurochemicals. However, the response of the circuits underlying motivated behaviors differs markedly with time of day or year, and it is essential to consider timing when asking questions about the biological mechanisms underlying behavior. To extend this example, an animal eats at night because its internal biological clock alters the responsiveness of hunger circuits to specific neurochemical signals. Despite the impact of timing systems on brain and behavior, their influence on dependent measures of interest is frequently ignored. For example, learning and memory performance are typically examined during the middle of the day in nocturnal (night active) rodents, a time when they are normally asleep. As a result, many reported deficits in performance can be attributed to time-of-day effects rather than (or in addition to) the intended manipulation. One goal of this chapter is to underscore the impact of biological rhythms on behavior and physiology to encourage a full appreciation of the significance and magnitude of such timed changes for investigators in the behavioral sciences.
FUNCTIONAL SIGNIFICANCE OF BIOLOGICAL RHYTHMS

One of the most predictable features of life on earth is the regular pattern of environmental changes associated with the movement of the planet. Life evolved in a cyclic environment. For all organisms except those living at the bottom of the oceans or deep within caves, day follows night, and the seasons change. These orderly and predictable changes in the environment have existed since life first began to evolve, although the timing of these events may have changed slightly over time. The rotation of the earth results in periodic exposure to the radiation of the sun, which causes predictable changes in light and ambient temperatures, as well as associated changes in the relative humidity of the air and in the oxygen levels of aqueous habitats.

The biological clocks of animals and plants permit them to start or stop locomotor activities or activate photosynthetic machinery, respectively, in preparation for light. The action of the master biological clock in synchronizing individual bodily functions has been compared to the role of the conductor of an orchestra. As a result, internal processes serve to prepare the body for certain activities to occur later; for example, the elevated adrenal secretions coinciding with the morning onset of activity prepare an individual for increased activity levels and for breaking the nightly fast. All eukaryotic and most prokaryotic organisms tested to date, from unicellular organisms to humans, display circadian rhythms. Predictably, circadian rhythms have not evolved in organisms that live for less than 24 hours. Similarly, circannual rhythms have evolved only in animals that live for a year or more.

Physiological systems reveal a wide range of rhythmic changes. For instance, neurotransmitter turnover, body temperature, and blood plasma levels of adrenalin, potassium, sodium, cortisol, androgens, and growth hormone all show pronounced circadian rhythms. Some of these changes may be on the order of 100% to 200% from baseline values. Peak daily cortisol concentrations, for example, which usually occur just prior to or immediately after awakening, coincide with the onset of activity in the morning. This programmed elevation of cortisol concentrations increases blood pressure and cardiac output prior to the active phase of the day. Increased cortisol concentrations are not driven by increased activity levels because the same circadian rhythm is observed in bedridden patients under constant conditions (Aschoff, 1965). In some instances, neurochemical receptors are produced prior to a circadian-programmed increase in ligand production, with both processes being coordinated by the circadian system. Body temperature peaks in mid-afternoon, when people are most active, but again, muscular activity is not solely responsible for this “heating” (Refinetti & Menaker, 1992).

It may seem that the concept of biological clocks and programmed temporal changes in behavioral function is at odds with the concept of homeostasis in biological sciences and medicine. Homeostatic processes work to maintain physiological parameters within specific and often narrow ranges. Biologists and physicians long considered large fluctuations in physiology and behavior to be pathological; until recently, many resisted the idea of the programmed changes in physiology and behavior that we now understand to underlie homeostatic processes. As a result, we should not consider the appropriate parameters for a specific system to be fixed but, instead, variable relative to time of day or year.

What orchestrates physiology and behavior so that organisms engage in appropriate behaviors at suitable times of day or during optimal seasons of the year? Some rhythms are the result of environmental factors that impose temporal control over behavior and physiology; these rhythms are termed driven rather than endogenous (from within). In contrast, most temporal changes in brain and behavior are controlled by internal biological clocks. This chapter focuses on these latter clocks in the control of behavior.

We thank Zachary Weil for helpful comments on an earlier version of this chapter. The authors were supported during the preparation of this chapter by NIH grants HD050470 (LJK), MH57353, and NSF grant OIS 04-16897 (RJN).

Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.

PROPERTIES OF BIOLOGICAL RHYTHMS
PROPERTIES OF BIOLOGICAL RHYTHMS
Chronobiology has borrowed terms and concepts extensively from the field of engineering. For example, a rhythm is defined as a recurrent event that is characterized by its period, frequency, amplitude, and phase (Vitaterna, Takahashi, & Turek, 2001). Period is the length of time required to complete one cycle of the rhythm in question (e.g., the amount of time required to go from peak to peak or trough to trough). Frequency is computed as the number of completed cycles per unit of time (e.g., 6 cycles/day). Amplitude is the amount of change above and below the average value; that is, the distance of the peak from the average. The phase represents a point on the rhythm relative to some objective time point during the cycle. For example, under standard conditions, the phase of onset of the activity portion of a mouse’s activity-rest cycle corresponds closely with the onset of dark (Figure 4.1). Phase relations among various biological rhythms can also be described (e.g., the onset of the active phase of the daily sleep-wake cycle tightly corresponds with the peak of stress hormone secretion in both mice and humans).
Many biological rhythms have been recognized for thousands of years. In the past, however, rhythms in migration, daily activity patterns, or hibernation were attributed to exogenous factors. Although exogenous factors may serve a permissive or synchronizing role for biological rhythms, endogenous timing mechanisms mediate many of the observed rhythms in brain and behavior. The best evidence to document whether a rhythm is driven by exogenous or endogenous signals is obtained via isolation studies (Aschoff, 1981; DeMairan, 1729; Thrun, Moenter, O’Callaghan, Woodfill, & Karsch, 1995; Vitaterna et al., 2001; Zucker & Boshes, 1982). Persistence of a biological rhythm in the absence of environmental cues provides compelling evidence that the rhythm under study is generated from within the individual and not driven by the environment.
Figure 4.1 Components of biological rhythms. Note. Rhythms can be analyzed in terms of amplitude, frequency, and period length. The relationship of one rhythm to another is expressed as a phase relationship. Both cycles (Rhythm 1 and Rhythm 2) have a period of 24 hours (a frequency of one cycle per day). Axes: Biological Function (%) versus Time (Hours); the maximum, minimum, average value, amplitude, and period length are marked on the curves.
If a biological rhythm disappears under constant conditions, then it is reasonable to suggest that some cyclic cue in the environment drives the biological rhythm. This logic
has now been applied in hundreds of studies that have firmly established the existence of biological rhythms driven by endogenous biological clocks. Incredibly, the period of the daily activity rhythm of one individual can be transferred to another individual by means of brain tissue transplants that contain the master biological clock (Lehman et al., 1987; Ralph, Foster, Davis, & Menaker, 1990). All animals (and all eukaryotic and some prokaryotic plants) studied to date have endogenous clocks that mediate biological rhythms (Roenneberg & Merrow, 2002). The periods of biological rhythms range from the 1-ms cycle of firing among some neurons to longer cycles such as the 90-minute cycle of REM sleep, the 4- or 5-day estrous cycles of rats, the annual cycle of hibernation, the 17-year cycle of cicada emergence, or the 100-year cycle of century plant flowering. How are these biological rhythms generated? Are they all mediated by biological clocks? Certainly, the periods of some biological rhythms, including most central nervous system and cardiovascular rhythms vary widely within the same individual depending on activity. The periods of other endogenous cycles, such as the wake-sleep cycle, are largely constant for the same individual, but there may be great inter-individual or species variation. Four types of biological rhythms can be identified, however, that are typically coupled with environmental factors, and the periods of these rhythms vary little under natural conditions. These relatively constant biological rhythms mimic the periods of the geophysical cycles of night and day (circadian), the tides (circatidal), the phases of the moon (circalunar), and the seasons of the year (circannual; Dunlap, Loros, & DeCoursey, 2004). These rhythms persist when animals are isolated from the respective environmental cues, but when isolated, these rhythms only approximate the periods of the environmental cues to which they are normally synchronized. 
Thus, the terms for many biological rhythms use the prefix circa (Halberg, 1959). Although isolation experiments are necessary to determine the extent of endogenous generation of any biological rhythm, it should be emphasized that the environment often exerts permissive effects on endogenously driven biological rhythms. The evidence for circadian or circannual rhythms is much stronger than for circatidal and circalunar rhythms, especially among vertebrates. Thus, this chapter focuses on circadian and circannual rhythms in behavioral neuroscience. Light serves as an important environmental time cue, or zeitgeber (from the German, meaning “time giver”), for most species (Morin & Allen, 2006; Roenneberg, Daan, & Merrow, 2003). Temperature is an important zeitgeber for some poikilothermic animals, and possibly a secondary zeitgeber for birds and mammals, but light is the primary zeitgeber among homeothermic animals (Foster,
Hankins, & Peirson, 2007). As mentioned previously, the internal clock runs with a period that is approximately 24 hours in constant conditions, yet the functions controlled by this clock recur precisely every 24 hours when entrained (synchronized) to a 24-hour light-dark cycle. Because day length varies across the year in relatively small increments, internal clocks must exhibit plasticity. If a mouse is placed in constant dim light so that there is no daily zeitgeber, then the onset time of its locomotor activity begins to drift out of synchrony with local time; similarly, if a person is put into a windowless room for weeks, that person’s wake-sleep cycle will drift out of phase with local conditions. Without the daily light-dark cycle providing a daily reset, the endogenous circadian clock of a hamster or human only approximates 24-hour cycles. Biological rhythms that are not synchronized with environmental cues are called free-running. Each individual displays its own free-running period (abbreviated as τ), which remains relatively constant. The free-running periods of biological clocks are precise, but are not exactly 24 hours. The observation that different hamsters display an array of different free-running periods when housed in the same room suggests that they are not synchronized by each other’s behavior, and that subtle geophysical cues are not providing temporal information. However, social factors can provide a zeitgeber for humans living in constant conditions (Mistlberger & Skene, 2004, 2005; Turek & Zee, 1999). Appropriately timed exercise or exogenous melatonin also influences human circadian timing (Mistlberger & Skene, 2005). As noted, individual animals display species-specific times of locomotor activity onset that are often linked to the timing of food intake, water consumption, and social activities (Mistlberger & Skene, 2004, 2005).
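The rhythm descriptors defined earlier in this section (period, frequency, amplitude, and phase) and the drift of a free-running rhythm can be made concrete with a short numerical sketch. The Python below is illustrative only: it assumes that a rhythm can be approximated by a cosine curve, and every function name and numerical value (for example, a free-running period of 24.5 hours) is a hypothetical choice made for the example, not data from the studies cited.

```python
import math


def frequency_per_day(period_hours):
    """Completed cycles per 24-hour day for a rhythm with the given period."""
    return 24.0 / period_hours


def rhythm_value(t_hours, average, amplitude, period_hours, peak_time_hours):
    """Cosine approximation of a rhythm (an assumed waveform, not a measured one).

    The curve peaks at `peak_time_hours`, which fixes its phase, and swings
    `amplitude` units above and below `average`.
    """
    angle = 2.0 * math.pi * (t_hours - peak_time_hours) / period_hours
    return average + amplitude * math.cos(angle)


def onset_drift(tau_hours, days, zeitgeber_period_hours=24.0):
    """Hours by which a free-running activity onset has shifted from local time.

    Positive values mean the onset occurs later each day (tau > 24 hours);
    negative values mean it occurs earlier (tau < 24 hours).
    """
    return days * (tau_hours - zeitgeber_period_hours)


def onset_clock_time(initial_onset_hours, tau_hours, days):
    """Local clock time (0-24 h) of activity onset after `days` without a zeitgeber."""
    return (initial_onset_hours + onset_drift(tau_hours, days)) % 24.0


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print(frequency_per_day(4.0))                       # 4-hour period, in cycles/day
    print(rhythm_value(12.0, 60.0, 20.0, 24.0, 12.0))   # value at the peak
    print(onset_clock_time(18.0, 24.5, 10))             # onset after 10 days, tau = 24.5 h
```

With an assumed τ of 24.5 hours, activity onset falls 0.5 hours later each day, so an animal that began activity at 18:00 would begin at 23:00 local time after 10 days. In this picture, entrainment to a light-dark cycle corresponds to a daily reset that cancels the drift.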
If the zeitgeber is phase-shifted, then animals adapt their activity to the new regimen in a few days (the length of time depends on the extent of the change). The circadian rhythms of other physiological processes, such as adrenocortical hormone release, body temperature, and blood plasma volume, may require more or less time to synchronize to the new lighting regime (Davidson, Yamazaki, Arble, Menaker, & Block, 2006; Yamazaki et al., 2000). This process of adaptation to rapid phase shifts in environmental lighting results in what is commonly called “jet lag” when people travel across time zones. Ultradian (shorter than circadian) and infradian (longer than circadian) rhythms are biological rhythms that do not correspond to any known geophysical cycles. Ultradian cycles are commonly observed; for example, the 90-minute cycle characteristic of REM sleep is a well-documented ultradian rhythm (Schulz & Lavie, 1985). The pulsatile secretion of several hormones, including gonadotropin-releasing hormone (GnRH), luteinizing hormone
(LH), testosterone, growth hormone, and corticosterone, represents an ultradian rhythm (Schulz & Lavie, 1985). Ultradian rhythm variation in free estradiol and cortisol concentrations during menstrual cycles correlates with depressive symptoms (Bao et al., 2004) as well as sleep disturbances (Voss, 2004). Ultradian rhythms in locomotor activity, feeding, and metabolism have been reported in a number of high-metabolic mammalian species such as shrews and voles (reviewed in Liu, Li, & Wang, 2007), and may be related to ancient metabolic cell cycles (Lloyd, Lemar, Salgado, Gould, & Murray, 2003). Infradian rhythms are less frequently observed than ultradian rhythms; such biological rhythms are longer than a day, but shorter than a lunar month. Generally, infradian rhythms in testicular function are rare among vertebrates. Dormice (Glis glis) have been reported to display infradian rhythms of about 60 days in body mass and body fat content (Grimes, Melnyk, Martin, & Mrosovsky, 1981; Melnyk, Mrosovsky, & Martin, 1983a, 1983b). The most common types of infradian rhythms are those associated with ovarian cycles; that is, estrous or menstrual cycles. Hamsters and mice display 4-day estrous cycles; rats have estrous cycles of either 4 or 5 days. Estrous cycles are 16 days in guinea pigs and sheep. Estrous cycles persist in constant conditions (i.e., they are self-sustaining) and are endogenously generated rhythms that do not correspond to any known geophysical cue. Although human menstrual cycles approximate a lunar month, the length of the menstrual cycle seems to be only coincidentally similar to the length of the lunar month, rather than reflecting any adaptive link to the lunar cycle (Knobil & Hotchkiss, 1988).
CIRCADIAN CONTROL OF BRAIN AND BEHAVIOR
Evolution of the Circadian System
From prokaryotic to eukaryotic organisms, the 24-hour solar cycle has had a pervasive impact on the evolution of life.
As suggested previously, a variety of conditions require that organisms internally track daily time. For instance, all energetic requirements for normal health and functioning cannot be simultaneously fulfilled, and peaks in energetic processes must be partitioned throughout the day. Likewise, all behavioral requirements (e.g., foraging, mating, nest building) cannot be simultaneously performed. To allow organisms to anticipate daily environmental change and synchronize their behavior and physiology accordingly, individuals have evolved an endogenous circadian timekeeping mechanism. Not only does this system allow for the coordination of internal physiology, but it also allows animals to predict recurring 24-hour events, such
as food availability and predator activity, and adjust their behavior appropriately. Perhaps the most direct evidence for the adaptive significance of circadian rhythms comes from studies of cyanobacteria—a group of prokaryotes. Analysis of relative fitness of different strains of cyanobacteria shows that strains with a circadian period similar to the light/dark cycle of the environment have greater reproductive success (Ouyang, Andersson, Kondo, Golden, & Johnson, 1998; Woelfle, Ouyang, Phanvijhitsiri, & Johnson, 2004). Strains whose clocks were disrupted were defeated by strains with a functional clock, but only when held in a light/dark cycle; any competitive advantage was lost in constant conditions. In multicellular organisms, this competitive advantage is also observed; Arabidopsis thaliana, as in cyanobacteria, grow faster and survive longer when housed in light cycles closest to their endogenous circadian period (Dodd et al., 2005). Circadian rhythms are strikingly ubiquitous, with most behavioral, biochemical, and physiological responses of organisms showing daily variation that persists in constant environmental conditions. As described next, some genes are part of the cellular clockwork mechanism, whereas others are controlled directly or indirectly by these core clock genes (i.e., clock-controlled genes). To further understand the evolution of the circadian system, one logical step is to determine the percent of the genome under circadian control across taxa. In Arabidopsis, about 6% of the estimated 8,000 genes studied are rhythmic (Harmer et al., 2000). In contrast, in the retina of Xenopus perhaps only a few critical proteins (0.2%) are directly under the control of the circadian clock (C. B. Green & Besharse, 1996). 
In Drosophila, an oligonucleotide-based high-density array to measure gene expression changes on a whole genome level revealed that approximately 7% to 9% of genes express a circadian pattern (Ceriani et al., 2002; Claridge-Chang et al., 2001; C. B. Green & Besharse, 1996; McDonald & Rosbash, 2001). Investigations of head versus body rhythms showed that the genes exhibiting circadian patterns have little overlap, suggesting specificity of function for clock-controlled genes (CCGs) in distinct systems (Ceriani et al., 2002). In mice the same general conclusion has been reached by comparing gene expression patterns between liver and heart (Storch et al., 2002). While 8% to 10% of genes are rhythmically expressed in one of the tissues, there are few genes that show circadian regulation in both tissues. Even the genes that have a rhythm in both tissues are frequently out of phase, suggesting independent local function. Determination of similarities and differences across species in clock gene and CCG homologues, and the signals that set their phase, will be necessary to fully understand the evolution of the circadian system.
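The tissue comparison described above amounts to intersecting two sets of rhythmic genes and then comparing the phases of the genes they share. A minimal sketch of that bookkeeping follows. The gene labels and peak times are invented placeholders, not values from Storch et al. (2002), and the function names are hypothetical.

```python
def phase_difference(peak_a_hours, peak_b_hours, period_hours=24.0):
    """Signed circular difference between two peak times, in (-period/2, period/2]."""
    diff = (peak_a_hours - peak_b_hours) % period_hours
    if diff > period_hours / 2.0:
        diff -= period_hours
    return diff


def shared_rhythmic_genes(tissue_a, tissue_b):
    """Map each gene rhythmic in both tissues to its phase difference (a minus b)."""
    shared = sorted(set(tissue_a) & set(tissue_b))
    return {gene: phase_difference(tissue_a[gene], tissue_b[gene]) for gene in shared}


if __name__ == "__main__":
    # Placeholder data: gene label -> hour of peak expression (hypothetical).
    liver = {"gene_a": 2.0, "gene_b": 10.0, "gene_c": 18.0}
    heart = {"gene_c": 6.0, "gene_d": 14.0}
    print(shared_rhythmic_genes(liver, heart))
```

In this toy data set only one gene is rhythmic in both tissues, and its two peaks fall 12 hours apart, which mirrors the pattern of little overlap and mismatched phase reported for liver and heart.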
Adaptive Significance of the Circadian System
Animals have evolved to synchronize their endogenous circadian rhythms with the environment in order to promote survival (e.g., DeCoursey & Krulas, 1998). Numerous behaviors are restricted to specific times of day in response to a variety of selection pressures. For example, diurnal species (i.e., species active during the day) confine behaviors such as feeding, locomotion, foraging, and reproduction to the light hours in order to avoid predation. Likewise, nocturnal (i.e., active at night) species such as owls are active at night to maximize the availability of nocturnal prey (e.g., small rodents). As described later, the circadian “clock” that synchronizes rhythms in behavior and physiology is localized to a discrete bilateral nucleus in the anterior hypothalamus. If this master brain clock is destroyed, animals lose their ability to restrict behaviors to a particular time of day, and this arrhythmicity may compromise survival (DeCoursey & Krulas, 1998; DeCoursey, Krulas, Mele, & Holley, 1997). Humans have evolved to maintain maximum performance during the daytime, and circadian variation in learning and memory, attention, reaction time, and perception is prominent (Babkoff, Caspy, Mikulincer, & Sing, 1991; Colquhoun, 1981). This is especially important for individuals working night shifts that require critical cognitive performance. Generally, human performance peaks with the daily afternoon peak in body temperature, although verbal reasoning appears to peak earlier in the circadian cycle (Colquhoun, 1981). When people were maintained in the laboratory for a month in either 24- or 24.6-hour days, about half of the individuals adapted with peak melatonin occurring during the normally scheduled sleep period (synchronized), whereas half displayed peak melatonin while awake (nonsynchronized). Nonsynchronized individuals reduced total sleep time, as well as sleep latency and rapid eye movement (REM) latency.
Cognitive performance was impaired among the nonsynchronized individuals and enhanced among the synchronized subjects (Wright, Hull, Hughes, Ronda, & Czeisler, 2006). People are likely to be nonsynchronized when working night shifts but reverting to a normal day-night schedule during the weekend, or when undergoing repeated jet travel across several time zones. The ability of people to estimate time intervals also varies on a circadian basis. For example, time perception evaluations at 0900, 1300, 1700, and 2100 hours revealed that short-term time perception was more influenced by circadian phase than by memory load or various psychophysiological factors (Kuriyama et al., 2003). Such human studies should be interpreted cautiously because either approximately one-third of the day is missing or circadian influences are masked by sleep deprivation. These latter difficulties can be overcome by distributing small meals and naps throughout the day and having subjects maintain a constant routine designated by
the researchers (Brown & Czeisler, 1992; el-Hajj Fuleihan et al., 1997; Khalsa, Jewett, Duffy, & Czeisler, 2000). One study of visual selective attention used a different technique to dissociate the effects of circadian phase and time awake (Horowitz, Cade, Wolfe, & Czeisler, 2003). After 38 hours of no sleep, observers showed increased reaction times for spatial configuration and conjunction tasks. Observers traded accuracy for speed when sleepy, which could lead to decision errors. Indeed, extended shift durations lead to increases in attentional failures, significant medical errors, and so-called adverse events in critical care units (Barger et al., 2006). In a web-based analysis of medical errors, interns working more than 5 extended-duration shifts per month reported more attentional failures during lectures, rounds, and clinical activities, including surgery; fatigue-related preventable adverse events (including fatality) increased by 300% in these interns. Temporal variation in behavior is associated with temporal variation in underlying physiology and biochemistry. These underlying processes are modulated by endogenously driven circadian rhythms that are synchronized by environmental time cues. Clearly, a body clock that is not synchronized with the environment is not adaptive. Some rhythms, however, exhibit a peak in activity during one time of day while other rhythms are at a nadir at this time. For example, in humans, cortisol concentrations peak at dawn while melatonin and prolactin concentrations peak in the middle of the night. The relationship between daily fluctuations in physiology and the environmental light/dark cycle is different for nocturnal and diurnal species. Thus, although a myriad of rhythms have a period of about 24 hours, each of these rhythms has a unique phase with respect to one another and to the light/dark cycle (Figure 4.2). Stated differently, although bodily rhythms exhibit different phases
relative to some objective point in time, the phase relationship among rhythms remains stable, unless a perturbation occurs (e.g., jet lag). Consequently, circadian rhythms help to maintain homeostasis within the body. In addition, the phase of specific rhythms helps to prepare the body for necessary daily activities in advance of their actual occurrence. For example, cortisol concentrations rise in humans prior to waking in order to facilitate the onset of morning activity (e.g., Van Cauter & Refetoff, 1985; Weitzman et al., 1971). It is not difficult to imagine how physiological functioning would suffer without an internal circadian clock synchronized to the environment. Most of us have experienced general feelings of malaise and other maladies following a long flight across time zones. While we can recover within a few days from acute jet lag, millions of frequent flyers, shift workers, individuals with sleep disorders, and other individuals whose work day is not fixed are exposed chronically to such temporal disruptions. These individuals provide the opportunity to examine the effects of more chronic circadian disruptions. In fact, this loss of synchrony between the circadian clock in the brain and the environment leads to pronounced clinical pathologies. One recent study found that elderly mice subjected to temporal disruptions equivalent to a flight from Washington to Paris, once a week for 8 weeks, died sooner as a result of their bodies being out of sync with local time (Davidson, Sellix, et al., 2006). Flight attendants frequently traveling across time zones exhibit cognitive deficits associated with reductions in temporal lobe structures (Cho, 2001; Cho, Ennaceur, Cole, & Suh, 2000).
Figure 4.2 Examples of circadian rhythms in humans. Note. Daily rhythms are shown for glucose, prolactin, glucocorticoids, testosterone, and melatonin. Shaded areas represent times of sleep whereas the unshaded areas represent times of activity. All parameters exhibit unique peaks and troughs relative to each other. This phase relationship among individual rhythms is critical for optimal functioning and the maintenance of homeostasis.
Numerous studies show that shift workers have a higher incidence of cancer (Conlon, Lightfoot, & Kreiger, 2007; Hansen, 2006; Kubo et al., 2006; O’Leary et al., 2006; Patel, 2006), diabetes (Karlsson, Alfredsson, Knutsson, Andersson, & Toren, 2005; Morikawa et al., 2005; Poole, Wright, & Nattrass, 1992; Robinson, Yateman, Protopapa, & Bush, 1990; Sanborn, Currie, & Bailey, 1982), ulcers (Costa, 1996; Koda et al., 2000; Kolmodin-Hedman & Swensson, 1975; Segawa et al., 1987), hypertension and cardiovascular disease (Alstadhaug, Salvesen, & Bekkelund, 2005; Costa, 1996; Hwang & Lee, 2005; Kivimaki et al., 2006; Wolk, Gami, Garcia-Touchard, & Somers, 2005), psychological disorders (Bildt & Michelsen, 2002; De Koninck, 1991; Leonard, Fanning, Attwood, & Buckley, 1998; Munakata et al., 2001; Skipper, Jung, & Coffey, 1990; Venuta, Barzaghi, Cavalieri, Gamberoni, & Guaraldi, 1999), and a host of other clinical issues. In fact, just changing the time of day during which cancer chemotherapy is administered can nearly double the chances of survival in patients suffering from cancers with an estimated 30% to 40% 5-year survival rate, including childhood leukemia and colorectal carcinomas (Hrushesky, 1990, 1993, 1995; Kanabrocki et al., 2006; Lee & Balick, 2006; Levi, 1987, 1994, 2001, 2002; Takimoto, 2006). These findings, although largely correlational, point to a critical role
for internal circadian timing in maintaining normal brain functioning and peripheral physiology. The two primary functions of the circadian system, internal organization and entrainment to the environment, are fundamental for optimal regulation of physiology and behavior. For each aspect of physiology and behavior, the circadian system sits upstream of a regulatory system, modulating the timing and synchronization of events. The mechanisms controlling circadian function (coordination of bodily systems and synchronization with the environment) represent the focus of the following sections.
The Brain Clock
Whereas a “master” circadian clock has been localized to the suprachiasmatic nucleus (SCN) located in the anterior hypothalamus in mammals (Figure 4.3; Moore & Eichler, 1972; Stephan & Zucker, 1972), it is now more appropriate to conceptualize the circadian system as an assembly composed not only of a master clock, but also a series of subordinate clocks whose phase and coordinated activity are set by the SCN. As described in the following sections, the SCN has direct access to environmental time via retinal projections to the clock. Because subordinate central and peripheral clocks do not have access to such time cues, it is necessary for the SCN to communicate such information throughout the CNS and periphery. In addition to the core clock genes responsible for SCN and subordinate clock function, clock-controlled genes (CCGs) represent important output and local coordination systems. The stability of this hierarchical arrangement is necessary for normal body functioning and disease prevention. Numerous lines of evidence indicate that the master mammalian circadian clock is located in the SCN. The first
Figure 4.3 The mammalian circadian clock is located in the suprachiasmatic nucleus (SCN) of the anterior hypothalamus. Note. The SCN is pictured in this schematic of a coronal section through a rodent brain. The SCN is situated at the base of the brain directly above the optic chiasm (OC) and directly surrounding the third ventricle (V3). The sagittal schematic in the upper right corner depicts the approximate rostral-caudal location depicted in the coronal section.
indication that the SCN is the master clock comes from studies in which lesions ablating the SCN abolish circadian rhythmicity in adrenal corticoid secretion and locomotor behavior (Moore & Eichler, 1972; Stephan & Zucker, 1972). SCN-lesioned animals continue to show the full range of normal behaviors, but their temporal organization is lost and never recovers, irrespective of how early in development the lesions are performed (Mosko & Moore, 1979). The initial conclusion that the SCN serves as the master clock in the brain has been confirmed in the subsequent 30 years by converging lines of research involving in vivo, ex vivo, and in vitro studies carried out in many different laboratories. For example, transplants of donor SCN tissue into the brains of arrhythmic, SCN-lesioned hosts restore circadian rhythmicity in behavior (Lehman et al., 1987; Ralph et al., 1990). Importantly, rhythms are restored with the period of the donor SCN, indicating that the transplanted tissue does not act by restoring host-brain function but that the “clock” is contained in the transplanted tissue. Further evidence that clock function is contained within the SCN comes from studies demonstrating that circadian rhythms in neural firing rate persist in isolated SCN tissue maintained in culture (D. J. Green & Gillette, 1982; Groos & Hendriks, 1982; Shibata, Oomura, Kita, & Hattori, 1982). These studies confirmed that input from extra-SCN brain sites is not necessary for circadian rhythms in this nucleus (D. J. Green & Gillette, 1982; Groos & Hendriks, 1982; Shibata et al., 1982). In hypothalamic slice preparations, the SCN is intrinsically capable of sustaining not only circadian rhythms in neuronal firing rate, but also rhythms in glucose utilization and vasopressin secretion (Gillette & Reppert, 1987; D. J. Green & Gillette, 1982; Newman & Hospod, 1986). 
Primary cultures and organotypic explants of the rat SCN are similarly characterized by the distinctive capacity to generate circadian rhythms in vasopressin and vasoactive intestinal polypeptide (VIP) release for multiple cycles (Earnest & Sladek, 1987; Shinohara, Honma, Katsuno, Abe, & Honma, 1994; Watanabe, Koibuchi, Ohtake, & Yamaoka, 1993). Vasopressin and VIP rhythms produced by the same SCN explant are independently phased, suggesting that these circadian rhythms may be generated by neurons that comprise two separable populations of oscillators within the SCN (Shinohara et al., 1994). An excellent overview of these studies in historical perspective is available (Weaver, 1998).
The Molecular Clock
Within a cell, circadian rhythms are produced by an autoregulatory transcriptional/translational negative feedback loop that takes approximately 24 hours (Box 4.1). While the general mechanism for circadian oscillations at the
cellular level is common among organisms, the components comprising the feedback loop differ. For the purpose of clarity, only the core mammalian feedback loop is described. Earlier work proposed a core feedback loop that begins when two proteins, CLOCK and BMAL1, bind to one another and drive the transcription of messenger RNA (mRNA) of the Period (Per) and Cryptochrome (Cry) genes by binding to the E-box (CACGTG) domain on these gene promoters. Three Period (Per1, Per2, and Per3) and two cryptochrome genes (Cry1 and Cry2) have been identified. The mRNA for these genes is translated into PER and CRY proteins in the cytoplasm of the cell over the course of the day. Throughout the day, these proteins build up within the cytoplasm, and when they reach high enough levels, they form hetero- and homo-dimers. These newly formed dimers then feed back to the nucleus where they bind to the CLOCK:BMAL1 protein complex to turn off their own transcription (Figure 4.4).
[Figure 4.4 labels: Nucleus (CLOCK, BMAL, E-box CACGTG); Cytoplasm (PER, CRY); times of day marked Morning, Afternoon, and Night. Lower panel: Expression pattern of Per mRNA, PER protein, Clock mRNA, and CLOCK protein versus Time (h).]
Figure 4.4 A simplified model of the intracellular mechanisms responsible for mammalian circadian rhythm generation. Note. The process begins in the cell nucleus when CLOCK and BMAL1 proteins dimerize to drive the transcription of the Per (Per1, Per2, and Per3) and Cry (Cry1 and Cry2) genes. In turn, Per and Cry are translocated to the cytoplasm and translated into their respective proteins. Throughout the day, PER and CRY proteins rise within the cell cytoplasm. When levels of PER and CRY reach a threshold, they form heterodimers, feed back to the cell nucleus, and negatively regulate CLOCK:BMAL1 mediated transcription of their own genes. This feedback loop takes approximately 24 hours, thereby leading to an intracellular circadian rhythm. From Kriegsfeld and Silver, 2006. Reprinted with permission.
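The negative feedback loop described above can be illustrated with a small simulation. The sketch below is a generic three-stage repression loop of the Goodwin type, a classic textbook abstraction and not the model used by the authors cited here. Its variables (mRNA, cytoplasmic protein, nuclear repressor), the decay rate, the Hill coefficient, and the arbitrary time units are all assumptions chosen so that the loop oscillates; no attempt is made to reproduce a 24-hour period or real PER and CRY kinetics.

```python
def simulate_feedback_loop(t_end=300.0, dt=0.01, decay=0.5, hill=20):
    """Euler integration of a three-stage negative feedback loop.

    mrna      -> transcribed unless the nuclear repressor is abundant
    protein   -> translated from mrna in the cytoplasm
    repressor -> protein that has returned to the nucleus and blocks transcription
    Returns a list of (time, mrna, protein, repressor) tuples.
    """
    mrna, protein, repressor = 0.1, 0.1, 0.1  # arbitrary starting levels
    trace = []
    for step in range(int(t_end / dt)):
        transcription = 1.0 / (1.0 + repressor ** hill)  # repressible promoter
        d_mrna = transcription - decay * mrna
        d_protein = mrna - decay * protein
        d_repressor = protein - decay * repressor
        mrna += dt * d_mrna
        protein += dt * d_protein
        repressor += dt * d_repressor
        trace.append(((step + 1) * dt, mrna, protein, repressor))
    return trace


def count_peaks(values):
    """Count local maxima in a sequence of samples."""
    return sum(1 for a, b, c in zip(values, values[1:], values[2:]) if a < b >= c)


if __name__ == "__main__":
    trace = simulate_feedback_loop()
    # Discard the first half as a transient and inspect the repressor level.
    late_repressor = [row[3] for row in trace[len(trace) // 2:]]
    print("repressor range:", min(late_repressor), max(late_repressor))
    print("peaks in second half:", count_peaks(late_repressor))
```

The design choice worth noting is the steep repression term. With equal decay rates in all three stages, a loop of this simple form sustains oscillations only when repression is highly cooperative, which is one reason more realistic clock models add explicit delays and interlocking loops.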
BOX 4.1 TRANSCRIPTION, TRANSLATION, AND POSTTRANSLATIONAL EVENTS
Gene transcription is the process by which the sequence of nucleotides in a single strand of DNA is transcribed into a single strand of complementary RNA. Transcription factors bind to the beginning of the DNA sequence where the gene to be transcribed is located. In order to synthesize RNA, the two tightly twisted strands of DNA must be unraveled by enzymes called helicases. A gene consists of a unique linear sequence of DNA. Among eukaryotic organisms, some of the nucleotide sequences within the gene are noncoding sequences, called introns, which alternate with coding sequences, called exons. There are special markers denoting the start and end points of each gene. A distinct sequence of nucleotides, called a promoter or facilitory region, marks the start of the gene. The binding of a transcription factor to the promoter allows special enzymes, called RNA polymerases, to attach to the promoter and begin the process of RNA synthesis. The sequence of RNA nucleotides, determined by the sequence of nucleotides along the DNA, eventually determines the sequences of amino acids in the protein gene product of the specific gene in question. After transcription, enzymes clip out the intron sequences; then other enzymes splice together the remaining segments (exons) to form messenger RNA (mRNA). The mRNA leaves the cell nucleus, travels to the rough endoplasmic reticulum (RER), and serves as the template for translation into a linear sequence of amino acids, which occurs on ribosomes. After translation is completed, further processing occurs. The penultimate product is typically packaged into vesicles before being transported to the Golgi apparatus. Additional posttranslational processing usually occurs within the rough endoplasmic reticulum, the Golgi apparatus, and in the vesicles to yield the final version of the peptide/protein.
More recently, it has become clear that the cellular clockwork is more complex, with a number of integrated feedback loops whose regulators are often, themselves, controlled by elements of the core clock mechanism. Two other promoter elements have emerged as important for circadian rhythm generation, DBP/E4BP4 binding elements (D boxes) and REV-ERB/ROR binding elements (RREs; Ueda et al., 2005). REV-ERB, an orphan nuclear receptor, negatively regulates the activity of the CLOCK:BMAL1 complex and is also acted on by PER and CRY. Transcription of REV-ERB is controlled by the same mechanism controlling Per and Cry transcription. Similarly, the transcription factor DBP is positively regulated by the
CLOCK:BMAL1 complex (Ripperger & Schibler, 2006) and acts as an important output mechanism by driving rhythmic transcription of other output genes via a PAR basic leucine zipper (PAR bZIP; Lavery et al., 1999). Whereas most work to date has focused on transcriptional regulation as the key mechanism driving cellular rhythms, posttranscriptional and posttranslational events are also critical for circadian coordination (Baggs & Green, 2003; C. Kramer, Loros, Dunlap, & Crosthwaite, 2003; Reddy et al., 2006). In addition to transcriptional/translational control of cellular clock function, regulatory kinases also play a pronounced role in regulation of circadian period. Over a decade ago, a circadian mutation called tau was identified that resulted in a shortened circadian period in Syrian hamsters (Ralph & Menaker, 1988). It is now known that the tau locus encodes casein kinase I epsilon (CKI; Lowrey et al., 2000; H. Wang, Ko, Koletar, Ralph, & Yeomans, 2007). In normal rodents, CKI phosphorylates PER and “tags” it for degradation throughout the day. Eventually, PER accumulates sufficiently to overwhelm CKI and dimerizes with CRY to feed back to the cell nucleus. The mutant form of CKI is unable to phosphorylate PER, leading to a short circadian period in tau mutant hamsters due to premature nuclear entry of PER:CRY dimers (Lowrey et al., 2000; Vielhaber, Eide, Rivers, Gao, & Virshup, 2000). The pronounced effects of single circadian gene mutations have been dramatized in a study identifying the genetic basis for a sleep abnormality in humans known as familial advanced sleep-phase syndrome (FASPS). In affected individuals, sleep onset occurs very early, around 19:30 hr, sleep duration is normal, and wake-up time is advanced to about 4:30 hr. This sleep disorder was found to be the result of a single point mutation in the CKI binding region of the PER2 gene, causing hypophosphorylation by CKI in vitro (Toh et al., 2001). 
Thus, as in the tau mutant, abnormal phosphorylation of PER protein by CKI likely leads to premature negative feedback of PER:CRY heterodimers, thereby speeding the “gears” of the circadian clock. A role for Per3 in human delayed sleep phase syndrome has also been reported (Archer et al., 2003). Whereas extraordinary progress has been made in uncovering the mechanisms responsible for clock function at the molecular level, the complexity of this mechanism is still not fully understood. Determining the specific interactions among complementary feedback loops will allow a further understanding of cellular clock function. Because downstream effects of CCGs are crucial in translating cellular clock function to physiological outcomes, it will be necessary to broaden our understanding of the means by which core clock mechanisms convey relevant timing information at a systems level.
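The link between premature nuclear entry of PER:CRY and a shortened period can be illustrated with a toy model. The Python sketch below is not taken from the studies cited above. It assumes a single clock protein whose synthesis is repressed by its own concentration a fixed delay earlier, with the delay standing in for the time required for accumulation and nuclear entry. The function name, the Hill-type repression term, and every parameter value are illustrative assumptions.

```python
def simulate_period(delay_h, hours=600.0, dt=0.01, beta=0.4, gamma=0.2, hill_n=10):
    """Estimate the oscillation period (h) of a delayed negative-feedback loop.

    Toy model (all values hypothetical):
        dP/dt = beta / (1 + P(t - delay)^hill_n) - gamma * P(t)
    The steady state is P = 1 for these parameter values.
    """
    steps = int(hours / dt)
    lag = int(round(delay_h / dt))
    p = [0.5] * (lag + 1)          # constant history before t = 0
    crossings = []                 # times at which P rises through 1.0
    for i in range(steps):
        delayed = p[-1 - lag]      # concentration one delay ago
        current = p[-1]
        nxt = current + dt * (beta / (1.0 + delayed ** hill_n) - gamma * current)
        if current < 1.0 <= nxt:
            crossings.append((i + 1) * dt)
        p.append(nxt)
    # Discard the first half of the run so that transients do not bias the estimate.
    late = [t for t in crossings if t > hours / 2]
    gaps = [b - a for a, b in zip(late, late[1:])]
    return sum(gaps) / len(gaps)


normal_period = simulate_period(6.0)   # longer feedback delay
mutant_period = simulate_period(4.0)   # shorter delay, mimicking premature nuclear entry
print(f"longer delay:  period = {normal_period:.1f} h")
print(f"shorter delay: period = {mutant_period:.1f} h")
```

In this sketch, shortening the feedback delay shortens the period, which is qualitatively the direction of change described for tau and FASPS. The absolute values carry no biological meaning.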
Biological Rhythms
Ubiquity of Circadian Clocks

The genes regulating circadian rhythmicity and their protein products have been found in numerous sites, including extra-SCN brain loci and in the periphery (Abe et al., 2002; Balsalobre, Damiola, & Schibler, 1998; Kriegsfeld, Korets, & Silver, 2003; Yamazaki et al., 2000). These findings led to questions regarding the unique nature of the master oscillator in the SCN, the functional significance of extra-SCN oscillators, and mechanisms of coordination of these widely dispersed clocks. As described in the following sections, the SCN serves to coordinate cellular oscillators throughout the CNS and periphery. Without a master clock, all other timing systems cease to function coherently, because independent cellular oscillators in target systems lose their coordination with one another. The loss of coherent rhythmicity in the SCN or peripheral tissue can be due to dampening of rhythms in individual cells or to loss of synchrony among a population of cells in the tissue. Use of Per1-luciferase reporter animals indicates that rhythms in peripheral tissues dampen and then disappear over time due to uncoupling (desynchronization) among oscillators that retain their individual rhythms (Nagoshi et al., 2004; D. Welsh, 2004). Presumably, peripheral clock cells normally get phase information from the SCN to synchronize individual oscillators to each other. In this view, the SCN sets the phase of peripheral circadian clocks daily, coordinating the activity of tissues and organs of the body relative to one another, thereby maintaining homeostasis. However, the phase of peripheral clocks is also significantly influenced by the daily food intake schedule (e.g., Stokkan, Yamazaki, Tei, Sakaki, & Menaker, 2001). In contrast to hamsters, which show strong circadian organization of nocturnal feeding, common voles (Microtus arvalis) feed throughout the day with an ultradian rhythm of 2 to 3 hours. 
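The distinction drawn above between damping of individual cells and loss of synchrony among cells can be made concrete with a small numerical sketch. The Python example below is an illustration only, not an analysis of the reporter data cited above. It assumes 200 hypothetical cells that each oscillate with undiminished amplitude but with slightly different free-running periods (evenly spaced between 23 and 25 hours, an arbitrary choice), all starting in phase.

```python
import math


def ensemble_amplitude(periods_h, day, samples_per_day=96):
    """Peak absolute value of the population-mean signal within one 24-h window."""
    start = day * 24.0
    peak = 0.0
    for k in range(samples_per_day):
        t = start + 24.0 * k / samples_per_day
        # Every cell is a cosine of amplitude 1; only the population mean is measured.
        mean = sum(math.cos(2.0 * math.pi * t / period) for period in periods_h)
        mean /= len(periods_h)
        peak = max(peak, abs(mean))
    return peak


n_cells = 200  # hypothetical population size
periods = [23.0 + 2.0 * i / (n_cells - 1) for i in range(n_cells)]

day0 = ensemble_amplitude(periods, 0)    # cells still in phase
day20 = ensemble_amplitude(periods, 20)  # cells have drifted apart
print(f"tissue-level amplitude, day 0:  {day0:.2f}")
print(f"tissue-level amplitude, day 20: {day20:.2f}")
```

Each simulated cell keeps an amplitude of 1 throughout, yet the tissue-level signal shrinks as phases drift apart. A periodic resetting signal, the role proposed here for the SCN, would restore the summed rhythm without changing any individual oscillator.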
In voles, the clock-gene mRNAs display high circadian amplitudes in the SCN, but display virtually no cyclicity in the liver (van der Veen et al., 2006).

Entrainment of the Circadian System

As suggested previously, in order to be adaptive for an organism, circadian rhythms must be synchronized to local environmental time. In addition to the primary visual pathway from retinal ganglion cells to the visual cortex, there is also a direct retinohypothalamic tract (RHT) projecting from the retina to the SCN (Klein & Moore, 1979; Moore & Klein, 1974; Figure 4.5). This second visual pathway is necessary and sufficient to entrain (synchronize) the SCN to the environmental light/dark cycle. If the primary visual pathway is transected at the level of the optic tract beyond the optic chiasm (i.e., caudal to the SCN), then the animal
Figure 4.5 The visual pathway mediating entrainment in mammals. Note. Light stimulates intrinsically photosensitive ganglion cells within the eyes that convey illumination information directly to the SCN of the hypothalamus. This pathway is separate from the classical visual system and uses melanopsin as the light-sensitive photopigment.
is visually blind, but the circadian system continues to respond to photic cues by synchronizing the animal to the light/dark cycle (Klein & Moore, 1979). These studies demonstrate the route whereby environmental photic information can reach the SCN. However, the specific retinal photoreceptor responsible for transmitting this signal was initially enigmatic, as entrainment was independent of traditional, image-forming photoreceptors; mice lacking both rod and cone photoreceptors (rd/rd) exhibit grossly normal entrainment even though they are visually blind (Foster et al., 1993; Freedman et al., 1999; Lucas, Freedman, Munoz, Garcia-Fernandez, & Foster, 1999; Van Gelder, 2001). These findings led to a search for a novel nonrod/noncone photoreceptor. Several years of rapid discovery beginning in this millennium identified a subset of light-responsive ganglion cells containing the photopigment melanopsin (Berson, Dunn, & Takao, 2002; Hannibal & Fahrenkrug, 2002). These ganglion cells project directly to the SCN and were initially thought to be the sole photoreceptors necessary for entrainment. However, melanopsin deficient mice exhibit only minor impairments in entrainment (Lucas et al., 2003; Panda et al., 2002; Ruby et al., 2002). This discrepancy was resolved by showing that entrainment is abolished in mice doubly mutant for both melanopsin and traditional rod/cone photoreceptors (Hattar et al., 2003; Panda et al., 2003). These findings suggest that traditional rod/cone photoreceptors project to
specialized light-responsive ganglion cells that then transmit this integrated photic information directly to the SCN. Thus, these two types of receptors likely work together to entrain the circadian clock, and either receptor type can support entrainment in the absence of the other.
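The logic of the knockout experiments summarized above can be written as a simple truth table. The Python sketch below is a deliberate simplification: it treats entrainment as all-or-none and ignores the minor impairments reported in melanopsin-deficient mice. The function name and genotype labels are illustrative and are not taken from the cited studies.

```python
def can_entrain(has_rods_cones: bool, has_melanopsin: bool) -> bool:
    """Entrainment is assumed to persist if either photoreceptor class remains."""
    return has_rods_cones or has_melanopsin


# (rods/cones present, melanopsin present) for each hypothetical genotype label
genotypes = {
    "wild type": (True, True),
    "rd/rd (no rods or cones)": (False, True),
    "melanopsin deficient": (True, False),
    "double mutant": (False, False),
}

for name, (rods_cones, melanopsin) in genotypes.items():
    print(f"{name}: entrains = {can_entrain(rods_cones, melanopsin)}")
```

Only the double mutant fails to entrain in this scheme, which matches the pattern of results described for the single and double mutants.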
Circadian Output and the Control of Behavior

Diffusible SCN Output

Early work in which SCN grafts were shown to restore circadian patterns in activity-related behaviors suggested that circadian rhythmicity could be supported by diffusible output from the clock (Lehman et al., 1987; Ralph et al., 1990; Silver, Lehman, Gibson, Gladstone, & Bittman, 1990). This supposition was based on the fact that transplants restored circadian function independent of the establishment of neural SCN connections with the host brain. This possibility was demonstrated definitively by encapsulating donor SCN tissue in a membrane that prevented neural outgrowth while allowing the diffusion of signals between graft and host; behavioral rhythms were still restored under these conditions (Silver, LeSauter, Tresco, & Lehman, 1996). One candidate diffusible signal is prokineticin-2 (PK2; Cheng et al., 2002). This protein is expressed rhythmically in the SCN and its receptor is present in all major SCN targets (Cheng, Bittman, Hattar, & Zhou, 2005; Cheng et al., 2002). In addition, PK2 administration during the night (when levels are low) inhibits wheel-running behavior. Whether this signal normally operates in a diffusible manner and/or is released synaptically requires further examination. A second candidate diffusible signal is transforming growth factor-alpha (TGF-alpha; A. Kramer et al., 2001). As with PK2, TGF-alpha is expressed rhythmically in the SCN and its administration inhibits wheel-running behavior. The receptor for TGF-alpha is also expressed in the subparaventricular zone (SPVZ), the major target of the SCN. Again, the degree to which TGF-alpha is released in a diffusible manner under normal conditions requires further study. Studies in which fiber output is eliminated from the SCN (allowing only for diffusible output), in conjunction with administration of PK2 and TGF-alpha antagonists, are necessary to begin to answer this question. 
Although it is intriguing to speculate on the role of these signals in communicating information from the SCN, the problem of unequivocally identifying an endogenous, physiologically relevant diffusible SCN signal is complex. The necessary and sufficient criteria to confirm the existence of a diffusible signal in a fluid volume have been summarized previously (Nicholson, 1999). First, there must be evidence that the removal or replacement of the signaling substance results in a change in the response being controlled, and an
assay of the substance should indicate that it is present or increases, or both, in a well-defined temporal relationship to the response (and similarly declines when the response disappears). In addition, evidence must be obtained that a fluid compartment is the conduit for a diffusible or transported signal. The signal must have access to and enter the compartment, and the fluid dynamics and turnover in the compartment should allow appropriate movement of the signal. Although PK2 and TGF-alpha meet some of these criteria, further research is necessary to clarify the role of these molecules in communicating circadian information.

Neural Control of Neurosecretory Factors

In contrast to behavioral rhythms (e.g., locomotion, drinking, gnawing), endocrine rhythms require neural projections from the SCN to endocrine targets; endocrine rhythms are abolished after knife cuts severing SCN fibers (Hakim, DeBernardo, & Silver, 1991; Nunez & Stephan, 1977) and are not restored in SCN-lesioned transplanted animals (Meyer-Bernstein et al., 1999; Nunez & Stephan, 1977; Silver et al., 1996), presumably due to inadequate neural innervation of the host brain by the graft. Further evidence for a neural SCN output signal regulating hormone secretion is seen in studies of female hamsters. When housed in constant light, the activity of a subset of hamsters “splits” into two separate activity bouts within a 24-hour interval. These split females display two daily preovulatory LH surges, each approximately half the concentration of a single surge in a nonsplit female (Swann & Turek, 1985). Under normal conditions, both halves of the bilaterally symmetrical SCN are active in synchrony. 
In ovariectomized, estrogen-implanted split hamsters examined during one of their activity bouts, however, activation of the SCN occurs on one side of the brain (monitored by expression of the immediate early gene FOS), but not on the other, suggesting that each half of the SCN can control an activity bout (de la Iglesia, Meyer, Carpino, & Schwartz, 2000). Remarkably, FOS activation in GnRH neurons was only seen on the side of the brain in which SCN FOS expression occurred (de la Iglesia et al., 2000; de la Iglesia, Meyer, & Schwartz, 2003). These findings suggest that the precise timing of the LH surge is derived from a neural signal originating in the SCN and communicated to ipsilateral GnRH neurons, as a diffusible output signal would reach both sides of the brain. Importantly, some hypothalamic sites are activated ipsilaterally, while others are activated either ipsilaterally or bilaterally in the split animal, again supporting the notion of multiple SCN output pathways (Yan, Foley, Bobula, Kriegsfeld, & Silver, 2005). Neural output from the SCN has been extensively investigated in rats and hamsters using tract-tracing techniques (Kalsbeek, Teclemariam-Mesbah, & Pevet, 1993;
Kriegsfeld, Leak, Yackulic, LeSauter, & Silver, 2004; Leak & Moore, 2001; Morin, Goodless-Sanchez, Smale, & Moore, 1994; Stephan, Berkley, & Moss, 1981; Watts & Swanson, 1987). Importantly, many of these monosynaptic projections target brain regions containing neuroendocrine cells producing hypothalamic-releasing hormones. Direct projections have been traced from the SCN to the medial preoptic area (MPOA), supraoptic nucleus (SON), anteroventral periventricular nucleus (AVPV), the paraventricular nucleus (PVN), the dorsomedial nucleus of the hypothalamus (DMH), and the lateral septum and the arcuate (Arc). The SCN also projects to the pineal through a multisynaptic pathway (Klein, 1985; Klein et al., 1983). There is abundant evidence for direct neural SCN control of neuroendocrine cell populations (Buijs, Hermes, & Kalsbeek, 1998; Buijs, van Eden, Goncharuk, & Kalsbeek, 2003; Egli, Bertram, Sellix, & Freeman, 2004; Gerhold, Horvath, & Freeman, 2001; Horvath, Cela, & van der Beek, 1998; Kalsbeek & Buijs, 2002; Kalsbeek, Fliers, Franke, Wortel, & Buijs, 2000; Kalsbeek, van Heerikhuize, Wortel, & Buijs, 1996; Kriegsfeld, Silver, Gore, & Crews, 2002; Van der Beek, Horvath, Wiegant, Van den Hurk, & Buijs, 1997; Van der Beek, Wiegant, Van der Donk, Van den Hurk, & Buijs, 1993; Vrang, Larsen, & Mikkelsen, 1995). Because these cell populations can regulate neurochemicals that are secreted into the CSF (Reiter & Tan, 2002; Skinner & Caraty, 2002; Skinner & Malpaux, 1999; Tricoire, Moller, Chemineau, & Malpaux, 2003) or general circulation, SCN-derived signals can control widespread systems in the brain and body. Considered together, the findings summarized previously suggest several possibilities. For example, behavioral rhythms may be controlled by a diffusible signal(s), whereas endocrine rhythms may require neural output. 
Alternatively, behavioral and endocrine rhythms may both be supported by diffusible signals, but the threshold for supporting behavioral rhythms is lower. Finally, behavioral rhythms may be controlled by both neural and diffusible signals, either of which can maintain rhythmic function, while endocrine rhythms can only be supported via neural connections. Definitive identification of biologically significant endogenous diffusible signal(s) and the precise mode of SCN control is a current line of inquiry.
SEASONAL CHANGES IN BRAIN AND BEHAVIOR

In common with daily fluctuations in energy availability, intake, and requirements, energy availability and requirements vary throughout the year, and individuals must parse their various activities across the year to maximize energy
utilization and survival. Thus, all energetically expensive activities (e.g., mating, migrating, foraging, nest building, and thermoregulation) cannot be simultaneously performed. To allow anticipation of annual fluctuations in energy conditions and synchronization of their behavior and physiology accordingly, individuals have evolved mechanisms to determine time of year. In common with the circadian system, seasonal timekeeping mechanisms permit the coordination of internal physiology, and also allow animals to predict recurring seasonal events, such as food availability, general weather conditions, and predator activity, and adjust their behavior appropriately.

Seasonal Changes in Reproductive Function

Whereas many seasonal changes in behavior have been documented, breeding seasons represent the most salient seasonal change in behavior. In addition to mating behaviors, other associated behaviors such as food intake, aggression, and territorial defense show marked seasonal fluctuations. In general, small animals tend to breed during the long days of spring and summer, whereas large animals with relatively extended periods of gestation breed during the short days of autumn. In both cases, offspring are produced in the spring when food availability is at a seasonal maximum. The specific timing of breeding represents a selective compromise among competing trade-offs such as the availability of food for gestation or lactation. Syrian hamsters (Mesocricetus auratus) have been studied as the exemplar of a long-day breeder. Individuals of this species reduce circulating concentrations of reproductive steroid hormones, LH, and follicle-stimulating hormone (FSH) after exposure to simulated winter day lengths or appropriate melatonin treatment (Bartness, Powers, Hastings, Bittman, & Goldman, 1993; Swann & Turek, 1988). 
In sheep, the most commonly studied short-day breeder, exposure to short days leads to increased LH secretion, manifested as an increase in the frequency of pulsatile LH secretion (Lehman et al., 1997). Such changes in pituitary gonadotropin secretion lead to changes in gonadal growth and gonadal steroid hormone secretion and the subsequent onset of mating behavior during the autumn. Circulating testosterone concentrations decrease rapidly in response to the short day via enhanced negative feedback effects on gonadotropin-releasing hormone (GnRH) secretion in male rodents (Turek, 1977). Short-day male rodents display lower circulating testosterone concentrations compared with long-day males. Because testosterone modulates many behaviors (in addition to reproduction), this hormone is another potential mediator of seasonal adjustments in the brain, which could act alone or in combination with melatonin or other hormones. Examples of testosterone affecting
seasonal behaviors include the combined effect of testosterone and photoperiod on locomotor activity in male Syrian hamsters (Ellis & Turek, 1983) and the reinstatement by exogenous testosterone of female odor preference in castrated male meadow voles maintained in long or short days (Ferkin & Gorman, 1992). Testosterone replacement prevents elevated food intake in castrated male deer mice (Blank, Korytko, Freeman, & Ruf, 1994) and restores aggressive behaviors in Syrian hamsters maintained in short days (Jasnow, Huhman, Bartness, & Demas, 2000). Although no correlation has been reported between testosterone concentrations and spatial learning and memory among male meadow voles (Galea, Kavaliers, Ossenkopp, & Hampson, 1995), short days reduce testosterone, hippocampal volume, and spatial learning in deer mice (Perrot-Sinal, Kavaliers, & Ossenkopp, 1998). However, direct manipulation of testosterone in seasonally breeding rodents and the potential interaction between photoperiod and testosterone have rarely been investigated in learning and memory paradigms (Pyter, Trainor, & Nelson, 2006). Testosterone increases the volume of song-production brain regions in songbirds (Ball et al., 2004) and alters neurochemistry (Bittman, Tubbiola, Foltz, & Hegarty, 1999) and neuroanatomy (Gomez & Newman, 1991) in photoperiodic rodents. For example, testosterone increases the number of cells expressing the proopiomelanocortin gene in the hypothalamus (Bittman et al., 1999) and restores neuronal branching and morphology within the amygdala (Gomez & Newman, 1991) of castrated Syrian hamsters. Steroid hormones have important effects on learning and memory, although there are species differences. Despite the presence of androgens and estrogens in both males and females, for the most part studies on males have focused on the role of androgens whereas studies on females have focused on the role of estrogens. 
In addition to manipulating photoperiod to induce seasonal changes in brain and behavior, studies have mimicked similar seasonal results by manipulating melatonin, the physiological signal into which ambient photoperiod is transduced (Bartness et al., 1993; Carter & Goldman,
Figure 4.6 Input pathway from the retina to the pineal gland in mammals. (Structures labeled in the figure: eye, OC, SCN, PVN, SCG, pineal.)
1983; Nelson, Badura, & Goldman, 1990). The synthesis of melatonin occurs exclusively at night and is inhibited directly by light. The duration of melatonin release is proportional to the length of the dark phase (Illnerova, Hoffmann, & Vanecek, 1984); consequently, short-day animals experience longer durations of melatonin than do long-day animals. Infusion of appropriate duration profiles of physiological concentrations of melatonin to pinealectomized hamsters has shown the critical role of melatonin duration in signaling day length (Bartness et al., 1993). Peromyscus and other small rodents, whose gestations are relatively short (about 3 weeks), are long-day breeders and respond reproductively to long-day (short-duration) melatonin signals. Melatonin affects circadian rhythms of behavior (Golombek, Pevet, & Cardinali, 1996), but also provides seasonal information throughout the body. Melatonin receptors are distributed discretely throughout the rodent brain (Drew et al., 2001; Dubocovich, Rivera-Bermudez, Gerdin, & Masana, 2003; Weaver, Carlson, & Reppert, 1990). Importantly, melatonin receptors are present in the hippocampal area (entorhinal cortex) and other nonreproductive regions of rodent brains (Musshoff, Riewenherm, Berger, Fauteck, & Speckmann, 2002; Weaver et al., 1990), providing a neural substrate for direct effects of melatonin on learning and memory and other cognitive and motivated behaviors (described next). Manipulation of melatonin duration to mimic long or short days induces seasonal changes in behavior. Removal of the source of melatonin via pinealectomy impairs photoperiodic responses (Bartness & Goldman, 1989; Goldman, 2001). The neural mechanisms regulating circadian changes in melatonin secretion have been well characterized (Figure 4.6). Circadian rhythms in melatonin secretion in most mammals depend on neural efferents from the SCN to the region of the paraventricular nucleus of the hypothalamus (PVN). 
This projection continues through the medial forebrain bundle to the superior cervical ganglion (SCG) of the spinal cord. From the SCG, sympathetic neurons drive pineal melatonin secretion during the dark, while melatonin production and secretion are inhibited during the light
Note. Light information is transduced into a neural signal in the retina and transmitted via a direct retino-hypothalamic tract to the suprachiasmatic nucleus (SCN). From the SCN, fibers synapse in the paraventricular nucleus of the hypothalamus (PVN). From the PVN, fibers travel through the medial forebrain bundle to the superior cervical ganglion (SCG). Postganglionic fibers from the SCG then project to the pineal gland to modulate melatonin production/secretion. OC = optic chiasm.
portion of the LD cycle (Cassone, 1990; Ganguly, Coon, & Klein, 2002). This multisynaptic pathway has been confirmed using the transneuronal retrograde tracer, pseudorabies virus, injected into the pineal gland (Card, 2000; Larsen, Enquist, & Card, 1998). This technique confirmed the links in the pathway from the SCN to the pineal gland, and also suggested that two parallel circuits from the SCN (one from the dorsomedial and one ventrolateral) likely drive melatonin secretion. This day-night regulation of melatonin is regulated by the SCN (for a review, see Cassone, 1990), and, as with other hormonal systems mentioned previously, lesions of the SCN abolish circadian rhythms in melatonin production and secretion (Scott, Jansen, Kao, Kuehl, & Jackson, 1995; Tessonneaud, Locatelli, Caldani, & Viguier-Martinez, 1995). Although melatonin influences GnRH secretion, this hormone does not appear to act directly on GnRH neurons. Given the widespread distribution of the GnRH system, it has been difficult to determine the melatonin-sensitive systems that, in turn, act on GnRH neurons to regulate their activity. Because regression of the reproductive axis is, in part, due to melatonin-mediated increases in the negative feedback of sex steroids, those brain regions co-expressing melatonin receptors and androgen receptors may be critical in this response. The dorsomedial hypothalamus (DMH) of Syrian hamsters binds both melatonin and androgen with high affinity (Maywood, Bittman, & Hastings, 1996). Lesions of the DMH block short-day and melatonin-induced regression of the reproductive system (Lewis, Freeman, Dark, Wynne-Edwards, & Zucker, 2002; Maywood et al., 1996), suggesting that the DMH is a key target of melatonin in this species. In other species such as Siberian hamsters, melatonin feedback to the SCN is critical for melatonin-induced seasonal alterations in several physiological parameters. 
Lesions of the SCN block the effects of daily, long-duration melatonin infusions (i.e., short-day pattern) on body mass, fat pad distribution, and reproductive function (Bartness, Goldman, & Bittman, 1991; Bittman, Bartness, Goldman, & DeVries, 1991). This finding suggests that the circuit beginning with the SCN also requires the SCN as a target. In addition to acting on the DMH, melatonin binding is largely seen in the pars tuberalis of seasonally breeding mammals (Bittman & Weaver, 1990; Weaver et al., 1990). In hypothalamic-pituitary transected sheep, melatonin implants in region of the pars tuberalis reduce prolactin secretion in a manner similar to short-day lengths (Lincoln & Clarke, 1997). However, melatonin implants in this region do not affect gonadotropin secretion, suggesting that this region may be important for the regulation of the lactotropic, but not gonadotropic, photoperiodic effects (Lincoln & Clarke, 1997). Research on the means by which melatonin regulates other parameters is still in its infancy.
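The duration code for day length described earlier in this section can be expressed as a short calculation. The Python sketch below assumes, as a simplification, that nightly melatonin secretion spans the entire dark phase (a proportionality constant of 1); the text states only that duration is proportional to the length of the dark phase. The function name and the two example photoperiods are illustrative choices.

```python
def melatonin_duration_h(light_h: float, cycle_h: float = 24.0) -> float:
    """Hours of nightly melatonin, assuming secretion fills the whole dark phase."""
    if not 0.0 <= light_h <= cycle_h:
        raise ValueError("light_h must lie between 0 and cycle_h")
    return cycle_h - light_h


long_day_signal = melatonin_duration_h(16.0)   # 16 h light : 8 h dark
short_day_signal = melatonin_duration_h(8.0)   # 8 h light : 16 h dark
print(f"long-day melatonin signal:  {long_day_signal:.0f} h")
print(f"short-day melatonin signal: {short_day_signal:.0f} h")
```

Under this assumption, a short-day animal receives a melatonin signal twice as long as that of a long-day animal. This difference in duration is the kind of signal that timed infusions in pinealectomized hamsters are designed to mimic.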
It is unclear whether these melatonin-sensitive targets project directly on the GnRH system or indirectly via projections to other peptidergic systems. One candidate intermediary system is kisspeptin, a peptide recently shown to have marked stimulatory effects on the GnRH system. After chronic exposure to short days, Siberian hamsters with suppressed reproductive function exhibit marked reduction in kisspeptin cell labeling in the anteroventral periventricular nucleus, a neural target of the SCN (Greives et al., 2007; Mason et al., 2007). In common with other species, a subset of individual Siberian hamsters fails to respond to day-length information, however, and maintains their reproductive function. These so-called photoperiod nonresponsive individuals exhibit kisspeptin expression akin to that of long-day animals, suggesting that short photoperiods (and melatonin) are ignored by the brains of these animals. These results suggest an important role for kisspeptin in coordinating and relaying environmentally relevant information to the reproductive axis as well as a role for this peptide in regulating seasonal changes in reproductive function (Revel et al., 2006, 2007). A number of studies suggest a role for clock genes in the control of seasonality (Hofman, 2004; Johnston et al., 2003; Lincoln, Andersson, & Loudon, 2003). In Syrian and Siberian hamsters, photoperiod alters the duration of clock and clock-controlled gene expression, while the amplitude of gene expression is influenced by photoperiod in the pars tuberalis (Johnston et al., 2003; Messager, Hazlerigg, Mercer, & Morgan, 2000). In sheep, however, the relative timing of clock genes is altered by photoperiod in the pars tuberalis, providing a mechanism of temporal encoding and downstream control (Hazlerigg, Andersson, Johnston, & Lincoln, 2004; Lincoln et al., 2003; Lincoln, Johnston, Andersson, Wagner, & Hazlerigg, 2005; Lincoln, Messager, Andersson, & Hazlerigg, 2002). 
These correlational results are intriguing and suggest that phase and/or amplitude of clock and clock-controlled genes in SCN brain targets and endocrine glands may predict their responsiveness to upstream signals on a daily schedule.

Seasonal Changes in the Avian Song System

Associated with seasonal breeding are activities related to reproduction such as territorial defense, migration, or communication. For example, male canaries sing more frequently in spring than in winter, and they appear to lose components of their songs after each breeding season and incorporate new song components each spring (Brenowitz & Beecher, 2005). In spring, as day length increases, the testes grow and secrete androgens, song frequency increases, song repertoire enlarges, and the higher vocal center (HVC) and the robust nucleus of the archistriatum (RA), two brain
nuclei necessary for song production, double in size. Under autumnal day lengths, testes regress in size and androgen production virtually stops, frequency of singing decreases, song repertoire shrinks, and the HVC and RA regress in size (Nottebohm, 2005). Treatment with testosterone in autumn mimics spring hormonal conditions and supports song production. The seasonal plasticity in behavior has been assumed to reflect the seasonal changes in brain morphology induced by hormonal adjustments (Tramontin & Brenowitz, 2000), although the precise mechanisms underlying seasonal plasticity in song complexity remain unspecified (Brenowitz, Lent, & Rubel, 2007). Estrogens, converted from testicular androgens or produced de novo in CNS neurons (Schlinger, Soma, & London, 2001), appear necessary to activate the neural mechanisms underlying the song system in birds. Androgens enter neurons containing aromatase, which converts them to estrogens. Aromatase is generally localized in neurons adjacent to other neurons containing estrogen receptors in the hypothalamus and preoptic area of songbird brains, as well as in limbic structures and in the structures constituting the neural circuit controlling bird song. The brain appears to be the primary source of estrogens, which activate masculine behaviors in many bird species (London, Monks, Wade, & Schlinger, 2006). Photoperiod is important in some birds to mediate these changes, including recruitment of new neurons (Nottebohm, 2005). Testosterone, or its metabolites including estrogens, appears to drive the seasonal changes in brain structure and behavior. In addition to seasonal changes in singing behavior, substantial seasonal changes occur in the morphology of several song nuclei. For example, the volume of the HVC and RA increases by 99% and 77%, respectively, among male canaries maintained under spring day lengths (>12 hours light/day) relative to birds in autumnal conditions (<12 hours light/day; Nottebohm, 1981). 
Similar results have been reported for more than 25 other bird species (reviewed in Ball, Riters, & Balthazart, 2002). These seasonal changes in the size of specific brain structures are probably mediated by testosterone or its metabolites (Ball et al., 2002; London et al., 2006). Testosterone appears to act via brain-derived neurotrophic factor (BDNF) to promote survival of new neurons in the brains of adult songbirds (Rasika, Alvarez-Buylla, & Nottebohm, 1999). Testosterone upregulates BDNF in the HVC of adult male and female canaries and infusion of antibody against BDNF blocks androgen-induced neurogenesis. Behavioral feedback is important because singing increases BDNF expression in the HVC in proportion to the number of songs produced (Li, Jarvis, Alvarez-Borda, Lim, & Nottebohm, 2000). BDNF seems to determine life expectancy of newly born cells in adult songbird brains.
Vernal elevation in testosterone concentrations coincident with long days provokes BDNF production that protects new neurons within 10 days of cell birth (Alvarez-Borda, Haripal, & Nottebohm, 2004). Other avian models of seasonal brain plasticity include seasonal change in hippocampal volume of food-caching (Hoshooley & Sherry, 2007; Smulders, Sasson, & DeVoogd, 1995) and brood parasitic (Sherry, Forbes, Khurgel, & Ivy, 1993) birds. The hippocampus is involved in spatial learning and memory, and generally, species with a larger relative hippocampal volume display better spatial learning and memory (Sherry, Jacobs, & Gaulin, 1992). Hippocampal size is reduced during the winter in these bird species when spatial learning and memory performance associated with food storing (Barnea & Nottebohm, 1994, 1996) and nest parasitism (Sherry et al., 1993) is reduced. Comparable studies of seasonal brain plasticity in mammals have been relatively rare, despite the fact that seasonal breeding has been well-documented in nontropical mammals. Seasonal Changes in Mammalian Brain and Behavior Although the brain constitutes only about 2% to 3% of the total body mass in rodents, it consumes over 10% of total energy expenditure (Mink, Blumenschine, & Adams, 1981). Thus, minor reductions in brain mass could save significant energy (Jacobs, 1996). Seasonal changes in brain weight have been documented in rodents and shrews (Yaskin, 1984). Brain weights are higher in summer than winter (Yaskin, 1984). Winter brain mass is reduced in grey squirrels (Sciurus carolinensis; Lavenex et al., 2000b) and ferrets (Mustela putorius; Weiler, 1992). Also, brain mass (absolute and corrected for body mass) and specific brain regions (e.g., hippocampus) are reduced during winter in common shrews (Sorex araneus) and bank voles (Clethrionomys glareolus; Yaskin, 1984). 
A significant part of the seasonal change in brain weight might merely reflect differences in water content; however, the neocortex and the basal portion of the brains (i.e., the corpus striatum) of rodents and shrews show seasonal cytoarchitectural changes. Seasonal changes in hippocampal morphology have been reported in hibernating ground squirrels (Citellus undulatus; Popov, Bocharova, & Bragin, 1992) and meadow voles (Microtus pennsylvanicus; Galea & McEwen, 1999), although hippocampal volume did not vary with season in grey squirrels (Lavenex, Steele, & Jacobs, 2000). In European hamsters (Cricetus cricetus) and sheep, seasonal changes in the innervation of the brain have been reported (Buijs et al., 1986; Xiong, Karsch, & Lehman, 1997). Seasonal changes in neuroendocrine function (Wehr, 1998), hypothalamic peptide expression (Hofman & Swaab,
1995), and serotonin function (Alstadhaug et al., 2005; Brewerton & George, 1990) have also been reported in humans. Seasonal changes in human behavioral pathology are also observed in anxiety and depression (Enns et al., 2006; Lewy, Lefler, Emens, & Bauer, 2006; Rosenthal et al., 1984; Sigmon et al., 2007), migraine headaches (Alstadhaug et al., 2005; Brewerton & George, 1990), as well as incidence, severity, and mortality of strokes (Carolei, Marini, De Matteis, Di Napoli, & Baldassarre, 1996; H. Wang, Sekine, Chen, & Kagamimori, 2002; Y. Wang et al., 2003). Thus, despite the relative lack of seasonal organization of reproductive function, it is apparent that humans retain responsiveness to photoperiod (reviewed in Bronson, 1995, 2004), and that photoperiod-mediated adjustments in rodent brain and behavior may be important to understand seasonal changes in human brain and behavior. Photoperiod also affects cell division in the dentate gyrus and subependymal zone of adult mammals (Huang, DeVries, & Bittman, 1998). Long-term exposure to short days in Syrian hamsters doubles the number of new neurons produced in these brain regions, as well as in the hypothalamus and cingulate-retrosplenial cortex (Huang et al., 1998). There are no appreciable photoperiodic differences in brain volume of either the granule cell layer of the hippocampus or the dentate. These results are in contrast to studies of Peromyscus which likely represents a species difference. No differences in cell proliferation are observed in brains of grey squirrels (Lavenex et al., 2000), although the use of low doses of BrdU may not pick up differences (Gould & Gross, 2002). Motoneurons controlling penile muscles, neuromuscular junctions, and somas are smaller in adult male Siberian hamsters and white-footed mice exposed to short days compared with animals exposed to long days (Forger & Breedlove, 1987). 
Adult male deer mice (Peromyscus maniculatus) maintained in short days have smaller brains and lower adjusted hippocampal volume relative to those maintained in long days (Perrot-Sinal et al., 1998). Thus, several examples of seasonal influences on several brain measures exist, but what is lacking is an understanding of how seasonal factors influence neural structures in mammals. It is likely that seasonal brain changes will be subtle in mammals compared to birds because birdsong generally displays an all-or-none seasonality; seasonal changes in learning and memory are consistent, albeit subtle, and such variation likely provides important grist for evolutionary processes. Seasonal Changes in Learning and Memory Deer mice and white-footed mice tested during the breeding season display better spatial learning performance compared to mice tested during the nonbreeding season (Galea,
Kavaliers, Ossenkopp, Innes, & Hargreaves, 1994; Pyter, Reader, & Nelson, 2005). This may represent differences in territory size, and therefore spatial memory requirements (Jacobs, 1996). The relationship among home range size and spatial learning and memory has been established in several rodent species. For example, male Peromyscus maniculatus have larger home ranges than females during the breeding season that decline in winter (Bronson, 1985, 1988). This is paralleled by superior spatial learning and memory performance in male Peromyscus as compared to females in long, but not short, days (Galea, Kavaliers, & Ossenkopp, 1996; Galea et al., 1994). The effects of photoperiod on learning and memory are consistent with a life-history explanation of reproductive strategy. Both hippocampal size and spatial learning and memory performance are sexually dimorphic among polygynous rodents (Jacobs, 1996). Polygynous male rodents outperform females in spatial tasks and have larger hippocampi (Galea et al., 1994, 1996; Jacobs, 1996). In common with birds, spatial learning and memory performance is often positively correlated with hippocampal size in mammals (Jacobs, Gaulin, Sherry, & Hoffman, 1990; Sherry, Jacobs, & Gaulin, 1992). In food caching birds, one might predict that hippocampal volume increases during winter (Pravosudov & Clayton, 2002), and that caching behaviors and hippocampal volume should increase among individuals inhabiting particularly harsh conditions. Indeed, black-capped chickadees (Poecile atricapillus) from the most harsh conditions (i.e., exposed to lowest temperatures, shortest day lengths, and most snow cover) had the largest hippocampal volumes and most hippocampal neurons compared to birds from mild conditions (Roth & Pravosudov, 2008). Thus, environmental conditions can shape specific brain structures in precise ways to enhance survival and reproductive success. 
Seasonal Changes in Aggression It is well documented that androgens regulate aggressive behavior, especially the dramatic male-male interactions associated with mating territories. For instance, many male mammals and birds set up and defend territories before the onset of the breeding season (Goymann, Landys, & Wingfield, 2007; Wingfield, Jacobs, & Hillgarth, 1997). Females often choose males on the basis of resources available in these territories so evolutionary pressures are high on males to compete successfully. Aggressive behavior is costly, both in terms of energy and in terms of potential injury or death. Thus, individuals often exchange information about the likely outcome of the aggressive encounter without actual combat. The energetic costs and survival from wounds vary seasonally leading to fluctuations in
the likelihood of aggressive encounters as the cost-benefit ratio changes. Individuals of several rodent species undergo a seasonal shift from highly territorial, asocial behavior during the breeding season to a social, highly interactive existence during winter. Such species typically undergo reproductive quiescence at the end of the breeding season in response to short days. The resulting decrease in androgen secretion may be necessary or permissive for the seasonal shift in sociality, but in wood rats (Neotoma fuscipes) nonsteroidal mechanisms mediate the seasonal change in social behavior (G. S. Caldwell, Glickman, & Smith, 1984). The seasonal change in social organization confers several advantages. During the breeding season, rodents control resources that promote their own survival and that of their offspring, and they often aggressively exclude nonkin from access to resources. During the winter, however, this strategy is abandoned in favor of group living that conserves energy and enhances survival in the face of low temperatures and reduced food availability. Many species of rodents conserve energy during the winter by forming aggregations of huddling animals (Madison, 1984). In these aggregations, different sexes and even different species commingle. Even in the absence of huddling behavior, animals may tolerate one another better in close quarters during the winter than during the breeding season. For example, male meadow voles (Microtus pennsylvanicus) are highly territorial in the spring and summer and occupy open meadows, whereas red-backed voles (Clethrionomys gapperii) breed in forest habitats. During the winter months, meadow voles migrate into the spruce forest habitats occupied by the redbacked voles, presumably to take advantage of the protective cover provided by the trees. In some cases, they share nests with individuals of other rodent species (Madison, 1984). 
Consistent with a role for androgens in mediating both mating and aggression, male meadow voles trapped during the winter and tested in paired encounters in a neutral arena displayed less interspecific aggression than voles trapped in summer. The winter reduction in aggressiveness permits energy-saving group huddling. As the animals enter breeding condition in the spring, they reestablish mutually exclusive territories. Some males within a population do not undergo reproductive regression, maintaining testicular function and producing sperm and androgens during simulated winter conditions (for review, see Prendergast, Kriegsfeld, & Nelson, 2001). The advantages of continuous breeding capability evidently carry substantial hidden costs because only a minority of each population adopts this strategy. One such cost may be that reproductively competent males, because of unusual aggressiveness during the winter, are unable to participate in communal huddling and thus incur greater energetic costs in overwintering.
High behavioral and energetic costs associated with the maintenance of the reproductive system in winter may explain why nonregressive types do not normally predominate in temperate or boreal-zone populations of rodents (Prendergast et al., 2001). This observation is supported by a field study of winter nesting behavior of prairie voles (M. ochrogaster; McShea, 1990). Most voles in the population studied were reproductively inactive during the winter and formed groups of huddling individuals. Two males, however, remained in breeding condition and never huddled with other animals. In pairwise aggression tests, these two males were much more aggressive than reproductively quiescent individuals. In another study, reproductive status also influenced odor preferences of meadow voles maintained in simulated winter day lengths (Gorman, Ferkin, Nelson, & Zucker, 1993). Males that retained reproductive capability in winter day lengths preferred the odors of females that also failed to inhibit reproduction during short days. This preference may facilitate the sporadic occurrences of winter breeding frequently reported for this species (reviewed in Nelson, 1987). As noted, many if not most species reduce aggression outside of the breeding season. For some species, however, aggression must be maintained throughout the year independent of breeding season. For example, over-wintering migrant birds must compete with residents within local feeding flocks. In such cases, aggression would likely be modulated by mechanisms unrelated to reproduction. In hamsters, short days increase male resident-intruder aggression (P. sungorus; Demas, Polacek, Durazzo, & Jasnow, 2004; Wen, Hotchkiss, Demas, & Nelson, 2004; M. auratus; H. K. Caldwell & Albers, 2004; Garrett & Campbell, 1980; Jasnow et al., 2000). This effect is paradoxical because increased aggression occurs when testosterone concentrations are at a nadir. 
Despite the lack of plasma androgens, estrogens may still be important (see the following discussion). Adrenalectomy prevents increased aggression in short days in Siberian hamsters (Demas et al., 2004). This mechanism may be involved in winter aggression of birds. Studies of zebra finches (Taeniopygia guttata) show that the adrenal hormone dehydroepiandrosterone (DHEA) can be indirectly converted into estrogens within the brain (Soma, Alday, Hau, & Schlinger, 2004); however, DHEA does not appear to influence aggression in Siberian hamsters (Scotti, Belén, Jackson, & Demas, 2008). In common with hamsters, beach mice (Peromyscus polionotus) show increased aggression in short day lengths. Hormone manipulation studies revealed that the estrogen receptor subtype α (ERα) agonist PPT (propylpyrazole-triol) and the ERβ agonist DPN (diarylpropionitrile) increased aggression in short-day P. polionotus and decreased aggression in long-day mice (Trainor, Lin, Finy, Rowland, &
Nelson, 2007). These results suggested that photoperiod regulates processes that occur after estrogens bind to their appropriate receptors. Steroid hormones, including estrogens, can affect physiological and behavioral processes via slow (hours to days) genomic or fast (seconds to minutes) nongenomic pathways (Vasudevan & Pfaff, 2007). Estradiol injections in beach mice resulted in rapid (within 15 minutes) increases in aggression in short-day mice, but not long-day mice (Trainor et al., 2007). This suggests that estradiol increases aggression via nongenomic actions in mice housed in short days, but not in mice housed in long days. Moreover, gene chip analyses indicated that estrogen-dependent expression of genes containing estrogen response elements in their promoters was decreased in the bed nucleus of the stria terminalis (BNST) of short-day mice compared with that of long-day mice. These data suggest that the environment regulates the effects of steroid hormones on aggression in P. polionotus by determining the molecular pathways that are activated by steroid receptors (Nelson & Trainor, 2007). Another factor that plays an important role in regulating aggression is neuronal nitric oxide synthase (nNOS or NOS-1; 11). nNOS produces the neurotransmitter nitric oxide (NO) as a by-product of the conversion of arginine into citrulline in the central and peripheral nervous systems (Nelson & Trainor, 2007). Nitric oxide produced from neurons appears to be involved in regulating some aggressive behaviors. For example, male mice with targeted disruption of the nNOS gene (nNOS–/–) display sustained aggressive behavior and persistent sexual behavior (Nelson & Chiavegatto, 2001; Nelson et al., 1995). Castration and testosterone replacement studies indicate that testosterone is necessary, but not sufficient, to provoke elevated aggressive behavior in nNOS–/– mice (Kriegsfeld, Dawson, Dawson, Nelson, & Snyder, 1997).
Because gonadal steroid hormones influence neuronal nitric oxide synthase (nNOS), and this enzyme has been implicated in aggressive behavior, nNOS expression was hypothesized to be decreased in short-day male Siberian hamsters and negatively correlated with the display of territorial aggression. Again, all short-day housed hamsters were significantly more aggressive than long-day animals, regardless of whether they regressed (responsive) or maintained (nonresponsive) gonadal size or testosterone concentrations (Wen et al., 2004). Short-day animals, both reproductively responsive and nonresponsive morphs, also displayed significantly fewer nNOS-immunoreactive cells in brain areas associated with aggression including the anterior and basolateral amygdaloid areas and paraventricular nuclei as compared to long-day hamsters. Together, these results suggest that seasonal aggression in male Siberian hamsters is regulated by photoperiod, through mechanisms that are likely independent from gonadal steroid hormones.
SUMMARY
The predictable daily and seasonal changes in the environment have led to the evolution of clock mechanisms that permit anticipation of daily and seasonal events to maximize reproductive fitness and survival. Responses to natural light cycles result in adaptive temporal organization in humans and other animals. Since the invention and use of electrical lights, starting around the turn of the twentieth century, this temporal organization has been dramatically altered. Light at night has significant social, ecological, behavioral, and health consequences that are only now becoming apparent for both daily and seasonal rhythms. Increased understanding of circadian organization may lead to individually tailored medical treatments. Appreciation that there are individuals who are “morning larks” or “night owls” may lead to individualized learning programs that schedule cognitively demanding tasks at circadian-appropriate times of day. As we learn more about entrainment effects of light, improvements in lighting schedules or melatonin treatment for night workers should be made. Understanding the genes underlying biological clock function has been a remarkable tour de force in molecular biology. As more precision in the mechanisms underlying biological clock function is revealed, we need to improve understanding of the output signals of the central clocks to the periphery. How are circadian rhythms in cell cycling, metabolism, or physiology entrained and maintained? What are the consequences of poor coupling among oscillators? Additional research is necessary to appreciate the health effects of faulty circadian output signals. Behavioral phenotype is the result of gene and environment interactions. As we have become increasingly sophisticated about the mechanisms underlying gene expression, we need to simplify the environmental variables that affect gene expression to improve our studies.
Studies of the effects of photoperiod on behavior are important because they allow precise environmental probing of gene-environment interactions. Future research will capitalize on the use of simple, yet precise, environmental manipulations to understand the behavioral effects of differential gene expression. Additional studies of the effects of global warming on organisms low on the food chain (e.g., plants and insects) are also warranted. If the timing of abundance of plants and insects shifts, then this will affect breeding success of amphibians, reptiles, birds, and mammals that rely solely on photoperiod to time reproduction. Shifts in seasonal links among food, survival, and reproduction may already be affecting populations of amphibians and important pollinating insects, which could have enormous effects on our food availability and survival.
Importantly, understanding biological clocks and their associated biological rhythms is critical for a full understanding of behavior. Behavior is not constantly expressed, but rather, displays a temporal organization across the day or year. In order to understand why a behavior occurs, an appreciation of the temporal organization of individuals is necessary. It is not appropriate to study learning and memory in nocturnal rats tested in the middle of our day. It is not appropriate to maintain animals on seasonally ambiguous photoperiods of 12 hours of light and 12 hours of dark per day. Not attending to biological rhythms will confound behavioral studies of humans and other animals.
REFERENCES Abe, M., Herzog, E. D., Yamazaki, S., Straume, M., Tei, H., Sakaki, Y., et al. (2002). Circadian rhythms in isolated brain regions. Journal of Neuroscience, 22, 350–356. Abizaid, A., Gao, Q., & Horvath, T. L. (2006). Thoughts for food: Brain mechanisms and peripheral energy balance. Neuron, 51, 691–702. Alstadhaug, K. B., Salvesen, R., & Bekkelund, S. I. (2005). Seasonal variation in migraine. Cephalalgia, 25, 811–816. Alvarez-Borda, B., Haripal, B., & Nottebohm, F. (2004). Timing of brainderived neurotrophic factor exposure affects life expectancy of new neurons. Proceedings of the National Academy of Sciences, USA, 101, 3957–3961. Anand, B. K., & Brobeck, J. R. (1951a). Hypothalamic control of food intake in rats and cats. Journal of Biological Medicine, 24, 123–140. Anand, B. K., & Brobeck, J. R. (1951b). Localization of a “feeding center ” in the hypothalamus of the rat. Proceedings of the Society of Experimental Biological Medicine, 77, 323–324. Aravich, P. F., & Sclafani, A. (1983). Paraventricular hypothalamic lesions and medial hypothalamic knife cuts produce similar hyperphagia syndromes. Behavioral Neuroscience, 97, 970–983. Archer, S. N., Robilliard, D. L., Skene, D. J., Smits, M., Williams, A., Arendt, J., et al. (2003). A length polymorphism in the circadian clock gene Per3 is linked to delayed sleep phase syndrome and extreme diurnal preference. Sleep, 26, 413–415. Aschoff, J. (1965). Circadian rhythms in man. Science, 148, 1427–1432. Aschoff, J. (1981). Annual rhythms in man. In J. Aschoff (Ed.), Handbook of behavioral neurobiology (Vol. 4, pp. 475–490). New York: Plenum Press. Babkoff, H., Caspy, T., Mikulincer, M., & Sing, H. C. (1991). Monotonic and rhythmic influences: A challenge for sleep deprivation research. Psychological Bulletin, 109, 411–428. Baggs, J. E., & Green, C. B. (2003). Nocturnin, a deadenylase in Xenopus laevis retina: A mechanism for posttranscriptional control of circadianrelated mRNA. Current Biology, 13, 189–198. 
Ball, G. F., Auger, C. J., Bernard, D. J., Charlier, T. D., Sartor, J. J., Riters, L. V., et al. (2004). Seasonal plasticity in the song control system: Multiple brain sites of steroid hormone action and the importance of variation in song behavior. Annals of the New York Academy of Sciences, 1016, 586–610. Ball, G. F., Riters, L. V., & Balthazart, J. (2002). Neuroendocrinology of song behavior and avian brain plasticity: Multiple sites of action of sex steroid hormones. Frontiers in Neuroendocrinology, 23, 137–178. Balsalobre, A., Damiola, F., & Schibler, U. (1998). A serum shock induces circadian gene expression in mammalian tissue culture cells. Cell, 93, 929–937.
Bao, A. M., Ji, Y. F., Van Someren, E. J., Hofman, M. A., Liu, R. Y., & Zhou, J. N. (2004). Diurnal rhythms of free estradiol and cortisol during the normal menstrual cycle in women with major depression. Hormones and Behavior, 45, 93–102. Barnea, A., & Nottebohm, F. (1994). Seasonal recruitment of hippocampal neurons in adult free-ranging black-capped chickadees. Proceedings of the National Academy of Sciences, USA, 91, 11217–11221. Barnea, A., & Nottebohm, F. (1996). Recruitment and replacement of hippocampal neurons in young and adult chickadees: An addition to the theory of hippocampal learning. Proceedings of the National Academy of Sciences, USA, 93, 714–718. Bartness, T. J., & Goldman, B. D. (1989). Mammalian pineal melatonin: A clock for all seasons. Experientia, 45, 939–945. Bartness, T. J., Goldman, B. D., & Bittman, E. L. (1991). SCN lesions block responses to systemic melatonin infusions in Siberian hamsters. American Journal of Physiology, 260(1, Pt. 2), R102–R112. Bartness, T. J., Powers, J. B., Hastings, M. H., Bittman, E. L., & Goldman, B. D. (1993). The timed infusion paradigm for melatonin delivery: What has it taught us about the melatonin signal, its reception, and the photoperiodic control of seasonal responses? Journal of Pineal Research, 15, 161–190. Bernardis, L. L., & Bellinger, L. L. (1998). The dorsomedial hypothalamic nucleus revisited: 1998 update. Proceedings of the Society of Experimental Biological Medicine, 218, 284–306. Berson, D. M., Dunn, F. A., & Takao, M. (2002). Phototransduction by retinal ganglion cells that set the circadian clock. Science, 295, 1070–1073. Bildt, C., & Michelsen, H. (2002). Gender differences in the effects from working conditions on mental health: A 4-year follow-up. International Archives of Occupational and Environmental Health, 75, 252–258. Bittman, E. L., Bartness, T. J., Goldman, B. D., & DeVries, G. J. (1991). Suprachiasmatic and paraventricular control of photoperiodism in Siberian hamsters. 
American Journal of Physiology, 260(1, Pt. 2), R90–R101. Bittman, E. L., Tubbiola, M. L., Foltz, G., & Hegarty, C. M. (1999). Effects of photoperiod and androgen on proopiomelanocortin gene expression in the arcuate nucleus of golden hamsters. Endocrinology, 140, 197–206. Bittman, E. L., & Weaver, D. R. (1990). The distribution of melatonin binding sites in neuroendocrine tissues of the ewe. Biology of Reproduction, 43, 986–993. Blank, J. L., Korytko, A. I., Freeman, D. A., & Ruf, T. P. (1994). Role of gonadal steroids and inhibitory photoperiod in regulating body weight and food intake in deer mice (Peromyscus maniculatus). Proceedings of the Society of Experimental Biological Medicine, 206, 396–403. Boulos, Z. Rosenwasser, A. M., & Terman, M. (1980). Feeding schedules and the circadian organization of behavior in the rat. Behavioural Brain Research, 1, 39–65. Brenowitz, E. A., & Beecher, M. D. (2005). Song learning in birds: Diversity and plasticity, opportunities and challenges. Trends in Neuroscience, 28, 127–132. Brenowitz, E. A., Lent, K., & Rubel, E. W. (2007). Auditory feedback and song production do not regulate seasonal growth of song control circuits in adult white-crowned sparrows. Journal of Neuroscience, 27, 6810–6814. Brewerton, T. D., & George, M. S. (1990). A study of the seasonal variation of migraine. Headache, 30, 511–513. Bronson, F. H. (1985). Mammalian reproduction: An ecological perspective. Biology of Reproduction, 32, 1–26. Bronson, F. H. (1988). Mammalian reproductive strategies: Genes, photoperiod and latitude. Reproduction and Nutritional Development, 28(2B), 335–347. Bronson, F. H. (1995). Seasonal variation in human reproduction: Environmental factors. Quarterly Review of Biology, 70, 141–164.
Bronson, F. H. (2004). Are humans seasonally photoperiodic? Journal of Biological Rhythms, 19, 180–192. Brown, E. N., & Czeisler, C. A. (1992). The statistical analysis of circadian phase and amplitude in constant-routine core-temperature data. Journal of Biological Rhythms, 7, 177–202. Buijs, R. M., Hermes, M. H., & Kalsbeek, A. (1998). The suprachiasmatic nucleus-paraventricular nucleus interactions: A bridge to the neuroendocrine and autonomic nervous system. Progress in Brain Research, 119, 365–382. Buijs, R. M., Pevet, P., Masson-Pevet, M., Pool, C. W., de Vries, G. J., Canguilhem, B., et al. (1986). Seasonal variation in vasopressin innervation in the brain of the European hamster (Cricetus cricetus). Brain Research, 371, 193–196. Buijs, R. M., van Eden, C. G., Goncharuk, V. D., & Kalsbeek, A. (2003). The biological clock tunes the organs of the body: Timing by hormones and the autonomic nervous system. Journal of Endocrinology, 177, 17–26. Caldwell, G. S., Glickman, S. E., & Smith, E. R. (1984). Seasonal aggression independent of seasonal testosterone in wood rats. Proceedings of the National Academy of Sciences, USA, 81, 5255–5257. Caldwell, H. K., & Albers, H. E. (2004). Effect of photoperiod on vasopressin-induced aggression in Syrian hamsters. Hormones and Behavior, 46, 444–449. Card, J. P. (2000). Pseudorabies virus and the functional architecture of the circadian timing system. Journal of Biological Rhythms, 15, 453–461. Carolei, A., Marini, C., De Matteis, G., Di Napoli, M., & Baldassarre, M. (1996). Seasonal incidence of stroke. Lancet, 347, 1702–1703. Carter, D. S., & Goldman, B. D. (1983). Antigonadal effects of timed melatonin infusion in pinealectomized male Djungarian hamsters (Phodopus sungorus sungorus): Duration is the critical parameter. Endocrinology, 113, 1261–1267. Cassone, V. M. (1990). Melatonin: Time in a bottle. Oxford Review of Reproductive Biology, 12, 319–367. Ceriani, M. F., Hogenesch, J. 
B., Yanovsky, M., Panda, S., Straume, M., & Kay, S. A. (2002). Genome-wide expression analysis in Drosophila reveals genes controlling circadian behavior. Journal of Neuroscience, 22, 9305–9319. Cheng, M. Y., Bittman, E. L., Hattar, S., & Zhou, Q. Y. (2005). Regulation of prokineticin 2 expression by light and the circadian clock. BMC Neuroscience, 6, 17. Cheng, M. Y., Bullock, C. M., Li, C., Lee, A. G., Bermak, J. C., Belluzzi, J., et al. (2002). Prokineticin 2 transmits the behavioural circadian rhythm of the suprachiasmatic nucleus. Nature, 417, 405–410. Cho, K. (2001). Chronic “jet lag” produces temporal lobe atrophy and spatial cognitive deficits. Nature Neuroscience, 4, 567–568. Cho, K., Ennaceur, A., Cole, J. C., & Suh, C. K. (2000). Chronic jet lag produces cognitive deficits. Journal of Neuroscience, 20, RC66. Claridge-Chang, A., Wijnen, H., Naef, F., Boothroyd, C., Rajewsky, N., & Young, M. W. (2001). Circadian regulation of gene expression systems in the Drosophila head. Neuron, 32, 657–671. Clifton, P. G., Rusk, I. N., & Cooper, S. J. (1991). Effects of dopamine D1 and dopamine D2 antagonists on the free feeding and drinking patterns of rats. Behavioral Neuroscience, 105, 272–281. Colquhoun, D. (1981). Rhythms in performance. In J. Aschoff (Ed.), Handbook of Behavioral Neurobiology, 4, 333–348. Conlon, M., Lightfoot, N., & Kreiger, N. (2007). Rotating shift work and risk of prostate cancer. Epidemiology, 18, 182–183. Costa, G. (1996). The impact of shift and night work on health. Applied Ergonomics, 27(1), 9–16. Currie, P. J., & Wilson, L. M. (1992). Yohimbine attenuates clonidine-induced feeding and macronutrient selection in genetically obese (ob/ob) mice. Pharmacology, Biochemistry, and Behavior, 43, 1039–1046. Currie, P. J., & Wilson, L. M. (1993). Potentiation of dark onset feeding in obese mice (genotype ob/ob) following central injection of norepinephrine and clonidine. European Journal of Pharmacology, 232(2/3), 227–234.
Davidson, A. J., Sellix, M. T., Daniel, J., Yamazaki, S., Menaker, M., & Block, G. D. (2006). Chronic jet-lag increases mortality in aged mice. Current Biology, 16, R914–R916. Davidson, A. J., Yamazaki, S., Arble, D. M., Menaker, M., & Block, G. D. (2006). Resetting of central and peripheral circadian oscillators in aged rats. Neurobiology of Aging, 29(3), 471–477.
De Koninck, J. (1991). [Biological rhythms associated with sleep and psychological adjustment]. Journal of Psychiatry and Neuroscience, 16, 115–122. DeCoursey, P. J., & Krulas, J. R. (1998). Behavior of SCN-lesioned chipmunks in natural habitat: A pilot study. Journal of Biological Rhythms, 13, 229–244. DeCoursey, P. J., Krulas, J. R., Mele, G., & Holley, D. C. (1997). Circadian performance of suprachiasmatic nuclei (SCN)-lesioned antelope ground squirrels in a desert enclosure. Physiology and Behavior, 62, 1099–1108. de la Iglesia, H. O., Meyer, J., Carpino, A., Jr., & Schwartz, W. J. (2000). Antiphase oscillation of the left and right suprachiasmatic nuclei. Science, 290, 799–801. de la Iglesia, H. O., Meyer, J., & Schwartz, W. J. (2003). Lateralization of circadian pacemaker output: Activation of left- and right-sided luteinizing hormone-releasing hormone neurons involves a neural rather than a humoral pathway. Journal of Neuroscience, 23, 7412–7414. Della Fera, M. A., & Baile, C. A. (1979). CCK-octapeptide injected in CSF causes satiety in sheep. Annals of Veterinary Research, 10(2/3), 234–236. DeMairan, J. J. (1729). Observation bontanique. [Botanical Observation]’ Academie de Royale Science, 35–36. Demas, G. E., Polacek, K. M., Durazzo, A., & Jasnow, A. M. (2004). Adrenal hormones mediate melatonin-induced increases in aggression in male Siberian hamsters (Phodopus sungorus). Hormones and Behavior, 46, 582–591. Dodd, A. N., Salathia, N., Hall, A., Kevei, E., Toth, R., Nagy, F., et al. (2005). Plant circadian clocks increase photosynthesis, growth, survival, and competitive advantage. Science, 309, 630–633. Drew, J. E., Barrett, P., Mercer, J. G., Moar, K. M., Canet, E., Delagrange, P., et al. (2001). Localization of the melatonin-related receptor in the rodent brain and peripheral tissues. Journal of Neuroendocrinology, 13, 453–458. Dube, M. G., Kalra, S. P., & Kalra, P. S. (1999). 
Food intake elicited by central administration of orexins/hypocretins: Identification of hypothalamic sites of action. Brain Research, 842, 473–477. Dubocovich, M. L., Rivera-Bermudez, M. A., Gerdin, M. J., & Masana, M. I. (2003). Molecular pharmacology, regulation and function of mammalian melatonin receptors. Frontiers in Bioscience, 8, D1093–D1108. Dunlap, J. C., Loros, J. J., & DeCoursey, P. J. (2004). Chronobiology: Biological timekeeping. Sunderland, MA: Sinauer. Earnest, D. J., & Sladek, C. D. (1987). Circadian vasopressin release from perifused rat suprachiasmatic explants in vitro: Effects of acute stimulation. Brain Research, 422, 398–402. Egli, M., Bertram, R., Sellix, M. T., & Freeman, M. E. (2004). Rhythmic secretion of prolactin in rats: Action of oxytocin coordinated by vasoactive intestinal polypeptide of suprachiasmatic nucleus origin. Endocrinology, 145(7), 3386–3394. el-Hajj Fuleihan, G., Klerman, E. B., Brown, E. N., Choe, Y., Brown, E. M., & Czeisler, C. A. (1997). The parathyroid hormone circadian rhythm is truly endogenous: A general clinical research center study. Journal of Clinical Endocrinology and Metabolism, 82, 281–286. Ellis, G. B., & Turek, F. W. (1983). Testosterone and photoperiod interact to regulate locomotor activity in male hamsters. Hormones and Behavior, 17, 66–75. Enns, M. W., Cox, B. J., Levitt, A. J., Levitan, R. D., Morehouse, R., Michalak, E. E., et al. (2006). Personality and seasonal affective
disorder: Results from the CAN-SAD study. Journal of Affective Disorders, 93(1/3), 35–42.
revisiting the Challenge Hypothesis. Hormones and Behavior, 51, 463–476.
Ferkin, M. H., & Gorman, M. R. (1992). Photoperiod and gonadal hormones influence odor preferences of the male meadow vole, Microtus pennsylvanicus. Physiology and Behavior, 51, 1087–1091.
Grandison, L., & Guidotti, A. (1977). Stimulation of food intake by muscimol and beta endorphin. Neuropharmacology, 16, 533–536.
Forger, N. G., & Breedlove, S. M. (1987). Seasonal variation in mammalian striated muscle mass and motoneuron morphology. Journal of Neurobiology, 18, 155–165. Foster, R. G., Argamaso, S., Coleman, S., Colwell, C. S., Lederman, A., & Provencio, I. (1993). Photoreceptors regulating circadian behavior: A mouse model. Journal of Biological Rhythms, 8(Suppl.), S17–S23. Foster, R. G., Hankins, M. W., & Peirson, S. N. (2007). Light, photoreceptors, and circadian clocks. Methods in Molecular Biology, 362, 3–28. Freedman, M. S., Lucas, R. J., Soni, B., von Schantz, M., Munoz, M., David-Gray, Z., et al. (1999). Regulation of mammalian circadian behavior by non-rod, non-cone, ocular photoreceptors. Science, 284, 502–504. Galea, L. A., Kavaliers, M., & Ossenkopp, K. P. (1996). Sexually dimorphic spatial learning in meadow voles Microtus pennsylvanicus and deer mice Peromyscus maniculatus. Journal of Experimental Biology, 199(Pt. 1), 195–200. Galea, L. A., Kavaliers, M., Ossenkopp, K. P., & Hampson, E. (1995). Gonadal hormone levels and spatial learning performance in the Morris water maze in male and female meadow voles, Microtus pennsylvanicus. Hormones and Behavior, 29, 106–125. Galea, L. A., Kavaliers, M., Ossenkopp, K. P., Innes, D., & Hargreaves, E. L. (1994). Sexually dimorphic spatial learning varies seasonally in two populations of deer mice. Brain Research, 635, 18–26. Galea, L. A., & McEwen, B. S. (1999). Sex and seasonal differences in the rate of cell proliferation in the dentate gyrus of adult wild meadow voles. Neuroscience, 89, 955–964. Ganguly, S., Coon, S. L., & Klein, D. C. (2002). Control of melatonin synthesis in the mammalian pineal gland: The critical role of serotonin acetylation. Cell and Tissue Research, 309, 127–137.
Green, D. J., & Gillette, R. (1982). Circadian rhythm of firing rate recorded from single cells in the rat suprachiasmatic brain slice. Brain Research, 245, 198–200. Green, J., Pollak, C. P., & Smith, G. P. (1987). The effect of desynchronization on meal patterns of humans living in time isolation. Physiology and Behavior, 39, 203–209. Greives, T. J., Mason, A. O., Scotti, M. A., Levine, J., Ketterson, E. D., Kriegsfeld, L. J., et al. (2007). Environmental control of kisspeptin: Implications for seasonal reproduction. Endocrinology, 148, 1158–1166. Grimes, L. J., Melnyk, R. B., Martin, J. M., & Mrosovsky, N. (1981). Infradian cycles in glucose utilization and lipogenic enzyme activity in dormouse (Glis glis) adipocytes. General and Comparative Endocrinology, 45, 21–25. Groos, G., & Hendriks, J. (1982). Circadian rhythms in electrical discharge of rat suprachiasmatic neurones recorded in vitro. Neuroscience Letters, 34, 283–288. Hakim, H., DeBernardo, A. P., & Silver, R. (1991). Circadian locomotor rhythms, but not photoperiodic responses, survive surgical isolation of the SCN in hamsters. Journal of Biological Rhythms, 6, 97–113. Halberg, F. (1959). Physiological 24-hour periodicity in human beings and mice, the lighting regimen and daily routine. In R. B. Withrow (Ed.), Photoperiodism and related phenomena in plants and animals (pp. 803–878). Washington, DC: American Association for the Advancement of Science. Hannibal, J., & Fahrenkrug, J. (2002). Melanopsin: A novel photopigment involved in the photoentrainment of the brain’s biological clock? Annals of Medicine, 34, 401–407.
Gao, Q., & Horvath, T. L. (2007). Neurobiology of feeding and energy expenditure. Annual Review of Neuroscience, 30, 367–398.
Hansen, J. (2006). Risk of breast cancer after night- and shift work: Current evidence and ongoing studies in Denmark. Cancer Causes Control, 17, 531–537.
Garrett, J. W., & Campbell, C. S. (1980). Changes in social behavior of the male golden hamster accompanying photoperiodic changes in reproduction. Hormones and Behavior, 14, 303–318.
Harmer, S. L., Hogenesch, J. B., Straume, M., Chang, H. S., Han, B., Zhu, T., et al. (2000). Orchestrated transcription of key pathways in Arabidopsis by the circadian clock. Science, 290, 2110–2113.
Gerhold, L. M., Horvath, T. L., & Freeman, M. E. (2001). Vasoactive intestinal peptide fibers innervate neuroendocrine dopaminergic neurons. Brain Research, 919, 48–56.
Hattar, S., Lucas, R. J., Mrosovsky, N., Thompson, S., Douglas, R. H., Hankins, M. W., et al. (2003). Melanopsin and rod-cone photoreceptive systems account for all major accessory visual functions in mice. Nature, 424, 76–81.
Gillette, M. U., & Reppert, S. M. (1987). The hypothalamic suprachiasmatic nuclei: Circadian patterns of vasopressin secretion and neuronal activity in vitro. Brain Research Bulletin, 19, 135–139. Goldman, B. D. (2001). Mammalian photoperiodic system: Formal properties and neuroendocrine mechanisms of photoperiodic time measurement. Journal of Biological Rhythms, 16, 283–301. Golombek, D. A., Pevet, P., & Cardinali, D. P. (1996). Melatonin effects on behavior: Possible mediation by the central GABAergic system. Neuroscience and Biobehavioral Reviews, 20, 403–412.
Hazlerigg, D. G., Andersson, H., Johnston, J. D., & Lincoln, G. A. (2004). Molecular characterization of the long-day response in the Soay sheep, a seasonal mammal. Current Biology, 14, 334–339. Hofman, M. A. (2004). The brain’s calendar: Neural mechanisms of seasonal timing. Biological Reviews of the Cambridge Philosophical Society, 79, 61–77. Hofman, M. A., & Swaab, D. F. (1995). Influence of aging on the seasonal rhythm of the vasopressin-expressing neurons in the human suprachiasmatic nucleus. Neurobiology of Aging, 16, 965–971.
Gomez, D. M., & Newman, S. W. (1991). Medial nucleus of the amygdala in the adult Syrian hamster: A quantitative Golgi analysis of gonadal hormonal regulation of neuronal morphology. Anatomical Record, 231, 498–509.
Horowitz, T. S., Cade, B. E., Wolfe, J. M., & Czeisler, C. A. (2003). Searching night and day: A dissociation of effects of circadian phase and time awake on visual selective attention and vigilance. Psychological Science, 14, 549–557.
Gorman, M. R., Ferkin, M. H., Nelson, R. J., & Zucker, I. (1993). Reproductive status influences odour preferences of the meadow vole (Microtus pennsylvanicus). Canadian Journal of Zoology, 71, 1748–1754.
Horvath, T. L., Cela, V., & Van der Beek, E. M. (1998). Gender-specific apposition between vasoactive intestinal peptide-containing axons and gonadotrophin-releasing hormone-producing neurons in the rat. Brain Research, 795, 277–281. Hoshooley, J. S., & Sherry, D. F. (2007). Greater hippocampal neuronal recruitment in food-storing than in non-food-storing birds. Developmental Neurobiology, 67, 406–414.
Gould, E., & Gross, C. G. (2002). Neurogenesis in adult mammals: Some progress and problems. Journal of Neuroscience, 22, 619–623. Goymann, W., Landys, M. M., & Wingfield, J. C. (2007). Distinguishing seasonal androgen responses from male-male androgen responsiveness-
Green, C. B., & Besharse, J. C. (1996). Identification of a novel vertebrate circadian clock-regulated gene encoding the protein nocturnin. Proceedings of the National Academy of Sciences, USA, 93, 14884–14888.
Hrushesky, W. J. (1990). Cancer chronotherapy: A drug delivery challenge. Progress in Clinical and Biological Research, 341A, 1–10. Hrushesky, W. J. (1993). Circadian cancer pharmacodynamics. Annali dell’Istituto Superiore di Sanita, 29, 705–710.
Kivimaki, M., Virtanen, M., Elovainio, M., Vaananen, A., Keltikangas-Jarvinen, L., & Vahtera, J. (2006). Prevalent cardiovascular disease, risk factors and selection out of shift work. Scandinavian Journal of Work and Environmental Health, 32, 204–208.
Hrushesky, W. J. (1995). Cancer chronotherapy: Is there a right time in the day to treat? Journal of Infusional Chemotherapy, 5, 38–43.
Klein, D. C. (1985). Photoneural regulation of the mammalian pineal gland. Ciba Found Symp, 117, 38–56.
Huang, L., DeVries, G. J., & Bittman, E. L. (1998). Photoperiod regulates neuronal bromodeoxyuridine labeling in the brain of a seasonally breeding mammal. Journal of Neurobiology, 36, 410–420.
Klein, D. C., & Moore, R. Y. (1979). Pineal N-acetyltransferase and hydroxyindole-O-methyltransferase: Control by the retinohypothalamic tract and the suprachiasmatic nucleus. Brain Research, 174, 245–262.
Hwang, S. Y., & Lee, J. H. (2005). Comparison of cardiovascular risk profile clusters among industrial workers. Taehan Kanho Hakhoe Chi, 35, 1500–1507. Illnerova, H., Hoffmann, K., & Vanecek, J. (1984). Adjustment of pineal melatonin and N-acetyltransferase rhythms to change from long to short photoperiod in the Djungarian hamster Phodopus sungorus. Neuroendocrinology, 38, 226–231.
Klein, D. C., Smoot, R., Weller, J. L., Higa, S., Markey, S. P., Creed, G. J., et al. (1983). Lesions of the paraventricular nucleus area of the hypothalamus disrupt the suprachiasmatic → spinal cord circuit in the melatonin rhythm generating system. Brain Research Bulletin, 10, 647–652.
Jacobs, L. F. (1996). The economy of winter: Phenotypic plasticity in behavior and brain structure. Biological Bulletin, 191, 92–100.
Knobil, E., & Hotchkiss, J. (1988). The menstrual cycle and its neuroendocrine control. In E. Knobil & J. D. Neill (Eds.), The physiology of reproduction (Vol. 2, pp. 1971–1994). New York: Raven Press.
Jacobs, L. F., Gaulin, S. J., Sherry, D. F., & Hoffman, G. E. (1990). Evolution of spatial cognition: Sex-specific patterns of spatial behavior predict hippocampal size. Proceedings of the National Academy of Sciences, USA, 87, 6349–6352.
Koda, S., Yasuda, N., Sugihara, Y., Ohara, H., Udo, H., Otani, T., et al. (2000). [Analyses of work-relatedness of health problems among truck drivers by questionnaire survey]. Sangyo Eiseigaku Zasshi, 42(1), 6–16.
Jasnow, A. M., Huhman, K. L., Bartness, T. J., & Demas, G. E. (2000). Short-day increases in aggression are inversely related to circulating testosterone concentrations in male Siberian hamsters (Phodopus sungorus). Hormones and Behavior, 38, 102–110.
Kolmodin-Hedman, B., & Swensson, A. (1975). Problems related to shift work: A field study of Swedish railroad workers with irregular work hours. Scandinavian Journal of Work and Environmental Health, 1, 254–262.
Johnston, J. D., Cagampang, F. R., Stirland, J. A., Carr, A. J., White, M. R., Davis, J. R., et al. (2003). Evidence for an endogenous per1- and ICER-independent seasonal timer in the hamster pituitary gland. FASEB Journal, 17, 810–815.
Kramer, A., Yang, F. C., Snodgrass, P., Li, X., Scammell, T. E., Davis, F. C., et al. (2001). Regulation of daily locomotor activity and sleep by hypothalamic EGF receptor signaling. Science, 294, 2511–2515.
Kalra, S. P., Dube, M. G., Pu, S., Xu, B., Horvath, T. L., & Kalra, P. S. (1999). Interacting appetite-regulating pathways in the hypothalamic regulation of body weight. Endocrine Reviews, 20, 68–100.
Kramer, C., Loros, J. J., Dunlap, J. C., & Crosthwaite, S. K. (2003). Role for antisense RNA in regulating circadian clock function in Neurospora crassa. Nature, 421, 948–952.
Kalsbeek, A., & Buijs, R. M. (2002). Output pathways of the mammalian suprachiasmatic nucleus: Coding circadian time by transmitter selection and specific targeting. Cell and Tissue Research, 309, 109–118.
Kriegsfeld, L. J., Dawson, T. M., Dawson, V. L., Nelson, R. J., & Snyder, S. H. (1997). Aggressive behavior in male mice lacking the gene for neuronal nitric oxide synthase requires testosterone. Brain Research, 769, 66–70.
Kalsbeek, A., Fliers, E., Franke, A. N., Wortel, J., & Buijs, R. M. (2000). Functional connections between the suprachiasmatic nucleus and the thyroid gland as revealed by lesioning and viral tracing techniques in the rat. Endocrinology, 141, 3832–3841.
Kriegsfeld, L. J., Korets, R., & Silver, R. (2003). Expression of the circadian clock gene Period 1 in neuroendocrine cells: An investigation using mice with a Per1:GFP transgene. European Journal of Neuroscience, 17, 212–220.
Kalsbeek, A., Teclemariam-Mesbah, R., & Pevet, P. (1993). Efferent projections of the suprachiasmatic nucleus in the golden hamster (Mesocricetus auratus). Journal of Comparative Neurology, 332, 293–314.
Kriegsfeld, L. J., Leak, R. K., Yackulic, C. B., LeSauter, J., & Silver, R. (2004). Organization of suprachiasmatic nucleus projections in Syrian hamsters (Mesocricetus auratus): An anterograde and retrograde analysis. Journal of Comparative Neurology, 468, 361–379.
Kalsbeek, A., van Heerikhuize, J. J., Wortel, J., & Buijs, R. M. (1996). A diurnal rhythm of stimulatory input to the hypothalamo-pituitary-adrenal system as revealed by timed intrahypothalamic administration of the vasopressin V1 antagonist. Journal of Neuroscience, 16, 5555–5565. Kanabrocki, E. L., Hermida, R. C., Torossov, M., Haseman, M. B., Bettis, K., Young, R. M., et al. (2006). Chronotherapy of ovarian cancer: Effect on blood variables and serum cytokines. A case report. La Clinica Terapeutica, 157, 349–354. Karlsson, B., Alfredsson, L., Knutsson, A., Andersson, E., & Toren, K. (2005). Total mortality and cause-specific mortality of Swedish shift- and dayworkers in the pulp and paper industry in 1952–2001. Scandinavian Journal of Work and Environmental Health, 31, 30–35. Khalsa, S. B. S., Jewett, M. E., Duffy, J. F., & Czeisler, C. A. (2000). The timing of the human circadian clock is accurately represented by the core body temperature rhythm following phase shifts to a three-cycle light stimulus near the critical zone. Journal of Biological Rhythms, 15, 524–530. Kissileff, H. R. (1970). Free feeding in normal and “recovered lateral” rats monitored by a pellet-detecting eatometer. Physiology and Behavior, 5, 163–173.
Kriegsfeld, L. J., & Silver, R. (2006). The regulation of neuroendocrine function: Timing is everything. Hormones and Behavior, 49, 557–574. Kriegsfeld, L. J., Silver, R., Gore, A. C., & Crews, D. (2002). Vasoactive intestinal polypeptide contacts on gonadotropin-releasing hormone neurones increase following puberty in female rats. Journal of Neuroendocrinology, 14, 685–690. Kubo, T., Ozasa, K., Mikami, K., Wakai, K., Fujino, Y., Watanabe, Y., et al. (2006). Prospective cohort study of the risk of prostate cancer among rotating-shift workers: Findings from the Japan collaborative cohort study. American Journal of Epidemiology, 164, 549–555. Kuriyama, K., Uchiyama, M., Suzuki, H., Tagaya, H., Ozaki, A., Aritake, S., et al. (2003). Circadian fluctuation of time perception in healthy human subjects. Neuroscience Research, 46(1), 23–31. Larsen, P. J., Enquist, L. W., & Card, J. P. (1998). Characterization of the multisynaptic neuronal control of the rat pineal gland using viral transneuronal tracing. European Journal of Neuroscience, 10, 128–145. Lavenex, P., Steele, M. A., & Jacobs, L. F. (2000). Sex differences, but no seasonal variations in the hippocampus of food-caching squirrels: A stereological study. Journal of Comparative Neurology, 425, 152–166.
Lavery, D. J., Lopez-Molina, L., Margueron, R., Fleury-Olela, F., Conquet, F., Schibler, U., et al. (1999). Circadian expression of the steroid 15 alpha-hydroxylase (Cyp2a4) and coumarin 7-hydroxylase (Cyp2a5) genes in mouse liver is regulated by the PAR leucine zipper transcription factor DBP. Molecular and Cellular Biology, 19, 6488–6499. Leak, R. K., & Moore, R. Y. (2001). Topographic organization of suprachiasmatic nucleus projection neurons. Journal of Comparative Neurology, 433, 312–334.
London, S. E., Monks, D. A., Wade, J., & Schlinger, B. A. (2006). Widespread capacity for steroid synthesis in the avian brain and song system. Endocrinology, 147, 5975–5987.
Lee, R., & Balick, M. J. (2006). Chronobiology: It’s about time. Explore, 2, 442–445.
Lowrey, P. L., Shimomura, K., Antoch, M. P., Yamazaki, S., Zemenides, P. D., Ralph, M. R., et al. (2000). Positional syntenic cloning and functional characterization of the mammalian circadian mutation tau. Science, 288, 483–492.
Lehman, M. N., Goodman, R. L., Karsch, F. J., Jackson, G. L., Berriman, S. J., & Jansen, H. T. (1997). The GnRH system of seasonal breeders: Anatomy and plasticity. Brain Research Bulletin, 44, 445–457.
Lucas, R. J., Freedman, M. S., Munoz, M., Garcia-Fernandez, J. M., & Foster, R. G. (1999). Regulation of the mammalian pineal by non-rod, non-cone, ocular photoreceptors. Science, 284, 505–507.
Lehman, M. N., Silver, R., Gladstone, W. R., Kahn, R. M., Gibson, M., & Bittman, E. L. (1987). Circadian rhythmicity restored by neural transplant. Immunocytochemical characterization of the graft and its integration with the host brain. Journal of Neuroscience, 7, 1626–1638.
Lucas, R. J., Hattar, S., Takao, M., Berson, D. M., Foster, R. G., & Yau, K. W. (2003). Diminished pupillary light reflex at high irradiances in melanopsin-knockout mice. Science, 299, 245–247.
Leibowitz, S. F., Weiss, G. F., Walsh, U. A., & Viswanath, D. (1989). Medial hypothalamic serotonin: Role in circadian patterns of feeding and macronutrient selection. Brain Research, 503, 132–140. Leonard, C., Fanning, N., Attwood, J., & Buckley, M. (1998). The effect of fatigue, sleep deprivation and onerous working hours on the physical and mental wellbeing of pre-registration house officers. Irish Journal of Medical Science, 167(1), 22–25.
Madison, D. M. (1984). Group nesting and its ecological and evolutionary significance in overwintering microtine rodents. Bulletin of the Carnegie Museum of Natural History, 10, 267–274. Mason, A. O., Greives, T. J., Scotti, M. A., Levine, J., Frommeyer, S., Ketterson, E. D., et al. (2007). Suppression of kisspeptin expression and gonadotropic axis sensitivity following exposure to inhibitory day lengths in female Siberian hamsters. Hormones and Behavior, 52, 492–498.
Levi, F. (1994). Chronotherapy of cancer: Biological basis and clinical application. Pathologie et Biologie, 42, 338–341.
Maywood, E. S., Bittman, E. L., & Hastings, M. H. (1996). Lesions of the melatonin- and androgen-responsive tissue of the dorsomedial nucleus of the hypothalamus block the gonadal response of male Syrian hamsters to programmed infusions of melatonin. Biology of Reproduction, 54, 470–477.
Levi, F. (2001). Circadian chronotherapy for human cancers. Lancet Oncology, 2, 307–315.
McDonald, M. J., & Rosbash, M. (2001). Microarray analysis and organization of circadian gene expression in Drosophila. Cell, 107, 567–578.
Levi, F. (2002). From circadian rhythms to cancer chronotherapeutics. Chronobiology International, 19, 1–19.
McShea, W. J. (1990). Social tolerance and proximate mechanisms of dispersal in winter groups of meadow voles (Microtus pennsylvanicus). Animal Behaviour, 39, 346–351.
Levi, F. (1987). [Chronobiology and cancer]. Pathologie et Biologie, 35, 960–968.
Lewis, D., Freeman, D. A., Dark, J., Wynne-Edwards, K. E., & Zucker, I. (2002). Photoperiodic control of oestrous cycles in Syrian hamsters: Mediation by the mediobasal hypothalamus. Journal of Neuroendocrinology, 14, 294–299.
Melnyk, R. B., Mrosovsky, N., & Martin, J. M. (1983a). Spontaneous obesity and weight loss: Insulin action in the dormouse. American Journal of Physiology, 245, R396–R402.
Lewy, A. J., Lefler, B. J., Emens, J. S., & Bauer, V. K. (2006). The circadian basis of winter depression. Proceedings of the National Academy of Sciences, USA, 103, 7414–7419.
Melnyk, R. B., Mrosovsky, N., & Martin, J. M. (1983b). Spontaneous obesity and weight loss: Insulin binding and lipogenesis in the dormouse. American Journal of Physiology, 245, R403–R407.
Li, X. C., Jarvis, E. D., Alvarez-Borda, B., Lim, D. A., & Nottebohm, F. (2000). A relationship between behavior, neurotrophin expression, and new neuron survival. Proceedings of the National Academy of Sciences, USA, 97, 8584–8589.
Messager, S., Hazlerigg, D. G., Mercer, J. G., & Morgan, P. J. (2000). Photoperiod differentially regulates the expression of Per1 and ICER in the pars tuberalis and the suprachiasmatic nucleus of the Siberian hamster. European Journal of Neuroscience, 12, 2865–2870.
Lincoln, G. A., Andersson, H., & Loudon, A. (2003). Clock genes in calendar cells as the basis of annual timekeeping in mammals: A unifying hypothesis. Journal of Endocrinology, 179, 1–13.
Meyer-Bernstein, E. L., Jetton, A. E., Matsumoto, S. I., Markuns, J. F., Lehman, M. N., & Bittman, E. L. (1999). Effects of suprachiasmatic transplants on circadian rhythms of neuroendocrine function in golden hamsters. Endocrinology, 140, 207–218.
Lincoln, G. A., & Clarke, I. J. (1997). Refractoriness to a static melatonin signal develops in the pituitary gland for the control of prolactin secretion in the ram. Biology of Reproduction, 57, 460–467. Lincoln, G. A., Johnston, J. D., Andersson, H., Wagner, G., & Hazlerigg, D. G. (2005). Photorefractoriness in mammals: Dissociating a seasonal timer from the circadian-based photoperiod response. Endocrinology, 146, 3782–3790. Lincoln, G. A., Messager, S., Andersson, H., & Hazlerigg, D. (2002). Temporal expression of seven clock genes in the suprachiasmatic nucleus and the pars tuberalis of the sheep: Evidence for an internal coincidence timer. Proceedings of the National Academy of Sciences, USA, 99, 13890–13895. Liu, Q. S., Li, J. Y., & Wang, D. H. (2007). Ultradian rhythms and the nutritional importance of caecotrophy in captive Brandt’s voles (Lasiopodomys brandtii). Journal of Comparative Physiology. Series B, 177, 423–432.
Lloyd, D., Lemar, K. M., Salgado, L. E., Gould, T. M., & Murray, D. B. (2003). Respiratory oscillations in yeast: Mitochondrial reactive oxygen species, apoptosis and time; a hypothesis. FEMS Yeast Research, 3, 333–339.
Mink, J. W., Blumenschine, R. J., & Adams, D. B. (1981). Ratio of central nervous system to body metabolism in vertebrates: Its constancy and functional basis. American Journal of Physiology, 241, R203–R212. Mistlberger, R. E., & Skene, D. J. (2004). Social influences on mammalian circadian rhythms: Animal and human studies. Biological Reviews of the Cambridge Philosophical Society, 79, 533–556. Mistlberger, R. E., & Skene, D. J. (2005). Nonphotic entrainment in humans? Journal of Biological Rhythms, 20, 339–352. Monnikes, H., Heymann-Monnikes, I., & Tache, Y. (1992). CRF in the paraventricular nucleus of the hypothalamus induces dose-related behavioral profile in rats. Brain Research, 574, 70–76. Moore, R. Y., & Eichler, V. B. (1972). Loss of a circadian adrenal corticosterone rhythm following suprachiasmatic lesions in the rat. Brain Research, 42, 201–206.
Moore, R. Y., & Klein, D. C. (1974). Visual pathways and the central neural control of a circadian rhythm in pineal serotonin N-acetyltransferase activity. Brain Research, 71, 17–33. Mori, T., Nagai, K., Hara, M., & Nakagawa, H. (1985). Time-dependent effect of insulin in suprachiasmatic nucleus on blood glucose. American Journal of Physiology, 249(1 Pt 2), R23–R30. Morikawa, Y., Nakagawa, H., Miura, K., Soyama, Y., Ishizaki, M., Kido, T., et al. (2005). Shift work and the risk of diabetes mellitus among Japanese male factory workers. Scandinavian Journal of Work and Environmental Health, 31, 179–183.
Ouyang, Y., Andersson, C. R., Kondo, T., Golden, S. S., & Johnson, C. H. (1998). Resonating circadian clocks enhance fitness in cyanobacteria. Proceedings of the National Academy of Sciences, USA, 95, 8660–8664. Panda, S., Provencio, I., Tu, D. C., Pires, S. S., Rollag, M. D., Castrucci, A. M., et al. (2003). Melanopsin is required for non-image-forming photic responses in blind mice. Science, 301, 525–527. Panda, S., Sato, T. K., Castrucci, A. M., Rollag, M. D., DeGrip, W. J., Hogenesch, J. B., et al. (2002). Melanopsin (Opn4) requirement for normal light-induced circadian phase shifting. Science, 298, 2213–2216.
Morin, L. P., & Allen, C. N. (2006). The circadian visual system, 2005. Brain Research Reviews, 51, 1–60.
Patel, D. (2006). Shift work, light at night and risk of breast cancer. Occupational Medicine, 56, 433.
Morin, L. P., Goodless-Sanchez, N., Smale, L., & Moore, R. Y. (1994). Projections of the suprachiasmatic nuclei, subparaventricular zone and retrochiasmatic area in the golden hamster. Neuroscience, 61, 391–410.
Perrot-Sinal, T. S., Kavaliers, M., & Ossenkopp, K. P. (1998). Spatial learning and hippocampal volume in male deer mice: Relations to age, testosterone and adrenal gland weight. Neuroscience, 86, 1089–1099.
Mosko, S. S., & Moore, R. Y. (1979). Neonatal suprachiasmatic nucleus lesions: Effects on the development of circadian rhythms in the rat. Brain Research, 164, 17–38.
Poole, C. J., Wright, A. D., & Nattrass, M. (1992). Control of diabetes mellitus in shift workers. British Journal of Industrial Medicine, 49, 513–515.
Munakata, M., Ichi, S., Nunokawa, T., Saito, Y., Ito, N., Fukudo, S., et al. (2001). Influence of night shift work on psychologic state and cardiovascular and neuroendocrine responses in healthy nurses. Hypertension Research, 24(1), 25–31. Musshoff, U., Riewenherm, D., Berger, E., Fauteck, J. D., & Speckmann, E. J. (2002). Melatonin receptors in rat hippocampus: Molecular and functional investigations. Hippocampus, 12(2), 165–173. Nagai, K., Nagai, N., Sugahara, K., Niijima, A., & Nakagawa, H. (1994). Circadian rhythms and energy metabolism with special reference to the suprachiasmatic nucleus. Neuroscience and Biobehavioral Reviews, 18, 579–584. Nagai, K., Nishio, T., Nakagawa, H., Nakamura, S., & Fukuda, Y. (1978). Effect of bilateral lesions of the suprachiasmatic nuclei on the circadian rhythm of food-intake. Brain Research, 142, 384–389. Nagoshi, E., Saini, C., Bauer, C., Laroche, T., Naef, F., & Schibler, U. (2004). Circadian gene expression in individual fibroblasts: Cell-autonomous and self-sustained oscillators pass time to daughter cells. Cell, 119, 693–705. Nelson, R. J. (1987). Photoperiod-nonresponsive morphs: A possible variable in microtine population density fluctuations. American Naturalist, 130, 350–369.
Popov, V. I., Bocharova, L. S., & Bragin, A. G. (1992). Repeated changes of dendritic morphology in the hippocampus of ground squirrels in the course of hibernation. Neuroscience, 48, 45–51. Pravosudov, V. V., & Clayton, N. S. (2002). A test of the adaptive specialization hypothesis: Population differences in caching, memory, and the hippocampus in black-capped chickadees (Poecile atricapilla). Behavioral Neuroscience, 116, 515–522. Prendergast, B. J., Kriegsfeld, L. J., & Nelson, R. J. (2001). Photoperiodic polyphenisms in rodents: Neuroendocrine mechanisms, costs, and functions. Quarterly Review of Biology, 76, 293–325. Pyter, L. M., Reader, B. F., & Nelson, R. J. (2005). Short photoperiods impair spatial learning and alter hippocampal dendritic morphology in adult male white-footed mice (Peromyscus leucopus). Journal of Neuroscience, 25, 4521–4526. Pyter, L. M., Trainor, B. C., & Nelson, R. J. (2006). Testosterone and photoperiod interact to affect spatial learning and memory in adult male white-footed mice (Peromyscus leucopus). European Journal of Neuroscience, 23, 3056–3062. Ralph, M. R., Foster, R. G., Davis, F. C., & Menaker, M. (1990). Transplanted suprachiasmatic nucleus determines circadian period. Science, 247, 975–978.
Nelson, R. J., Badura, L. L., & Goldman, B. D. (1990). Mechanisms of seasonal cycles of behavior. Annual Review of Psychology, 41, 81–108.
Ralph, M. R., & Menaker, M. (1988). A mutation of the circadian system in golden hamsters. Science, 241, 1225–1227.
Nelson, R. J., & Chiavegatto, S. (2001). Molecular basis of aggression. Trends in Neurosciences, 24, 713–719.
Rasika, S., Alvarez-Buylla, A., & Nottebohm, F. (1999). BDNF mediates the effects of testosterone on the survival of new neurons in an adult brain. Neuron, 22, 53–62.
Nelson, R. J., Demas, G. E., Huang, P. L., Fishman, M. C., Dawson, V. L., Dawson, T. M., et al. (1995). Behavioural abnormalities in male mice lacking neuronal nitric oxide synthase. Nature, 378, 383–386. Nelson, R. J., & Trainor, B. C. (2007). Neural mechanisms of aggression. Nature Reviews Neuroscience, 8, 536–546. Newman, G. C., & Hospod, F. E. (1986). Rhythm of suprachiasmatic nucleus 2-deoxyglucose uptake in vitro. Brain Research, 381, 345–350. Nicholson, C. (1999). Signals that go with the flow. Trends in Neurosciences, 22(4), 143–145. Nottebohm, F. (1981). A brain for all seasons: Cyclical anatomical changes in song control nuclei of the canary brain. Science, 214, 1368–1370. Nottebohm, F. (2005). The neural basis of birdsong. PLoS Biology, 3, E164. Nunez, A. A., & Stephan, F. K. (1977). The effects of hypothalamic knife cuts on drinking rhythms and the estrus cycle of the rat. Behavioral Biology, 20, 224–234. O’Leary, E. S., Schoenfeld, E. R., Stevens, R. G., Kabat, G. C., Henderson, K., Grimson, R., et al. (2006). Shift work, light at night, and breast cancer on Long Island, New York. American Journal of Epidemiology, 164, 358–366.
Reddy, A. B., Karp, N. A., Maywood, E. S., Sage, E. A., Deery, M., O’Neill, J. S., et al. (2006). Circadian orchestration of the hepatic proteome. Current Biology, 16, 1107–1115. Refinetti, R., & Menaker, M. (1992). The circadian rhythm of body temperature. Physiology and Behavior, 51, 613–637. Reiter, R. J., & Tan, D. X. (2002). Role of CSF in the transport of melatonin. Journal of Pineal Research, 33, 61. Revel, F. G., Ansel, L., Klosen, P., Saboureau, M., Pevet, P., Mikkelsen, J. D., et al. (2007). Kisspeptin: A key link to seasonal breeding. Reviews in Endocrine and Metabolic Disorders, 8(1), 57–65. Revel, F. G., Saboureau, M., Masson-Pevet, M., Pevet, P., Mikkelsen, J. D., & Simonneaux, V. (2006). Kisspeptin mediates the photoperiodic control of reproduction in hamsters. Current Biology, 16, 1730–1735. Ripperger, J. A., & Schibler, U. (2006). Rhythmic CLOCK-BMAL1 binding to multiple E-box motifs drives circadian Dbp transcription and chromatin transitions. Nature Genetics, 38, 369–374. Robinson, N., Yateman, N. A., Protopapa, L. E., & Bush, L. (1990). Employment problems and diabetes. Diabetic Medicine, 7(1), 16–22.
8/18/09 5:11:37 PM
References 79 Roenneberg, T., Daan, S., & Merrow, M. (2003). The art of entrainment. Journal of Biological Rhythms, 18, 183–194. Roenneberg, T., & Merrow, M. (2002). “What watch? . . such much!” Complexity and evolution of circadian clocks. Cell Tissue Research, 309, 3–9. Rosenthal, N. E., Sack, D. A., Gillin, J. C., Lewy, A. J., Goodwin, F. K., Davenport, Y., et al. (1984). Seasonal affective disorder. A description of the syndrome and preliminary findings with light therapy. Archives of General Psychiatry, 41(1), 72–80. Roth, T. C., & Pravosudov, V. V. (2008). Hippocampal volumes and neuron numbers increase along a gradient of environmental harshness: A large-scale comparison. Proceedings of the Royal Society B, Published online doi:10.1098/rsbp2008.1184. Rowland, N. (1976). Endogenous circadian rhythms in rats recovered from lateral hypothalamic lesions. Physiological Behavior, 16, 257–266. Ruby, N. F., Brennan, T. J., Xie, X., Cao, V., Franken, P., Heller, H. C., et al. (2002). Role of melanopsin in circadian responses to light. Science, 298, 2211–2213. Rusak, B., & Zucker, I. (1979). Neural regulation of circadian rhythms. Physiological Review, 59, 449–526.
Sills, T. L., & Vaccarino, F. J. (1996). Individual differences in the feeding response to CCKB antagonists: Role of the nucleus accumbens. Peptides, 17, 593–599. Silver, R., Lehman, M. N., Gibson, M., Gladstone, W. R., & Bittman, E. L. (1990). Dispersed cell suspensions of fetal SCN restore circadian rhythmicity in SCN-lesioned adult hamsters. Brain Research, 525, 45–58. Silver, R., LeSauter, J., Tresco, P. A., & Lehman, M. N. (1996). A diffusible coupling signal from the transplanted suprachiasmatic nucleus controlling circadian locomotor rhythms. Nature, 382, 810–813. Simansky, K. J. (1996). Serotonergic control of the organization of feeding and satiety. Behavioral Brain Research, 73(1–2), 37–42. Skinner, D. C., & Caraty, A. (2002). Measurement and possible function of GnRH in cerebrospinal fluid in ewes. Reproduction Supplement, 59, 25–39.
Sahu, A. (2004). Minireview: A hypothalamic role in energy balance with special emphasis on leptin. Endocrinology, 145, 2613–2620.
Skinner, D. C., & Malpaux, B. (1999). High melatonin concentrations in third ventricular cerebrospinal fluid are not due to Galen vein blood recirculating through the choroid plexus. Endocrinology, 140, 4399–4405.
Sanborn, C., Currie, A. C., & Bailey, C. C. (1982). Shift work: How to adjust patterns of diabetes care. Occupational Health Nurses, 30(12), 25–28.
Skipper, J. K., Jr., Jung, F. D., & Coffey, L. C. (1990). Nurses and shiftwork: Effects on physical health and mental depression. Journal of Advanced Nursing, 15, 835–842.
Satoh, N., Ogawa, Y., Katsuura, G., Hayase, M., Tsuji, T., Imagawa, K., et al. (1997). The arcuate nucleus as a primary site of satiety effect of leption in rats. Neuroscience Letter, 224(3), 149–152.
Smith, G. P., & Gibbs, J. (1985). The satiety effect of cholecystokin. Recent progress and current problems. Annals of the New York Academy of Sciences, 448, 417–423.
Schick, R. R., Samsami, S., Zimmermann, J. P., Eberl, T., Endres, C., Schusdziarra, V., et al. (1993). Effect of galanin on food intake in rats: Involvement of lateral and ventromedial hypothalamic sites. American Journal of Physiology, 264(2 Pt 2), R355–R361.
Smulders, T. V., Sasson, A. D., & DeVoogd, T. J. (1995). Seasonal variation in hippocampal volume in a food-storing bird, the black-capped chickadee. Journal of Neurobiology, 27, 15–25.
Schlinger, B. A., Soma, K. K., & London, S. E. (2001). Neurosteroids and brain sexual differentiation. Trends in Neuroscience, 24, 429–431. Schulz, H., & Lavie, P. (1985). Ultradian rhythms in physiology and behavior. Berlin, Germany: Springer Verlag. Scott, C. J., Jansen, H. T., Kao, C. C., Kuehl, D. E., & Jackson, G. L. (1995). Disruption of reproductive rhythms and patterns of melatonin and prolactin secretion following bilateral lesions of the suprachiasmatic nuclei in the ewe. Journal of Neuroendocrinology, 7, 429–443. Scotti, M.-A., L., Belén, J., Jackson, J.E., & Demas, G.E. (2008). The role of androgens in the mediation of seasonal territorial aggression in male Siberian hamsters (Phodopus sungorus), Physiology & Behavior, 95(5), 633–640. Segawa, K., Nakazawa, S., Tsukamoto, Y., Kurita, Y., Goto, H., Fukui, A., et al. (1987). Peptic ulcer is prevalent among shift workers. Digestive Diseases and Sciences, 32, 449–453. Sherry, D. F., Forbes, M. R., Khurgel, M., & Ivy, G. O. (1993). Females have a larger hippocampus than males in the brood-parasitic brownheaded cowbird. Proceedings of the National Academy of Sciences, USA, 90, 7839–7843. Sherry, D. F., Jacobs, L. F., & Gaulin, S. J. (1992). Spatial memory and adaptive specialization of the hippocampus. Trends in Neuroscience, 15, 298–303. Shibata, S., Oomura, Y., Kita, H., & Hattori, K. (1982). Circadian rhythmic changes of neuronal activity in the suprachiasmatic nucleus of the rat hypothalamic slice. Brain Research, 247, 154–158. Shinohara, K., Honma, S., Katsuno, Y., Abe, H., & Honma, K. (1994). Circadian rhythms in the release of vasoactive intestinal polypeptide and arginine-vasopressin in organotypic slice culture of rat suprachiasmatic nucleus. Neuroscience Letter, 170, 183–186.
c04.indd Sec4:79
Sigmon, S. T., Pells, J. J., Schartel, J. G., Hermann, B. A., Edenfield, T. M., LaMattina, S. M., et al. (2007). Stress reactivity and coping in seasonal and nonseasonal depression. Behavioral Research Therapy, 45, 965–975.
Soma, K. K., Alday, N. A., Hau, M., & Schlinger, B. A. (2004). Dehydroepiandrosterone metabolism by 3beta-hydroxysteroid dehydrogenase/Delta5-Delta4 isomerase in adult zebra finch brain: Sex difference and rapid effect of stress. Endocrinology, 145, 1668–1677. Stanley, B. G., Daniel, D. R., Chin, A. S., & Leibowitz, S. F. (1985). Paraventricular nucleus injections of peptide, Y. Y., and neuropeptide Y preferentially enhance carbohydrate ingestion. Peptides, 6, 1205–1211. Stanley, B. G., Ha, L. H., Spears, L. C., & Dee, M. G. II. (1993). Lateral hypothalamic injections of glutamate, kainic acid, D., L-alpha-amino3-hydroxy-5-methyl-isoxazole propionic acid or N-methyl-D-aspartic acid rapidly elicit intense transient eating in rats. Brain Research, 613, 88–95. Stanley, B. G., & Leibowitz, S. F. (1985). Neuropeptide Y injected in the paraventricular hypothalamus: A powerful stimulant of feeding behavior. Proceedings of the National Academy of Sciences, USA, 82, 3940–3943. Stephan, F. K., Berkley, K. J., & Moss, R. L. (1981). Efferent connections of the rat suprachiasmatic nucleus. Neuroscience, 6, 2625–2641. Stephan, F. K., & Zucker, I. (1972). Rat drinking rhythms: Central visual pathways and endocrine factors mediating responsiveness to environmental illumination. Physiological Behavior, 8, 315–326. Stokkan, K. A., Yamazaki, S., Tei, H., Sakaki, Y., & Menaker, M. (2001). Entrainment of the circadian clock in the liver by feeding. Science, 291, 490–493. Storch, K. F., Lipan, O., Leykin, I., Viswanathan, N., Davis, F. C., Wong, W. H., et al. (2002). Extensive and divergent circadian gene expression in liver and heart. Nature, 417, 78–83. Swann, J. M., & Turek, F. W. (1985). Multiple circadian oscillators regulate the timing of behavioral and endocrine rhythms in female golden hamsters. Science, 228, 898–900.
8/18/09 5:11:37 PM
80
Biological Rhythms
Swann, J. M., & Turek, F. W. (1988). Transfer from long to short days reduces the frequency of pulsatile luteinizing hormone release in intact but not in castrated male golden hamsters. Neuroendocrinology, 47, 343–349.
Vasudevan, N., & Pfaff, D. W. (2007). Membrane-initiated actions of estrogens in neuroendocrinology: Emerging principles. Endocrine Reviews, 28, 1–19.
Takimoto, C. H. (2006). Chronomodulated chemotherapy for colorectal cancer: Failing the test of time? European Journal of Cancer, 42, 574–581.
Venuta, M., Barzaghi, L., Cavalieri, C., Gamberoni, T., & Guaraldi, G. P. (1999). [Effects of shift work on the quality of sleep and psychological health based on a sample of professional nurses]. Giornale Italiano di Medicina del Lavoro ed Ergonomia, 21, 221–225.
Tessonneaud, A., Locatelli, A., Caldani, M., & Viguier-Martinez, M. C. (1995). Bilateral lesions of the suprachiasmatic nuclei alter the nocturnal melatonin secretion in sheep. Journal of Neuroendocrinology, 7, 145–152.
Vielhaber, E., Eide, E., Rivers, A., Gao, Z. H., & Virshup, D. M. (2000). Nuclear entry of the circadian regulator mPER1 is controlled by mammalian casein kinase I epsilon. Molecular and Cellular Biology, 20, 4888–4899.
Thrun, L. A., Moenter, S. M., O’Callaghan, D., Woodfill, C. J., & Karsch, F. J. (1995). Circannual alterations in the circadian rhythm of melatonin secretion. Journal of Biological Rhythms, 10, 42–54.
Vitaterna, M. H., Takahashi, J. S., & Turek, F. W. (2001). Overview of circadian rhythms. Alcohol Research and Health, 25(2), 85–93.
Toh, K. L., Jones, C. R., He, Y., Eide, E. J., Hinz, W. A., Virshup, D. M., et al. (2001). An hPer2 phosphorylation site mutation in familial advanced sleep phase syndrome. Science, 291, 1040–1043. Trainor, B. C., Lin, S., Finy, M. S., Rowland, M. R., & Nelson, R. J. (2007). Photoperiod reverses the effects of estrogens on male aggression via genomic and nongenomic pathways. Proceedings of the National Academy of Sciences, USA, 104, 9840–9845. Tramontin, A. D., & Brenowitz, E. A. (2000). Seasonal plasticity in the adult brain. Trends in Neuroscience, 23, 251–258. Tricoire, H., Moller, M., Chemineau, P., & Malpaux, B. (2003). Origin of cerebrospinal fluid melatonin and possible function in the integration of photoperiod. Reproductive Suppl, 61, 311–321. Turek, F. W. (1977). The interaction of the photoperiod and testosterone in regulating serum gonadotropin levels in castrated male hamsters. Endocrinology, 101, 1210–1215. Turek, F. W., Joshu, C., Kohsaka, A., Lin, E., Ivanova, G., McDearmon, E., et al. (2005). Obesity and metabolic syndrome in circadian clock mutant mice. Science, 308. 1043–1045. Turek, F. W., & Zee, P. C. (1999). Regulation of sleep and circadian rhythms. In C. Lenfant (Ed.), Lung Biology in Health and Disease, (pp. 1–19) New York: Informa Health Care. Ueda, H. R., Hayashi, S., Chen, W., Sano, M., Machida, M., Shigeyoshi, Y., et al. (2005). System-level identification of transcriptional circuits underlying mammalian circadian clocks. Nature Genetics, 37(2), 187–192. Van Cauter, E., & Refetoff, S. (1985). Multifactorial control of the 24-hour secretory profiles of pituitary hormones. Journal of Endocrinological Investigation, 8, 381–391. Van den Pol. A. N., & Powley, T. (1979). A fine-grained anatomical analysis of the role of the rat suprachiasmatic nucleus in circadian rhythms of feeding and drinking. Brain Research, 160, 307–326. Van der Beek, E. M., Horvath, T. L., Wiegant, V. M., Van den Hurk, R., & Buijs, R. M. (1997). 
Evidence for a direct neuronal pathway from the suprachiasmatic nucleus to the gonadotropin-releasing hormone system: Combined tracing and light and electron microscopic immunocytochemical studies. Journal of Comparative Neurology, 384, 569–579. Van der Beek, E. M., Wiegant, V. M., van der Donk, H. A., van den Hurk, R., & Buijs, R. M. (1993). Lesions of the suprachiasmatic nucleus indicate the presence of a direct vasoactive intestinal polypeptide-containing projection to gonadotrophin-releasing hormone neurons in the female rat. Journal of Neuroendocrinology, 5, 137–144. Van der Veen, D. R., Minh, N. L., Gos, P., Arneric, M., Gerkema, M. P., & Schibler, U. (2006). Impact of behavior on central and peripheral circadian clocks in the common vole Microtus arvalis, a mammal with ultradian rhythms. Proceedings of the National Academy of Sciences, USA, 103, 3393–3398. Van Gelder, R. N. (2001). Non-visual ocular photoreception. Ophthalmic Genetics, 22, 195–205.
c04.indd Sec4:80
Voss, U. (2004). Functions of sleep architecture and the concept of protective fields. Review of Neuroscience, 15, 33–46. Vrang, N., Larsen, P. J., & Mikkelsen, J. D. (1995). Direct projection from the suprachiasmatic nucleus to hypophysiotrophic corticotropin-releasing factor immunoreactive cells in the paraventricular nucleus of the hypothalamus demonstrated by means of Phaseolus vulgaris-leucoagglutinin tract tracing. Brain Research, 684, 61–69. Wang, H., Ko, C. H., Koletar, M. M., Ralph, M. R., & Yeomans, J. (2007). Casein kinase I epsilon gene transfer into the suprachiasmatic nucleus via electroporation lengthens circadian periods of tau mutant hamsters. European Journal of Neuroscience, 25, 3359–3366. Wang, H., Sekine, M., Chen, X., & Kagamimori, S. (2002). A study of weekly and seasonal variation of stroke onset. International Journal of Biometeorology, 47, 13–20. Wang, Y., Levi, C. R., Attia, J. R., D’Este, C. A., Spratt, N., & Fisher, J. (2003). Seasonal variation in stroke in the Hunter Region, Australia: A 5-year hospital-based study, 1995–2000. Stroke, 34, 1144–1150. Watanabe, K., Koibuchi, N., Ohtake, H., & Yamaoka, S. (1993). Circadian rhythms of vasopressin release in primary cultures of rat suprachiasmatic nucleus. Brain Research, 624, 115–120. Watts, A. G., & Swanson, L. W. (1987). Efferent projections of the suprachiasmatic nucleus: II. Studies using retrograde transport of fluorescent dyes and simultaneous peptide immunohistochemistry in the rat. Journal of Comparative Neurology, 258, 230–252. Weaver, D. R. (1998). The suprachiasmatic nucleus: A 25-year retrospective. Journal of Biological Rhythms, 13, 100–112. Weaver, D. R., Carlson, L. L., & Reppert, S. M. (1990). Melatonin receptors and signal transduction in melatonin-sensitive and melatonininsensitive populations of white-footed mice (Peromyscus leucopus). Brain Research, 506, 353–357. Wehr, T. A. (1998). Effect of seasonal changes in daylength on human neuroendocrine function. 
Hormones and Research, 49(3–4), 118–124. Weiler, E. (1992). Seasonal changes in adult mammalian brain weight. Naturwissenschaften, 79, 474–476. Weingarten, H. P., Chang, P. K., & McDonald, T. J. (1985). Comparison of the metabolic and behavioral disturbances following paraventricularand ventromedial-hypothalamic lesions. Brain Research Bulletin, 14, 551–559. Weitzman, E. D., Fukushima, D., Nogeire, C., Roffwarg, H., Gallagher, T. F., & Hellman, L. (1971). Twenty-four hour pattern of the episodic secretion of cortisol in normal subjects. Journal of Clinical Endocrinology and Metabolism, 33, 14–22. Welsh, D. (2004). Single cell circadian rhythms of luminescence in cultures from MPER2-LUC knockin mice. 74, 18. SRBR annual meeting. British Columbia, Canada: Whistler. Welsh, D. K., Logothetis, D. E., Meister, M., & Reppert, S. M. (1995). Individual neurons dissociated from rat suprachiasmatic nucleus express independently phased circadian firing rhythms. Neuron, 14, 697–706.
8/18/09 5:11:38 PM
References 81 Wen, J. C., Hotchkiss, A. K., Demas, G. E., & Nelson, R. J. (2004). Photoperiod affects neuronal nitric oxide synthase and aggressive behaviour in male Siberian hamsters (Phodopus sungorus). Journal of Neuroendocrinology, 16, 916–921. Wingfield, J. C., Jacobs, J., & Hillgarth, N. (1997). Ecological constraints and the evolution of hormone-behavior interrelationships. Annal of New York Academy of Science, 807, 22–41. Woelfle, M. A., Ouyang, Y., Phanvijhitsiri, K., & Johnson, C. H. (2004). The adaptive value of circadian clocks: An experimental assessment in cyanobacteria. Current Biology, 14, 1481–1486. Wolk, R., Gami, A. S., Garcia-Touchard, A., & Somers, V. K. (2005). Sleep and cardiovascular disease. Current Problems in Cardiology, 30, 625–662. Wright, K. P., Jr., Hull, J. T., Hughes, R. J., Ronda, J. M., & Czeisler, C. A. (2006). Sleep and wakefulness out of phase with internal biological time impairs learning in humans. Journal of Cognitive Neuroscience, 18, 508–521. Xiong, J. J., Karsch, F. J., & Lehman, M. N. (1997). Evidence for seasonal plasticity in the gonadotropin-releasing hormone (GnRH) system of the ewe: Changes in synaptic inputs onto GnRH neurons. Endocrinology, 138, 1240–1250. Yagita, K., Tamanini, F., van Der Horst, G. T., & Okamura, H. (2001). Molecular mechanisms of the biological clock in cultured fibroblasts. Science, 292, 278–281.
c04.indd Sec4:81
Yamamoto, H., Nagai, K., & Nakagawa, H. (1987). Role of SCN in daily rhythms of plasma glucose, FFA, insulin and glucagon. Chronobiology International, 4, 483–491. Yamazaki, S., Numano, R., Abe, M., Hida, A., Takahashi, R., Ueda, M., et al. (2000). Resetting central and peripheral circadian oscillators in transgenic rats. Science, 288, 682–685. Yan, L., Foley, N. C., Bobula, J. M., Kriegsfeld, L. J., & Silver, R. (2005). Two antiphase oscillations occur in each suprachiasmatic nucleus of behaviorally split hamsters. Journal of Neuroscience, 25, 9017–9026. Yaskin, V. (1984). Seasonal changes in brain morphology in small mammals. Pittsburgh, PA: Carnegie Museum of Natural History. Yoo, S. H., Yamazaki, S., Lowrey, P. L., Shimomura, K., Ko, C. H., Buhr, E. D., et al. (2004). PERIOD2::LUCIFERASE real-time reporting of circadian dynamics reveals persistent circadian oscillations in mouse peripheral tissues. Proceedings of the National Academy of Sciences, USA, 101, 5339–5346. Zucker, I., & Boshes, M. (1982). Circannual body weight rhythms of ground squirrels: Role of gonadal hormones. American Journal of Physiology, 243, R546–R551.
8/18/09 5:11:38 PM
Chapter 5
NEUROPHARMACOLOGY
GARY L. WENK AND YANNICK MARCHALANT
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.

Drugs that affect the brain are fundamental to most cultures, and the routine use of stimulants and depressants is so omnipresent that most of us don’t even consider such substances to be drugs, but rather actual nutrients. Indeed, the distinction between drug and nutrient becomes more blurred with each generation. Many people cannot get through the day without the assistance of coffee, tea, tobacco, alcohol, cocoa, or marijuana. Throughout these pages, we will regard anything you take into your body as a drug, whether it’s obviously nutritious or not. As you will see, even molecules that are obviously nutrients, such as essential amino acids, have properties that can be ascribed to a psychoactive drug. Like nutrients, many drugs that affect brain function come from plants that grow all around us. Plant alkaloids (substances containing nitrogen and carbon), the active ingredients of many plants, are structurally related to the neurotransmitters your body uses, and so they can interact with the receptors on your neurons to influence brain function. This is a very important principle that will emerge as you read this chapter: A drug acts on the brain only if it in some way resembles an actual neurotransmitter, or if it is able to interact with an essential biochemical process in the brain that influences the production, release, or inactivation of a neurotransmitter. Plant alkaloids are essentially modified amino acids similar to those used by the brain and body. They resemble the natural chemicals produced and used by the brain. People in ancient cultures were very aware of these natural plant products and their unique properties; they often sought them out as remedies for a variety of illnesses. By contrast, we will consider plant substances that do not affect the brain to be either inert or nutritious.

All of the drugs discussed in this chapter affect the brain and therefore behavior. Because of this property, they are called psychoactive, a term that first appeared in 1548 in the title of a collection of prayers of comfort for the dead by Reinhard Lorichius: “Psychopharmakon, Hoc Est: Medicina Animae.” Within these prayers, the term refers to a type of spiritual medicine that was used principally in miserable or hopeless situations near the end of life. The Greek word pharmakos referred to a human scapegoat: the person who was sacrificed, usually figuratively but sometimes literally by ritual stoning, as a remedy for the illness of another person, usually someone far more important in the society. Later, around 600 B.C., the term came to refer to the drug, or poison, that was being given rather than the person. (Many of the psychoactive drugs given to people at this time produced more harm than good.) Finally, the term came to be associated with drugs that alter the mind in some manner, either positively or negatively.

This chapter is about drugs that affect the brain, and its purpose is to demonstrate that we can use our understanding of how drugs work to gain a better appreciation of how the brain works. Drugs alter brain function; therefore, it is possible to learn about normal brain function by studying how selected drugs with specific actions alter behavior and mental function (Wenk, 2003). Neuropharmacology is a discipline that uses drugs as tools to study the brain and better understand its functions. For our purposes, a drug is anything that we take into the body that ultimately affects brain function. This includes licit and illicit drugs, over-the-counter drugs, and nutrients such as vitamins, herbs, chocolate, and caffeine. The distinction between what is considered a nutrient, that is, something that the body needs, and a drug, that is, something that the mind might crave, has become quite blurred. Many people consider nicotine, chocolate, and caffeine essential parts of their daily diets. This chapter is organized according to the anatomy of the brain rather than according to the classes of drugs. Each section begins with a brief discussion of the anatomy of the particular neurotransmitter system, the role of this system in specific brain regions, and how it is possible to manipulate brain function with drugs that alter the production, release, and inactivation of that neurotransmitter.

Ancient cultures’ use of plant extracts as medicines to affect the mind was likely the beginning of our concept of how our brain functions. The earliest cultures believed mental illness was caused by evil spirits or was a punishment delivered by an angry deity. In the middle part of the twentieth century, effective tranquilizers were introduced for the treatment of the mentally ill. The realization that it might be possible to cure mental illness in the same way that one cured physical illness was slow to gain general approval because of the wide-ranging, and for some quite frightening, implications about what this meant regarding the nature of the human brain. In the future, drugs will likely be used not simply to correct neurochemical imbalances but to enhance our cognitive abilities.

The reasons underlying the decision to take neuropharmaceuticals are likely as varied as the people using them. For example, people may take powerful antidepressant drugs when they cannot cope with the world as an imperfect place. Sometimes drugs are taken by people who are bored or distressed by tedious tasks. Soldiers have a long history of taking drugs to lessen the impact of long periods of intense terror.

Before considering the neuropharmacology of specific neural systems and brain regions, some basic principles must be considered. First, drugs that affect the brain should not be viewed as being either good or bad. They are simply chemicals, no more and no less. They have actions within the brain that we desire or would like to avoid. Second, every drug has multiple effects. Because the brain is so complex and because drugs act in many different areas of the brain and body at the same time, they will often have many different effects on brain function and behavior. Third, the effects of a drug on the brain will always depend on the amount consumed or injected. Varying the dose of any particular drug will change the magnitude and the character of the effects of the drug. This principle is called the dose-response effect. In general, greater doses lead to greater, or sometimes completely opposite, effects.
Finally, the effects of a drug on the brain are greatly influenced by the individual’s genetic makeup and drug-taking history, and by the expectations that the person has about the consequences of taking the drug. For example, if you respond strongly to one drug, you’re likely to respond strongly to many drugs, and this trait is likely shared by one of your parents. Also, if you expect that a drug will act in a certain way on your brain and behavior, then it is much more likely to do so; this is referred to as the placebo effect. It’s ironic that the brain is the organ that decides for itself how it will experience the drug (for a more detailed discussion of placebo effects, see Chapter 63).
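The dose-response principle described above can be made concrete with a small numerical sketch. The code below is an illustration only, not a model taken from this chapter: it assumes the standard Hill equation for a graded dose-response curve, and the function names and parameter values (`emax`, `ec50`, `hill`, and the weighting in `biphasic_response`) are hypothetical choices made for the example. The second function shows one simple way that a larger dose could produce an opposite net effect, by letting a high-dose inhibitory action oppose a low-dose excitatory one.

```python
def hill_response(dose, emax=1.0, ec50=10.0, hill=1.0):
    """Effect predicted by the Hill equation for a given dose.

    emax -- maximal effect (arbitrary units; assumed value)
    ec50 -- dose producing half of emax (assumed value)
    hill -- Hill coefficient controlling the steepness of the curve
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    return emax * dose**hill / (ec50**hill + dose**hill)


def biphasic_response(dose, ec50_low=1.0, ec50_high=50.0):
    """Net effect of two opposing actions of the same hypothetical drug.

    A high-affinity excitatory action dominates at low doses; a
    lower-affinity but stronger inhibitory action dominates at high doses.
    """
    excitation = hill_response(dose, ec50=ec50_low)
    inhibition = 2.0 * hill_response(dose, ec50=ec50_high)
    return excitation - inhibition


if __name__ == "__main__":
    # Print both curves over a range of doses (arbitrary units).
    for dose in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(dose, round(hill_response(dose), 3), round(biphasic_response(dose), 3))
```

In this sketch the simple curve rises monotonically toward `emax`, whereas the biphasic curve is positive at low doses and negative at high doses, which mirrors the statement that greater doses can lead to greater or sometimes opposite effects. Real drugs would need experimentally fitted parameters.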
PRINCIPLES OF NEUROPHARMACOLOGY

The part of the body where the drug acts to produce its effect is called the site of action. Drugs differ in their sites of action. This chapter focuses on those drugs that directly influence neural function. Often the behavioral effects of a drug provide clues to its site of action within the brain. For example, drugs that affect sleep usually alter activity in the reticular activating system. Another clue to site of action is afforded by the unequal distribution of neurotransmitters in the brain (Figure 5.1). For example, dopamine is highly concentrated in the basal ganglia, which control movement. Therefore, administration of drugs that affect the dopamine system may affect the control of movement.

Many drugs that might potentially influence brain function never enter the central nervous system because of barriers. The most important is the blood-brain barrier, which has several components, chief among them a specialized type of capillary. These capillaries are sealed by tight junctions between adjacent cells, have no fenestrae, do not perform pinocytosis, have a thick basement membrane made of an amorphous mucopolysaccharide, and are covered by astrocyte processes that abut the basement membrane. These features restrict entry largely to drugs that are lipid soluble. The relative affinity of a drug for either lipid or water environments is known as its partition coefficient. The partition coefficient plays an important role in how drugs affect brain function. Very lipid-soluble drugs enter the brain very rapidly; they also tend to exit the brain rapidly, which limits the duration of their action (Meyer & Quenzer, 2005).

Once a drug has entered the brain, its site of action is often a receptor protein within the synapse. The brain’s response to a drug is, to a first approximation, proportional to the fraction of receptors occupied. Drugs that bind to receptors and produce a pharmacological action are called agonists; drugs that bind to receptors but produce no pharmacological action of their own are called antagonists.
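The relationship between receptor occupancy and response stated above can be sketched numerically. The example below is a minimal illustration under stated assumptions, not a method given in this chapter: it assumes simple equilibrium binding at a single receptor population, uses the Gaddum equation for a competitive antagonist, and treats the response as directly proportional to agonist occupancy. The function names, the dissociation constants (`kd_agonist`, `kd_antagonist`), and the `emax` value are all hypothetical.

```python
def fraction_occupied(agonist, kd_agonist, antagonist=0.0, kd_antagonist=1.0):
    """Fraction of receptors occupied by the agonist at equilibrium.

    Concentrations and dissociation constants share the same units.
    With antagonist == 0 this reduces to agonist / (agonist + kd_agonist).
    With a competitive antagonist present it is the Gaddum equation.
    """
    if min(agonist, antagonist) < 0 or min(kd_agonist, kd_antagonist) <= 0:
        raise ValueError("concentrations must be >= 0 and constants > 0")
    a = agonist / kd_agonist          # agonist concentration relative to its Kd
    b = antagonist / kd_antagonist    # antagonist concentration relative to its Kd
    return a / (1.0 + a + b)


def predicted_response(agonist, kd_agonist, emax=100.0, **antagonist_kwargs):
    """Response assumed to be proportional to agonist occupancy."""
    return emax * fraction_occupied(agonist, kd_agonist, **antagonist_kwargs)


if __name__ == "__main__":
    # Agonist alone, then the same agonist dose with a competitive antagonist.
    print(predicted_response(10.0, 10.0))
    print(predicted_response(10.0, 10.0, antagonist=1.0, kd_antagonist=1.0))
```

Under these assumptions the antagonist produces no response by itself; it lowers the response only by reducing the fraction of receptors available to the agonist. Many real systems depart from strict proportionality, so this should be read as a first approximation.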
Figure 5.1 Schematic anatomy of neurotransmitter systems. [Schematic of the brain with the corpus callosum, brain stem, and cerebellum labeled; the letters A, D, N, S, and P mark the systems described in the note.] Note: A = Acetylcholine neurons within the basal forebrain region project topographically to the cortex, hippocampus, amygdala, and olfactory bulbs; D = Dopamine neurons within the rostral midbrain project into the ipsilateral striatum and frontal cortex; N = Norepinephrine neurons originate within a small region, the locus coeruleus, in the floor of the fourth ventricle and project into virtually all regions of the ipsilateral hemisphere, with the exception of the basal ganglia; S = Serotonin neurons originate within a scattered group of nuclei that lie along the midline of the pons and medulla and project both caudally into the brain stem and rostrally into all regions of the brain; P = Peptide-containing neurons tend to be more diffusely scattered as interneurons, although there are notable exceptions.

Receptors for several neurotransmitters, for example, acetylcholine, gamma-aminobutyric acid (GABA), and glycine, have homologous structures and may share an evolutionary ancestor. An understanding of the evolution of the neurotransmitters and their receptors sometimes gives clues to their function. Once the drug has interacted with its respective receptor protein, its actions are terminated either enzymatically or by simple diffusion away from the synapse. Most neuroactive compounds are not transported into neuronal terminals. Some examples for selected neurotransmitter systems are discussed later.

The removal of a drug from the brain is frequently accompanied by biological and behavioral changes that are opposite to those produced by the drug; that is, the brain always “pushes back.” For example, the euphoria induced by cocaine and amphetamine is often a prelude to severe depression. Many biological factors, such as age and weight, play a crucial role in how drugs affect the brain and thereby influence behavior and personality, which are emergent properties of the brain. This concept was probably best described by Wilder in his Law of Initial Value (Wilder, 1958). Each person has an initial level of excitation; the degree of response to a drug depends on this initial level. For example, euphoria is observed in patients suffering from pain, anxiety, or tension when they are given small doses of morphine. In contrast, a similar dose given to a happy, pain-free individual often precipitates mild anxiety and fear. Catatonic patients may respond with a burst of animation and spontaneity to an intravenous injection of barbiturates. Sedative drugs create more anxiety in outgoing, athletic people than in passive, intellectual types.

In the remainder of this chapter, we examine the intersection of neuroanatomy, neurochemistry, and neuropharmacology, beginning with the first neurotransmitter system discovered, acetylcholine.
ACETYLCHOLINE

Acetylcholine is an important neurotransmitter for many species within all kingdoms. The precursors of acetylcholine synthesis, choline and coenzyme A, have been found in both prokaryotes and eukaryotes, including a strain of Pseudomonas fluorescens isolated from the juice of fermenting cucumbers as well as the blue-green alga Oscillatoria agardhii, where acetylcholine may be involved with photosynthesis. Acetylcholine stimulates silk production in spiders and limb regeneration in salamanders (Venter et al., 1988). Therefore, it is difficult to ascribe a particular function to this molecule in nature.

Acetylcholine neurons play an important role within the parasympathetic and sympathetic nervous systems; their role in the periphery accounts for many of the significant side effects of drugs that also affect central cholinergic cells. Within the brain, there are numerous cholinergic systems; however, two are particularly important: the basal forebrain cholinergic system, which projects topographically to the cortex, hippocampus, and other limbic structures that influence memory, attention, and mood; and the intrastriatal collection of short-axon interneurons that influence control of movement (for review, see Wenk, 1997; see Chapter 17).

Cholinergic neurons in the basal forebrain region that innervate the hippocampus and cortex are vulnerable to degenerative processes associated with Alzheimer’s disease (AD) and may become dysfunctional during the early stages of the disease process (Davis et al., 1999; Whitehouse, Price, Clark, Coyle, & DeLong, 1981). The extent to which this neurotransmitter system is impaired may correlate with the severity of selected cognitive symptoms associated with dementia. For example, dysfunction of cholinergic input to the cortex may contribute to a deficit in attentional abilities (Sarter, Gehring, & Kozak, 2006); alterations in the projection to the central nucleus of the amygdala may underlie emotional changes (Power, Vazdarjanova, & McGaugh, 2003); and the dysfunction of cholinergic inputs to the hippocampus clearly underlies the presence of amnesia (Olton, Wenk, Church, & Meck, 1988; Wenk, 2007).
Deficits in cholinergic biomarkers, including declines in the level and activity of the synthetic enzyme choline acetyltransferase, in transmitter production, and in release, are commonly reported biochemical changes within the brains of patients with AD. It is important to recognize that the loss of these biomarkers does not herald the death of the neuron; an injured neuron will often reduce the production of its luxury systems related to neurotransmitter function in favor of biochemical processes that are essential for recovery. The persistence of the intact neurons offers an opportunity to rescue them from continued degeneration. As such, experimental manipulation of the functional integrity of cholinergic neurons in the basal forebrain of young rats has been used as an animal model for this component of AD pathology (Wenk, 2006; Wenk et al., 1994). Moreover, drug therapies designed to attenuate memory deficits associated with AD have focused on alleviating these impairments in cholinergic synaptic function. Similar approaches have been used to compensate for presumed impairments in other neurotransmitter systems.
These neurons synthesize acetylcholine from choline, obtained from the diet, and acetyl groups that originate in mitochondria from the metabolism of glucose and are transported into the cytoplasm attached to coenzyme A. The synthesis occurs within the cytoplasm, and the product is stored in synaptic vesicles or loosely bound to the cytoplasmic membrane for fast release. The vesicular pool is released only when the cytoplasmic pool is expended. Production via the enzyme choline acetyltransferase is controlled by end-product inhibition; the availability of choline and the acetyl moiety is not rate limiting under normal conditions. Therefore, administration of choline via the diet does not increase acetylcholine production or release (Wild & Benzel, 1994). Once released, acetylcholine’s action within the synapse is terminated by the acetylcholinesterase enzyme; about 40% of the choline produced is actively taken up into the terminal to be reused for synthesis of acetylcholine. The blockade of choline uptake by hemicholinium-3 is lethal because the reduced acetylcholine synthesis produces an imbalance in autonomic function and the loss of the ability of the motor neurons to contract the diaphragm; however, this drug has no behavioral effects because it does not cross the blood-brain barrier. The release of acetylcholine from the presynaptic terminal can be inhibited by the toxin released from the bacterium Clostridium botulinum; death is caused by loss of diaphragmatic contraction, leading to asphyxiation. Once released, acetylcholine can act on two quite different protein receptors that have been designated (as have most neuropharmaceuticals) according to the compounds that were originally used to manipulate them, that is, nicotine and muscarine. Nicotinic receptors are directly coupled to a sodium-conducting channel and produce a rapid increase in sodium ion conductance and depolarization of the postsynaptic membrane.
In contrast, muscarinic receptor stimulation leads to slower depolarization or hyperpolarization depending on the nature of the secondary messengers. Muscarinic receptors have been further subdivided (as have most other neurotransmitter receptors) according to their affinities for various newly investigated drugs. These receptors have not been found within the kingdoms Protoctista and Fungi, although acetylcholine is produced by some of their members. These findings have led to some intriguing hypotheses regarding the evolution of receptors and the limitation of our chemical tools to investigate them. For example, nicotinic and muscarinic receptors have been found in peanut worms (whose fossils date back 500 million years), spoon worms, leeches, and earthworms. However, there is no evidence that the two receptors are related; muscarinic and nicotinic receptors differ in size, structure, and mechanism of action. Most of the acetylcholine receptors in the brain are muscarinic, while fewer
than 10% are nicotinic (Cooper, Bloom, & Roth, 2002). Yet, stimulation of nicotinic receptors is clearly much more rewarding than stimulation of muscarinic receptors. The explanation for their divergent consequences illustrates a basic principle in neuropharmacology that mimics the real estate industry: location is all that matters. Nicotine is a very potent agonist; as little as 60 mg can be fatal to an adult. Curare, a resinous extract of the plants Chondrodendron tomentosum and Strychnos toxifera from the Orinoco and Amazon basins in South America, is an antagonist at the nicotinic-type acetylcholine receptor. Because it does not cross the blood-brain barrier, its actions are expressed primarily on the autonomic ganglia and at the neuromuscular synapse. The drug is lethal because it blocks the neuromuscular nicotinic receptors located on the diaphragm that allow breathing; therefore, death is by asphyxiation. Muscarine, carbachol, and oxotremorine are agonists at the muscarinic-type acetylcholine receptors; atropine and scopolamine are antagonists at this receptor. Once acetylcholine is released into the synapse, its action is terminated by the enzyme acetylcholinesterase. This enzyme is produced by cholinergic neurons and released from the cytoplasm into the extracellular space, where it is found in high concentration and sometimes taken up into noncholinergic nerve terminals. This enzyme can very quickly inactivate synaptic acetylcholine at the rate of approximately 25,000 molecules per second. Thus, even its partial inhibition will have a profound effect on synaptic levels of acetylcholine and postsynaptic stimulation of acetylcholine receptors. Physostigmine is a reversible inhibitor of this enzyme; because acetylcholine is usually inactivated by this enzyme, its levels increase quickly within the synapse.
In the presence of physostigmine, or any other acetylcholinesterase inhibitor, acetylcholine will either diffuse out of the synapse or be catabolized by other esterase enzymes, such as butyrylcholinesterase. The widespread distribution of neuronal systems that use acetylcholine indicates that these systems play a significant role in many brain functions. Its role in neuroplasticity has been best studied. The blockade of the muscarinic receptors within the brain by scopolamine impairs memory and produces mental confusion due to its actions within the hippocampus and neocortex. (A not very amusing side note: Some news reports claim that thieves sometimes add scopolamine to chewing gum, chocolate, or drinks of unsuspecting people, or blow it into their faces, to immobilize and then rob them.) In contrast, drugs that enhance the action of acetylcholine at muscarinic receptors by preventing its catabolism, for example, physostigmine, may enhance memory and attentional abilities. Several plants that grow wild contain scopolamine, atropine, or related molecules, such as jimson weed (Datura stramonium),
henbane (Hyoscyamus niger), and mandrake (Mandragora officinarum). The antagonism of muscarinic receptors within the neocortex slows neural activity and makes the user drowsy; this action within the hippocampus impairs plasticity. The deadly nightshade, Atropa belladonna, was given its name by Carl von Linné in the eighteenth century to indicate the poisonous nature of this plant. Atropos was one of the Greek “fates,” or “daughters of necessity,” who also included Clotho, who spun the thread of life, and Lachesis, who allotted each man his portion of life. Atropos cut the thread of life at the appointed time. Muscarinic receptors are expressed by the smooth muscles that encircle the iris; their antagonism allows the pupils to dilate. The presence of muscarinic receptors within the motor cortex and the basal ganglia explains why scopolamine also produces slurred speech and generally impaired motor abilities. Higher doses of scopolamine can produce feelings of unpleasantness and visual and auditory hallucinations. The hallucinations are of ordinary objects and not mythic or other-worldly like those produced by drugs that directly affect serotonin receptors. This may provide insight into the function of acetylcholine and the influence of the location of muscarinic receptors within the normal brain. Stimulation of muscarinic receptors may produce euphoria and subjective sensory changes. The alkaloid muscarine is present in the mushroom Amanita muscaria; it can produce delirium and hallucinations when eaten. Another agonist of the muscarinic receptor is found in the areca nut of the betel pepper tree of Asia. The active ingredient is arecoline. The nut is used as a mild euphoriant and antitussive (cough suppressant) throughout Southeast Asia; these uses are consistent with the presence of muscarinic acetylcholine receptors within the limbic system and coughing centers of the brain.
One of the best-studied agonists of the nicotinic acetylcholine receptor is, of course, nicotine. Nicotine occurs in more than 64 species of plants around the world, including the tobacco plant. The first use of tobacco was to treat persistent headaches, colds, and abscesses and sores on the head. Tobacco emetics were used to treat flatulence, and the smoke was inhaled deeply in order to lessen bad coughs. Jean Nicot sent some tobacco to Catherine de Medici, who was then queen to Henry II of France; she reported that it helped treat her migraines, and the plant took on the title of herbe sainte, or holy plant. Nicot got credit for the discovery, and Linnaeus later named the genus Nicotiana in his honor. In the 1890s, the U.S. Pharmacopeia dropped nicotine from its list of useful therapeutic agents. The alkaloid nicotine is likely utilized by the tobacco plant as a defense against insects that would express this type of receptor in their body and be dose-dependently vulnerable to its toxicity. Tolerance and dependency develop from
its chronic use. Most cigarettes contain about 1 to 2 mg of nicotine. Because nicotine is quite volatile and heat labile, only about 20% of this dose is actually inhaled into the body; however, due to its exceptional lipid solubility, at least 90% of the inhaled nicotine will be absorbed. Nicotine can be rapidly absorbed from mouth, lungs, or intact skin. Once the smoke is inhaled, it is absorbed via pulmonary alveoli and transported to the brain within 2 to 7 seconds. This makes smoking tobacco as efficient as an intravenous injection in terms of getting nicotine to its site of action within the brain. Nicotine is also quite toxic; 60 mg is considered a lethal dose for a human, and death takes only a few minutes to occur. The actions of nicotine at the synapse are complicated. Initially, at lower doses, nicotine stimulates the receptor; then, at higher doses, it induces a depolarization blockade, or inactivation, of these receptors from further stimulation. The extent to which desensitization occurs is influenced by the subunit composition of the receptor. Nicotinic receptors are composed of five subunits that form an ion channel (Siegel, Albers, Brady, & Price, 2006). The brain expresses at least 12 different nicotinic receptor subunits. Thus, prolonged exposure to nicotine, for example, by continued smoking of tobacco products, would lead to a complex pattern of receptor stimulation and desensitization across brain regions. When this inactivation occurs in the periphery, it is associated with general muscle weakness as the function of the neuromuscular junction is impaired. Chronic exposure to nicotine leads to the rather paradoxical condition of an upregulation in the number of nicotinic receptors that may exist in a desensitized conformational state. The altered synaptic function within the brain leads to tremors. Once again, death is most often due to paralysis of the respiratory muscles.
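The per-cigarette figures quoted above can be chained into a rough estimate of absorbed dose. What follows is only a minimal sketch: it assumes the fractions simply multiply, it uses just the numbers stated in the text (1 to 2 mg per cigarette, about 20% inhaled, at least 90% of that absorbed, 60 mg as the cited lethal dose), and the constant and function names are illustrative rather than drawn from any source.

```python
# Rough arithmetic only; every figure below is taken from the text above.
INHALED_FRACTION = 0.20    # about 20% of a cigarette's nicotine is inhaled
ABSORBED_FRACTION = 0.90   # at least 90% of inhaled nicotine is absorbed
LETHAL_DOSE_MG = 60.0      # dose the text cites as lethal for an adult


def absorbed_nicotine_mg(cigarette_content_mg: float) -> float:
    """Estimate nicotine absorbed from one cigarette (hypothetical helper).

    Assumes the inhaled and absorbed fractions combine multiplicatively.
    """
    return cigarette_content_mg * INHALED_FRACTION * ABSORBED_FRACTION


low = absorbed_nicotine_mg(1.0)    # cigarette containing 1 mg of nicotine
high = absorbed_nicotine_mg(2.0)   # cigarette containing 2 mg of nicotine
print(f"Absorbed per cigarette: {low:.2f} to {high:.2f} mg")

# Naive ratio of the cited lethal dose to the absorbed dose per cigarette.
# This ignores metabolism and clearance between cigarettes entirely.
print(f"Lethal dose is {LETHAL_DOSE_MG / high:.0f} to "
      f"{LETHAL_DOSE_MG / low:.0f} times the per-cigarette dose")
```

Under these assumptions a single cigarette delivers a few tenths of a milligram, two orders of magnitude below the cited lethal dose, which is consistent with the text's point that smokers titrate small, rapidly delivered doses.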
Nicotine affects cortical function in a complex dose-dependent fashion; low doses activate the left hemisphere and stimulate activity, while high doses activate the right hemisphere and are associated with sedative effects. Therefore, when doing boring tasks, a low dose of nicotine can increase subjective arousal. In contrast, during anxious or stressful situations, smokers may actually reduce subjective stress by activating the right hemisphere and producing sedation. Sixty percent of adults with attention deficit disorder are smokers, compared to less than 30% of the rest of the population, implying that these adults are finding a pharmacological substitute for their childhood medications. Nicotine also produces a dose-related self-report of euphoria that is most pronounced following overnight abstinence. This may explain why heavy smokers like to light up as soon as they awake. Throughout the day, smokers carefully, and probably unconsciously, control the amount of nicotine that reaches the brain by the number of cigarettes they use per hour, by altering the rate at which
they take a puff, and by the volume of their inhalation. This careful titration may optimize the amount of cholinergic receptor stimulation and cortical activation. In 1948, the Journal of the American Medical Association stated that “from a psychological point of view, in all probability more can be said in behalf of smoking as a form of escape from tension than against it.” Today, our perception of nicotine use has been altered principally by the consequences of the “vehicle” for nicotine administration, tobacco. Tobacco causes almost one U.S. death every minute or the equivalent of four major airline crashes daily.
CATECHOLAMINES
The neurotransmitter systems considered in this section, dopamine and norepinephrine, are found in both the peripheral nervous system and central nervous system (Figure 5.2). Once again, I discuss the function of these
two neurotransmitter systems by examining the consequences of stimulation and antagonism of their function by a rather large series of drugs. A consistent pattern of effects emerges demonstrating that dopamine is intimately related to reward and norepinephrine function underlies the components of arousal (for details, see Chapter 40). We know a great deal about the functions of these two neurotransmitter systems primarily because so many drugs have been discovered that can modify their function. Catecholamines are monoamines, that is, they contain at least one nitrogen-containing amine group. The term catechol refers to the presence of a six-carbon ring that has two hydroxy groups attached at adjacent positions to each other. Catecholamines occur extensively throughout nature and have been identified as neurotransmitters in animals as diverse as insects, crustacea, arachnids (spiders), and primates (Venter et al., 1988). Norepinephrine is predominant in the brain and peripheral nervous system in mammals, while dopamine is predominant in species that evolved prior to mollusks.
[Figure 5.2 diagram. Labels: blood-brain barrier; presynaptic neuron; postsynaptic neuron; Ca++ and Na+; numbered steps 1 to 11, described in the note below.]
Figure 5.2 The life cycle of a typical neurotransmitter. Note: (1) Nutrients, such as amino acids, glucose, fats, and dipeptides, from the diet are transported actively out of the arterial supply to the brain through the blood-brain barrier and, via astrocytes, into the neuron. (2) Enzymatic conversion, for example, occurs by hydroxylation, decarboxylation or esterification, and so on. (3) Active transport occurs into the synaptic vesicles for storage and later release. (4) Activation of the presynaptic neuron leads to the arrival of an action potential that induces the opening of voltage-controlled ion channels, in particular, calcium ion channels; (5) the entering calcium ions induce the fusion of the synaptic vesicle to the presynaptic membrane and the release of the neurotransmitter into the synaptic space. (6) The neurotransmitter molecule briefly interacts with the specific proteins on the surface of the postsynaptic cell membrane and induces a large variety of potential responses, for example, opening or closing ion channels, de- or hyperpolarization of the membrane, and so on. (7) Secondary messengers may be produced due to the activation of enzymes that initiate a cascade of molecular processes. The consequences of this cascade are quite diverse, ranging from locally influencing the
function of nearby receptors to alterations in the genome of the neuron itself. (8) The actions of the neurotransmitter must be terminated in order to permit the communication between two neurons. This is accomplished principally by actively reabsorbing the transmitter molecule back into the presynaptic terminal using dedicated protein complexes that are on the surface of the presynaptic membrane. (9) A secondary method of transmitter inactivation is by enzymatic conversion of the molecule so that it is no longer able to interact with its receptor. For example, the neurotransmitter molecule may be hydrolyzed, ionized, or conjugated onto a larger, and more water-soluble, molecule. (10) Once the neurotransmitter is enzymatically converted, it is removed from the brain and metabolites of neuronal function can be detected in the body fluids. (11) Drugs can interact with any of these processes and impair, or even sometimes enhance, the production, storage, release, receptor function, re-uptake, and inactivation processes. Fluctuations in the levels of the metabolites of different neurotransmitters can be monitored in order to judge the integrity of specific neural systems.
In mammals, norepinephrine is distributed throughout the autonomic nervous system. Its presence there contributes significantly to the side effects, often undesired, of many drugs that alter the function of norepinephrine neurons in the brain. Norepinephrine is released from the postganglionic sympathetic neurons into the body organs as well as into blood vessels and hair follicles in the skin. The activation of norepinephrine input to the skin is responsible for the goosebump response to frightening sights and sounds. Almost all of the brain’s norepinephrine neurons are located in the locus coeruleus within the floor of the fourth ventricle. The name of this brain region is related to the fact that these norepinephrine neurons concentrate copper into a matrix of melanin pigment. Although copper is a required cofactor for one of the enzymes necessary for the synthesis of norepinephrine, the concentration of the copper far exceeds what is required for neurotransmitter synthesis. Unfortunately, the presence of this transition metal contributes to an age-associated vulnerability to oxidative stress. The axonal projections out of the locus coeruleus into the brain follow two ascending pathways. The ventral pathway provides norepinephrine to the hypothalamus, septal area, pituitary, substantia nigra, and mammillary bodies. The dorsal pathway projects to the entire neocortex, hippocampus, thalamus, amygdala, superior and inferior colliculi, and the medial and lateral geniculate nuclei of the thalamus. Dopamine neurons exist within a small region of the midbrain and, although there are three to five times more dopamine neurons in the brain stem than norepinephrine neurons, they do not project as widely throughout the brain as do norepinephrine neurons. Although there are about 10 different groups of dopaminergic neurons within the brain, three general, long-axon pathways are recognized that ascend into the forebrain.
The nigrostriatal pathway originates within the substantia nigra and projects to the striatum. The region was given the name substantia nigra, or dark substance, because it concentrates the transition metal iron that, similar to the situation seen in the locus coeruleus, is combined with a melanin pigment. The oxidation of iron contributes to the color of this brain region and also confers a degree of neuronal vulnerability to these cells (Molina-Holgado, Hider, Gaeta, Williams, & Francis, 2007). The degeneration of this pathway is associated with Parkinson’s disease and is characterized by tremors, spasticity, and akinesia. These symptoms provide insight into the normal responsibility of the release of dopamine within the striatum. Many drugs that interact with the function of the forebrain dopaminergic system have side effects that resemble those seen in people with Parkinson’s disease.
Two other ascending pathways originate in the ventral tegmental area that lies just medial to the substantia nigra. The mesolimbic pathway projects to forebrain structures associated with the olfactory and limbic system. The mesocortical pathway projects to the frontal cortex. Too much activity in these pathways may underlie some of the symptoms associated with psychosis. The drugs that target this system will be discussed later. The many different catecholamine receptors that exist provide the opportunity for a large variety of drugs to have both selective and widespread effects on different parts of the peripheral and central nervous systems. These neurotransmitter receptor proteins have existed in the forms we find today for at least 600 million years. The actual point in evolution where receptor-mediated neurotransmission appears is not known. However, some of the oldest eukaryotes known respond to the same norepinephrine receptor–stimulating drugs as primates; for example, the beta subtype of norepinephrine receptors may have existed in annelids 500 million years ago during the Paleozoic era. Norepinephrine receptors were initially divided into two classes, alpha and beta, due to their distribution in the body and their selective responses to drugs that were available at that time. As newer drugs were studied, additional subclassifications were introduced to the standard nomenclature. Receptors are also categorized by their location on the neuron; for example, autoreceptors are found on the presynaptic membrane of many types of neurons; their purpose is to modulate the release of neurotransmitters from the axonal terminal. With continued investigations, subtypes of these subclassifications have led to the characterization of at least nine different receptors that are all linked to very similar primary signaling mechanisms. 
Pharmacological manipulation of these receptors, in order to understand their function, has been hindered by the lack of specific and selective agonists or antagonists. A second important feature of most receptors is that they are responsive to the presence of constant stimulation or constant blockade. For example, in response to constant blockade, receptors will become supersensitive. Receptor supersensitivity is a common behavior of receptors (Fleming, 1999) and may represent a compensatory mechanism by the postsynaptic cells in response to the absence of incoming signals caused by the receptor blockade. Conversely, exposure to receptor agonists can produce an uncoupling of the receptor from its intracellular signaling system, for example, a G-protein, which halts signaling. Ultimately, the receptor may be sequestered into intracellular compartments for later redeployment to the cell surface or degradation. The process of desensitization may involve genetic and posttranscriptional regulatory mechanisms (Heck & Bylund, 1998).
Many drugs that interact with the function of the neurotransmitter systems do so by affecting their metabolism. Three key enzymes are involved in the synthesis of dopamine and norepinephrine. These enzymes are structurally similar to each other, and the genes that encode them lie close together on the genome like beads on a string of DNA. These enzymes exist within the nervous system of all vertebrates and several invertebrate species, but not all of these enzymes are expressed in every catecholamine neuron. When more than one enzyme is expressed in a neuron, their regulation is usually linked in a coordinated fashion. Occasionally during embryogenesis, selected enzymes are transiently expressed and then disappear. Evolutionary studies of these enzymes suggest a strong functional preservation of the catalytic site. The reason for this preservation may be related to the fact that the genes have a high degree of nucleotide homology, suggesting that they may have evolved from duplication of a common ancestral gene. Genetic analyses have shown that the enzymes that initiate the production of dopamine, norepinephrine, epinephrine, and serotonin, that is, phenylalanine hydroxylase, tyrosine hydroxylase, phenylethanolamine N-methyltransferase, and tryptophan hydroxylase, share considerable sequence homology (Baetge, Suh, & Joh, 1986; Grima, Lamouroux, Blanot, Biguet, & Mallet, 1985). The production of dopamine and norepinephrine begins with the amino acid tyrosine that is obtained from the diet and actively absorbed across the blood-brain barrier. Tyrosine is converted to l-dopa by the enzyme tyrosine hydroxylase. This enzyme is the rate-limiting step in the production of norepinephrine and dopamine. The activity of tyrosine hydroxylase is controlled by end-product inhibition and is also regulated by availability of precursors and cofactors. One very important cofactor is molecular iron. Without iron, this enzyme fails to function normally.
People with anemia have reduced body levels of iron and, as a consequence, may have reduced tyrosine hydroxylase activity leading to reduced production of norepinephrine and dopamine. The reduced brain levels of these important neurotransmitters in the limbic system may lead to a slight depression. Tyrosine can also be acted on by the enzyme tyrosinase and converted into a melanin pigment. This enzyme is quite interesting to study because it is subject to a mutation that makes it heat labile, that is, it works only in the cooler areas of the body. The consequence of this mutation is a lack of pigmentation in humans; in cats it produces the Siamese breed. Apparently, this enzyme is critical for the normal decussation of visual tracts. The second critical enzymatic step in this pathway is catalyzed by l-dopa decarboxylase. This enzyme converts the product of tyrosine hydroxylase, l-dopa, into dopamine. This enzyme is extremely
efficient, which may explain why brain levels of l-dopa tend to be so low and why providing substrate to it leads to a dramatic increase in the production of dopamine. The synthesis of dopamine occurs in the cytoplasm of neurons, and the dopamine produced is then transported into synaptic vesicles for storage until it is released with the passage of an action potential. This enzyme is rather nonspecific due to its evolutionary history and also produces serotonin in the presence of the amino acid tryptophan or converts tyrosine to tyramine, which is the precursor to the insect neurotransmitter octopamine. Octopamine is also present in low levels in the vertebrate brain. The third enzyme in this pathway is dopamine-beta-hydroxylase; it converts dopamine into norepinephrine and is therefore not expressed by dopaminergic neurons. The enzyme is stored within the synaptic vesicles, where it lies in wait for the entry of dopamine molecules from the cytoplasm. The conversion occurs within the vesicle while it is being transported to the terminals from the cell soma. The enzyme is released with the norepinephrine into the synapse with its principal cofactor, copper, and the intravesicle antioxidant ascorbic acid. The diet is important for the control of synthesis (Siegel et al., 2006). The shifting concentration of different precursor amino acids in the diet will alter their relative uptake and may limit or alter production. With a balanced diet, the uptake of all amino acids is fairly constant and in correct proportion. Too much of one amino acid will offset the uptake of the others and therefore alter the availability of dopamine or norepinephrine for release. Obviously, diet can affect a person’s mood, but the impact is usually more subtle than is typically produced by most psychoactive drugs.
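The three-enzyme sequence described above (tyrosine to l-dopa to dopamine to norepinephrine) can be summarized as a small lookup structure. The sketch below is a toy model, not a biochemical simulation: the `Step` record, the `PATHWAY` list, and the `end_product` helper are hypothetical names introduced here for illustration, and the notes attached to each step only restate what the text says about cofactors and location.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """One enzymatic step of the pathway as described in the text."""
    enzyme: str
    substrate: str
    product: str
    note: str


# Order follows the text: tyrosine -> l-dopa -> dopamine -> norepinephrine.
PATHWAY = [
    Step("tyrosine hydroxylase", "tyrosine", "l-dopa",
         "rate-limiting step; requires iron"),
    Step("l-dopa decarboxylase", "l-dopa", "dopamine",
         "cytoplasmic; extremely efficient"),
    Step("dopamine-beta-hydroxylase", "dopamine", "norepinephrine",
         "acts inside synaptic vesicles; cofactor copper"),
]


def end_product(expressed_enzymes: set[str], start: str = "tyrosine") -> str:
    """Walk the pathway until a required enzyme is not expressed."""
    current = start
    for step in PATHWAY:
        if step.substrate != current or step.enzyme not in expressed_enzymes:
            break
        current = step.product
    return current


# Dopaminergic neurons do not express dopamine-beta-hydroxylase.
dopaminergic = {"tyrosine hydroxylase", "l-dopa decarboxylase"}
noradrenergic = dopaminergic | {"dopamine-beta-hydroxylase"}

print(end_product(dopaminergic))
print(end_product(noradrenergic))
```

The point of the design is that the transmitter a neuron releases follows from which enzymes it expresses: removing the third enzyme from the set stops the walk at dopamine, mirroring the statement that dopaminergic neurons do not express dopamine-beta-hydroxylase.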
Once dopamine or norepinephrine is released, its actions within the synapse are terminated principally by re-uptake into the presynaptic terminal or by metabolic breakdown due to oxidative deamination by monoamine oxidase or transmethylation by catechol-O-methyltransferase. What if the vesicles are empty? One interesting drug, reserpine, prevents the transport of the neurotransmitters into the vesicles. Reserpine is found in the snake root plant. If dopamine and norepinephrine (and serotonin) cannot be stored safely in vesicles, they are caught in the cytoplasm, where the enzyme monoamine oxidase can catabolize them. Reserpine has a tranquilizing effect due to its ability to prevent the transfer of catecholamines into the synaptic vesicle. In addition, the reduction in the availability of these transmitters is associated with a severe depression that might explain why the snake root plant is called the insanity herb (pagla-kadawa) by Sherpas in the Far East. In contrast to reserpine, some drugs enhance the availability and release of catecholamines. One of the best-studied drugs is amphetamine. Amphetamine is taken up
into the terminals, disturbs the vesicular transport and storage process, induces a presynaptic leakage of norepinephrine and dopamine, and also blocks their inactivation by re-uptake and via monoamine oxidase (Sulzer, Sonders, Poulse, & Galli, 2005). The enhanced release of these catecholamines leads to heightened alertness, euphoria, lowered fatigue, decreased boredom, depressed appetite, insomnia, headaches, and tremors. The rebound symptoms from a drug’s effects on the brain are often proportional and in reverse to the effects of the drug. For example, amphetamine withdrawal produces extreme fatigue, dysphoria, and depression. Excessive exposure to amphetamine can produce a condition similar to paranoid schizophrenia, which is a frequent consequence of high-dose use. At the molecular level, these cognitive changes may be related to a reduction in the function or presence of dopamine transporters (Stanley, Pettergrew, & Keshavan, 2000). During World War II, forces on both sides of the battle lines used amphetamine to combat boredom and fatigue and to increase endurance. Historians suggest that at the end of the war, Adolf Hitler’s increasingly bizarre behavior may have been due to his excessive use of amphetamines. One of the basic principles of neuropharmacology is that lipid solubility is directly correlated with the speed of uptake of a drug into the brain (Meyer & Quenzer, 2005). Furthermore, the faster a drug enters the brain and alters its physiology, the greater the euphoria the drug is likely to induce. This principle has never been lost on illicit drug designers. Morphine becomes far more lipid-soluble and far more euphorigenic when two acetyl groups are added to produce heroin. Amphetamine has been modified many times in the past. The simplest manipulation was the addition of a methyl group to make methamphetamine, which is a very potent analog of amphetamine and far more lipid-soluble.
Not surprisingly, its street name became “speed.” Over time, attempts to make amphetamine ever more lipid-soluble by the addition of carbon atoms have produced drugs that are more euphorigenic and/or hallucinogenic than amphetamine; for example, 3,4-methylenedioxyamphetamine and N-ethyl-3,4-methylenedioxyamphetamine were precursors to 2,5-dimethoxy-4-methylamphetamine and 3,4-methylenedioxymethamphetamine; the last is known widely as “ecstasy.” Any chemical manipulation that makes amphetamine more lipid-soluble allows it to enter the brain faster; drugs that quickly enter the brain often become chosen as drugs of abuse. Although amphetamine does not occur naturally, some chemically similar molecules have been discovered. For example, asarone is chemically similar to amphetamine and is found in the plant Acorus calamus that grows in Asia, Europe, and North America. Khat is an African plant, Catha edulis, that contains two phenylalkylamines called
cathinone and cathine (d-norisoephedrine). The habit of chewing khat probably predates coffee drinking by centuries. The relative level of these naturally occurring analogs of amphetamine depends, as is true for most plant-derived psychoactive drugs, on where the plant is grown, its age, and the time elapsed after collection. Cathinone is quite unstable, a property that makes storage for distribution nearly impossible. Decoctions in hot water are called Abyssinian tea. Other compounds in this plant, as in so many others, include flavonoids that are anti-inflammatory. The cactus Lophophora williamsii is used to prepare a drink called peyote that contains 3,4,5-trimethoxyphenylethylamine (clearly a lipid-soluble molecule that is chemically similar to a catecholamine), also known as mescaline. It produces a dose-dependent range of effects that include euphoria at low doses and hallucinations at higher doses. Drugs that selectively block monoamine oxidase also occur naturally; for example, harmaline and harmine can be found in a thick vine plant, Peganum harmala, which grows in the Amazon rain forest. The latency of onset is only 5 minutes after ingesting the plant, and the colorful visual hallucinations may last up to 8 hours. The catecholamine neurotransmitters are inactivated principally by re-uptake into the presynaptic nerve terminal (Siegel et al., 2006). Drugs that block this re-uptake process augment the effects of the neurotransmitter within the synaptic cleft. Most of these drugs have found clinical use as antidepressants. Depression is considered the common cold of psychiatric illness because each year more than 100 million people worldwide develop clinically recognizable depression (for a more detailed discussion of this topic, see Chapter 55). These agents block re-uptake of norepinephrine, dopamine, and serotonin into the terminals, thus prolonging their interaction with receptors in the synaptic cleft.
However, the chemical action at the re-uptake site and the clinical efficacy of these drugs do not occur simultaneously. The drugs require 2 to 3 weeks to produce their antidepressant benefit. Their mechanism of action is believed to relate to the adaptive neural mechanisms following chronic use. Overall, the immediate blockade of re-uptake is offset by compensatory short- and long-term adjustments at catecholamine synapses. The blockade of re-uptake of dopamine and serotonin has profoundly euphorigenic effects, as do many current antidepressant drugs. In addition, many drugs that have this action are quite addictive, such as cocaine and amphetamine. At the other side of the synapse, drugs that block dopamine receptors have considerable therapeutic efficacy as antipsychotics. The fact that these drugs are capable of reducing some of the symptoms associated with psychosis does not prove that psychosis is due simply to a dysfunction in dopamine neurons. Indeed, this is a very important general point to consider when using drug actions to
understand brain function. The knowledge that selective re-uptake inhibitors act on a few specific neurotransmitter systems in order to reduce the symptoms of depression does not prove that a dysfunction of one of these neural systems underlies either the mental disorder or its symptoms. It simply indicates that manipulation of re-uptake of a specific neurotransmitter molecule ultimately leads to an alteration in the presentation of symptoms. Indeed, compelling evidence exists that this action has very little to do with the therapeutic benefits of selective serotonin reuptake inhibitors (Santarelli et al., 2003). Furthermore, the alteration in dopamine function probably does not cause psychosis; rather, it is most likely just a secondary consequence of a complex array of alterations of one or more different neural systems in the brain. This may explain why the blockade of some dopamine receptors within certain brain regions reduces the severity of some symptoms but not others. The antagonism of dopamine receptors simply compensates for the presence of an error of brain chemistry or connectivity that may exist somewhere in the brain. There are five identified functional dopamine pathways within the brain (Cooper et al., 2002). The antagonism of each by antipsychotic drugs provides insight into their unique role in the brain. Two originate within the midbrain ventral tegmental area and project into many structures of the limbic system (mesolimbic) or into the frontal neocortex (mesocortical). The antipsychotic actions of dopamine receptor antagonists are thought to involve alterations in synaptic function of the mesolimbic and mesocortical dopamine pathways. A third dopaminergic pathway originates in an adjacent midbrain region, the substantia nigra, and projects into the striatum. 
Antagonism of dopamine receptors within the striatum by typical antipsychotic drugs leads to the appearance of a series of extrapyramidal side-effects that are similar to those seen in patients with Parkinson’s disease, including tremors when at rest, reduction of voluntary movement, spasticity, and dystonia. The level of dopamine within the frontal lobes and striatum has been correlated with smiling behavior in humans; consistent with this interesting pleasure-related role of dopamine are reports that the loss of the ability to smile is often seen in patients with Parkinson’s disease. Within the striatum, dopamine is likely released on acetylcholine interneurons; many of the extrapyramidal side-effects can be attenuated by drugs that block muscarinic acetylcholine receptors. Antipsychotic drugs also antagonize dopamine receptors within a dopaminergic pathway that originates within the hypothalamus and negatively controls the release of prolactin leading to the increased release of this hormone from the pituitary. It was once thought that antagonism of this pathway within the hypothalamus accounted for the significant weight gain associated with antipsychotic drugs;
recent evidence suggests that the weight gain is related to the antagonism of histamine receptors. Ironically, the original clinical use of the first antipsychotic drug, Thorazine, occurred because of its ability to block histamine receptors and reduce symptoms of the common cold (Lindamood, 2005). Subsequently, its additional proclivities were recognized. In a manner similar to that observed following treatment with antidepressant drugs, the side effects of dopaminergic receptor blockade occur rather quickly but the clinical benefits require 2 to 3 weeks, or more, to fully develop.
SEROTONIN
Serotonin was initially discovered in the serum and determined to have tonic effects on the vascular system, hence its name (Rapport, Green, & Page, 1948). It has been found in the venom of amphibians, wasps, and scorpions and within the nematocysts of sea anemones as well as in the nervous system of lobsters and parasitic flatworms. Serotonin immunoreactivity was not found in Coelenterata (Hydra magnipapillata), Echinodermata (Asterina pectinifera), or Protochordata (Halocynthia roretzi; Fujii & Takeda, 1988). In humans, approximately 90% of total body serotonin is found within the nervous system of the gastrointestinal tract. About 8% is localized to platelets and mast cells and the remaining few percent is found within the brain, principally within the pineal gland, which is not generally considered part of the brain. Neurons that produce and release serotonin are organized into a series of nuclei that lie along the midline, or seam, of the reticular region of the brain stem; these are the raphe nuclei. The most caudal lying nuclei in the ventromedial pons send axonal projections into the spinal cord to control the sympathetic autonomic nervous system. The ascending pathways originate in the pons and midbrain raphe nuclei and pass through the medial forebrain bundle in the lateral hypothalamus to innervate the limbic system, hypothalamus, septal nuclei, cingulate gyrus, cerebellum, superior colliculi, hippocampus, and neocortex. Some fibers also make contacts with glial cells and blood vessels. The axonal terminals ramify very widely to topographically innervate most regions of the brain. Individual raphe neurons send projections into brain regions that have related functions. Raphe neurons have a regular, slow, spontaneous firing rate that varies little in response to sensory stimuli.
Given the widespread ramifications of the individual neurons and the constant release rate of serotonin, it is likely that this neural system is involved in modulation of neural activity rather than actual information transfer. The production of serotonin requires the absorption of the amino acid tryptophan from the diet (Boadle-Biber, 1993).
Transport of this large neutral amino acid is influenced by the level of other amino acids in the blood. Reduced ingestion of tryptophan leads to reduced levels of brain serotonin; approximately 1% of ingested tryptophan is converted to serotonin within the brain and the remainder is used for protein synthesis. Tryptophan is converted to 5-hydroxytryptophan by tryptophan hydroxylase within the terminals; the enzyme is usually not saturated with substrate. The activity of this enzyme can be inhibited by p-chlorophenylalanine. 5-Hydroxytryptophan is then converted to serotonin by a decarboxylase and transported into vesicles by a reserpine-sensitive mechanism. End-product inhibition of serotonin synthesis is fairly minor. The enzymatic synthesis of serotonin is principally influenced by neuronal activity and availability of tryptophan in the blood; this may explain why depletion or supplementation of this amino acid in the diet can influence serotonin-controlled cognitive and neural processes such as mood and sleep (Wurtman & Wurtman, 1995; and see Chapter 24 for a detailed discussion). More than 12 serotonin receptors have been characterized. The action of serotonin at these receptors is terminated principally by re-uptake into the presynaptic terminal by a selective transporter protein. Astrocytes lack this transporter yet are able to take up serotonin. Catabolism of serotonin is performed only by monoamine oxidase; however, this enzyme has a relatively minor contribution to the overall inactivation of the action of serotonin. When serotonin is applied to autoreceptors on raphe neurons, principally the 5-HT-1A subtype, cell firing is rapidly decreased (Hajós, Hajós-Korcsok, & Sharp, 1999). The firing rate of raphe neurons is likely regulated by a small neural loop (by autoregulation via somatodendritic autoreceptors) as well as by a long, negative feedback loop involving postsynaptic 5-HT-1A receptors (Dong, De Montigny, & Blier, 1999).
Therefore, overall, serotonergic neurons negatively regulate their own activity. Drugs that influence serotonin release, re-uptake, or serotonin receptors will have profound effects on the stability of this control. In general, serotonin’s postsynaptic effect is an inhibition with a short latency and a long duration. A pair of autoreceptors, 5-HT-1B and -1D, exist near the synapse on axon terminals and may regulate serotonin production and release in response to changes in the synaptic concentration of the transmitter. Exposure to the indole alkylamine d-lysergic acid diethylamide (LSD) also reduces raphe cell firing; however, the cognitive effects of this drug may far outlast the slowing of neuronal activity. The effects of LSD on serotonin neurons may only be the initial trigger that sets in motion a cascade of neural processes throughout the brain. Other hallucinogens, such as mescaline, do not affect raphe cell firing. Although significant
evidence suggests an important agonist action at the 5-HT-2A receptor (Gonzalez-Maeso et al., 2007) in the initiation of hallucinations by LSD, the drug also has a high affinity for at least eight of the known serotonin receptors. Psilocybin, o-phosphoryl-4-hydroxy-N,N-dimethyltryptamine, obtained from the mushroom Psilocybe mexicana, is less potent than LSD but likely shares aspects of its actions on serotonin receptors (Spinella, 2001). Psilocybin is the only known naturally occurring indole to contain phosphorus and is chemically related to bufotenine, an interesting molecule that has been discovered in the skin and glands of a South American toad, bean pods from the South American tree Piptadenia peregrina, and the leaves and bark of the Central American mimosa, Acacia niopo. Genetically altered mice and positron emission tomography (PET) studies on humans have been very useful in demonstrating the potential role of specific serotonin receptors in the regulation of mood and control of anxiety. Mice lacking the 5-HT-1A receptor show more anxiety-like behavior (Gobbi, 2005). In humans, ratings of religiosity and spirituality were inversely correlated with the number of 5-HT-1A receptors (Borg, Andree, Soderstrom, & Farde, 2003). The potential effects of alterations in serotonin neuronal function in relation to spiritual experiences are consistent with the observations of the effects of hallucinogenic drugs such as LSD, psilocybin, and mescaline. In addition, these results have broad implications for understanding and treating several psychiatric conditions. Drugs that stimulate the 5-HT-1A receptor have been shown to be clinically effective at reducing anxiety (Caliendo, Santagada, Perissutti, & Fiorino, 2005). Another common psychiatric disorder, depression, responds well to treatment with drugs that selectively block the re-uptake of serotonin into the presynaptic terminal.
Long-term treatment with these drugs, and the gradual onset of clinical benefit, is associated with the progressive down-regulation of the number of 5-HT-2C receptors (Millan, 2005). Recent evidence suggests that the clinical effectiveness of these antidepressant therapies may depend on their ability to induce neurogenesis within the hippocampus (Taylor, Fricker, Devi, & Gomnes, 2005).
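The dependence of tryptophan transport on the level of other amino acids in the blood, noted earlier in this section, can be illustrated with a simple competitive-transport calculation. The sketch below assumes a single saturable carrier shared by tryptophan and its competitors and uses the standard competitive-inhibition form of the Michaelis-Menten equation; the function name and every constant (vmax, km_uM, ki_uM, and the concentrations) are hypothetical placeholders chosen only for illustration and are not taken from this chapter.

```python
def tryptophan_uptake(trp_uM, competitors_uM, vmax=1.0, km_uM=15.0, ki_uM=100.0):
    """Relative carrier-mediated uptake of tryptophan under competitive inhibition.

    All constants are hypothetical placeholders, not measured values.
    """
    # Competing amino acids raise the apparent Km of the shared carrier.
    km_apparent = km_uM * (1.0 + sum(competitors_uM) / ki_uM)
    # Michaelis-Menten uptake evaluated with the apparent Km.
    return vmax * trp_uM / (km_apparent + trp_uM)


# Same plasma tryptophan, two different levels of competing amino acids.
low_competition = tryptophan_uptake(50.0, [100.0, 100.0])
high_competition = tryptophan_uptake(50.0, [300.0, 300.0])

print(f"relative uptake, low competitor levels:  {low_competition:.2f}")
print(f"relative uptake, high competitor levels: {high_competition:.2f}")
```

Under these assumptions, raising the competitors lowers tryptophan uptake even though the tryptophan concentration itself is unchanged, which is the qualitative point made in the text.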
ENDOCANNABINOIDS
The very high potency of exogenous cannabinoids and their stereochemical and structural requirements for binding to brain tissue predicted the discovery of the brain’s endogenous cannabinoid system. Fifteen years ago, the first endocannabinoid was discovered and named anandamide, from the Sanskrit word ananda meaning “internal bliss.” Anandamide and a second endogenous cannabinoid,
2-AG (2-arachidonoyl-glycerol), are enzymatically produced within the postsynaptic neural membrane (Wilson & Nicoll, 2002). Their production is dependent on the entry of calcium ions following membrane depolarization. Therefore, unlike most classical transmitters, endocannabinoids are produced in response to neural activity and released immediately without being stored in vesicles. The endocannabinoid receptor CB1 is linked to a G-protein that inhibits the activation of adenylate cyclase and the formation of cyclic AMP (Mackie, 2006). The CB1 receptor is the most abundant G-protein coupled receptor in the brain; its abundance and presynaptic location on glutamate and GABA-releasing terminals give an indication of its importance and potential role in the regulation of the brain’s principal excitatory and inhibitory neural systems. Anandamide inhibits the release of glutamate and acetylcholine within the cortex and hippocampus, an action that may underlie the effects of exogenous cannabinoids on memory. Anandamide also inhibits the release, and sometimes the re-uptake, of GABA; its complex actions within the basal ganglia and the presence of cannabinoid receptors in the cerebellum may underlie the ataxia occasionally seen in cannabis users (Hajos et al., 2000; Hajós, Ledent, & Freund, 2001; Tzavara, Wade, & Nomikos, 2003). Continued stimulation of CB1 receptors will produce the expected desensitization by G-protein uncoupling and the internalization of the receptor. Anandamide and 2-AG are inactivated by re-uptake into the presynaptic terminal by a specific transport protein and then hydrolyzed by specific enzymes (Fowler et al., 2005). Pharmacological antagonism of the endocannabinoid system may be therapeutic for major depressive disorders (Witkin, Tzavara, Davis, Li, & Nomikos, 2005). In addition, prolonged inhibition of the actions of these endocannabinoids may enhance the release of glutamate and increase the probability of excitotoxicity (Sang, Zhang, & Chen, 2007).
Clinical trials using an antagonist of CB1 receptors demonstrated a reliable effect on appetite regulation, providing help to overweight patients. Interestingly, due to the relative omnipresence of CB1 receptors in the human brain, the use of endocannabinoid receptor antagonists might also prove helpful in relieving addiction to alcohol and nicotine. Endocannabinoids are catabolized by cyclooxygenase, which raises an interesting concern regarding the influence of long-term treatment with anti-inflammatory drugs that inhibit this enzyme, such as aspirin or ibuprofen, on mood and memory. We have speculated that the apparently complex actions of cannabinoids within the brain may be interrelated via the function of glutamate (an amino acid neurotransmitter that will be discussed later). Stimulation of endocannabinoid receptors may reduce brain inflammation in young and aged animals by restoring the proper calcium influx via glutamate
receptor channels (Marchalant, Cerbai, Brothers, & Wenk, 2007). In addition to the role of the endocannabinoid system in neuroprotection and modulation of neuroplasticity and neurotoxicity, cannabinoid receptor stimulation may also modulate neurogenesis (Galve-Roperh, Aguado, Palazuelos, & Guzmán, 2007). Hippocampal neurogenesis declines with normal aging, and this correlates with the onset of depression (Olariu, Cleaver, & Cameron, 2007; Taylor et al., 2005); stimulation of the endocannabinoid system may provide clinical benefit by reversing this decline (Marchalant, Cerbai, Brothers, & Wenk, 2008).
AMINO ACIDS
The brain holds relatively high concentrations of amino acids, as compared to other body tissues, which are used primarily for protein and neurotransmitter synthesis. Although glucose is utilized extensively by the brain for energy, the brain does not use amino acids for gluconeogenesis. Neurons respond to the amino acid neurotransmitters with either excitation or inhibition. The principal neurotransmitters in this group include glutamate, aspartate, gamma-aminobutyric acid (GABA), glycine, and N-acetyl aspartate. Glutamate, aspartate, and N-acetyl aspartate are the major excitatory amino acid neurotransmitters, while GABA and glycine are the major inhibitory amino acid neurotransmitters (Siegel et al., 2006). Glutamate and GABA have similar patterns of innervation and occur in greater concentration than the other amino acids. The excitatory glutamate and inhibitory GABA transmitter molecules differ by only the presence of a carboxyl group, a situation mimicked in nature. For example, the excitatory ibotenic acid and inhibitory muscimol occur together in the Amanita muscaria mushroom, differing only by the presence of a carboxyl group, and have been used extensively as pharmacological tools to study brain physiology. Glutamatergic neurotransmission is mediated through ionotropic glutamate receptors such as alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA), kainate, and N-methyl-d-aspartate (NMDA) receptors. Additionally, glutamate activates G-protein coupled metabotropic glutamate receptors that are believed to have a more modulatory function. Most AMPA receptors are impermeable to calcium ions and contribute to fast synaptic transmission. In contrast, NMDA receptors are characterized by high permeability to calcium ions, voltage-dependent blockade by magnesium ions, and slower gating kinetics.
At normal resting potentials, the transmembrane electric field (negative on the inside of the cell) favors entry of positively charged Mg++ into the pore of
the receptor so that the NMDA channel is blocked. Under such resting conditions, NMDA receptors do not conduct ions. However, with sufficient, and actually rather significant, postsynaptic depolarization within the neuronal membrane surrounding the channel, the magnesium ion is no longer strongly attracted into the pore of the channel and dissociates. Under such depolarized conditions, NMDA receptors activated by synaptically released glutamate are able to allow the influx of sodium ions and, in particular, calcium ions, and contribute to postsynaptic excitation and activation of second messenger systems. These features make NMDA receptors quite suitable for mediating plastic changes in the brain, such as learning and cognition (Castner & Williams, 2007). However, these features may also contribute to neurotoxicity due to excessive unregulated calcium ion influx through the channel. If, due to a variety of factors that may contribute to many neurodegenerative diseases, the NMDA channels remain open (Wenk, Danysz, & Roice, 1996), it is possible that ambient levels of glutamate associated with normal synaptic activity could activate NMDA receptors, allowing excessive calcium ion influx, which, if sufficiently prolonged, may trigger a cascade of events leading to neuronal injury and death. Pharmacological blockade of NMDA receptors can produce protection from excessive calcium ion influx; however, if the channel blockade is too complete, this could lead to a profound loss of neuroplasticity. Memantine is more potent and slightly less voltage-dependent than magnesium and it may serve as a more effective surrogate for magnesium ions (Rogawski & Wenk, 2003). As a result of its somewhat less pronounced voltage-dependency, memantine is more effective than magnesium ions in blocking tonic pathological activation of NMDA receptors at moderately depolarized membrane potentials.
However, following strong synaptic activation, memantine, like magnesium ions, can leave the NMDA receptor channel with voltage-dependent, fast-unblocking kinetics. As a result, memantine suppresses synaptic noise but allows the relevant physiological synaptic signal to be detected. This provides both neuroprotection and symptomatic restoration of synaptic plasticity by one and the same mechanism (Danysz & Parsons, 2003). Antagonists that have “too high” affinity for the channel or “too little” voltage-dependence, such as dizocilpine (5-methyl-10,11-dihydro-5H-dibenzocyclohepten-5,10-imine maleate, MK-801), do not have a favorable therapeutic profile and produce numerous side effects because they essentially act as an irreversible plug of the NMDA receptor channel and block both pathological and physiological function. Deficits in energy metabolism associated with aging play an important role in the vulnerability of neurons in neurodegenerative diseases (Emerit, Edeas, & Bricaire,
2004; see Chapter 61). A defect in energy production would make neurons that express glutamatergic receptors more vulnerable to elevated synaptic levels of glutamate. Decreased levels of intracellular ATP would lead to a partial, and chronic, membrane depolarization, the relief of the voltage-dependent magnesium ion blockade at NMDA receptors, and a persistent increase in the influx of calcium ions into the cells; ultimately, the accumulation of intracellular calcium ions following the activation of NMDA receptors by glutamate would lead to neuronal death. Chronic neuroinflammation, oxidative stress, or impaired intracellular calcium buffering may also result in impaired energy production, possibly leading to impaired function of the membrane ion pumps required for maintenance of the resting potential. In any of these situations, excessive calcium ion influx through NMDA receptors could activate a host of calcium ion–dependent signaling pathways and stimulate nitric oxide production through closely associated neuronal nitric oxide synthase. This gaseous neurotransmitter, nitric oxide, can react with a superoxide anion to form peroxynitrite, which disintegrates into extremely toxic hydroxyl free radicals that can further impair mitochondrial function and energy production. Intracellular calcium may become concentrated within the postsynaptic mitochondria, further contributing to the impaired energy production within the region of the NMDA channels (Duchen, 2000). Mitochondrial dysfunction coupled with activation of glutamatergic receptors could underlie the selective vulnerability of neural systems during normal aging. Mitochondrial failure and neurochemical processes involving glutamate NMDA receptors in the presence of chronic neuroinflammation and oxidative stress may underlie the pathogenesis of many different neurodegenerative diseases (Barnham, Masters, & Bush, 2004; Wenk et al., 1996).
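The voltage-dependent magnesium block of the NMDA channel described above, and its relief by partial depolarization, can be made concrete with a short calculation. The sketch below uses a widely quoted empirical expression from the modeling literature (Jahr and Stevens, 1990) for the fraction of channels free of magnesium block; that expression, its constants, and the 1 mM extracellular magnesium concentration are assumptions brought in for illustration and do not come from this chapter.

```python
import math


def nmda_unblocked_fraction(v_mV, mg_mM=1.0):
    """Fraction of NMDA channels not blocked by Mg2+ at membrane potential v_mV.

    Empirical form attributed to Jahr and Stevens (1990); the constants
    3.57 mM and 0.062 per mV are from that fit, not from this chapter.
    """
    return 1.0 / (1.0 + (mg_mM / 3.57) * math.exp(-0.062 * v_mV))


# Resting potential, a moderate chronic depolarization, and strong depolarization.
for label, v in [("resting", -70.0), ("partially depolarized", -40.0), ("depolarized", 0.0)]:
    fraction = nmda_unblocked_fraction(v)
    print(f"{label:>22} ({v:6.1f} mV): unblocked fraction = {fraction:.3f}")
```

In this approximation only a small fraction of channels conducts near the resting potential, whereas a sustained depolarization of a few tens of millivolts, such as might follow reduced ATP availability, relieves a substantial part of the block.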
GABA and glycine bind to their respective receptors that are primarily chloride ion channels; the opening of these channels allows chloride ions to move across the membrane to achieve an equilibrium that makes the membrane resistant to depolarization. The receptors for these two inhibitory amino acid transmitters are structurally similar and may share a common evolutionary history. The glycine receptor is antagonized by strychnine; the loss of this important inhibitory system leads to excitation of the glycine-sensitive neurons principally within the ventral (motor) spinal cord. GABA has two, or possibly three, receptors. The best studied is the GABAA receptor which is the site of action of many popular drugs, including alcohol and the benzodiazepines (Martin & Olsen, 2000; Wild & Benzel, 1994). When these drugs bind to the GABAA receptor, they enhance the ability of GABA to stabilize the membrane, altering the balance between excitation by
glutamate and inhibition by GABA within local neuronal circuits. The GABAA receptor has been difficult to completely characterize because it is a multi-subunit, heteromeric ion channel: the receptor is essentially built using a combinatorial principle where the functional unit is not just multiple copies of a single polypeptide; rather, the channel is formed by many different polypeptides that contribute unique functional properties that can vary depending on its interactions with the other resident polypeptides. This multifunctional approach offers the opportunity for a more sophisticated and complex control that is common to other ionotropic receptors such as the NMDA glutamate and nicotinic acetylcholine receptors. Therefore, pharmacological manipulation of GABA receptor function has produced therapeutic benefit for a wide range of disorders, particularly anxiety and insomnia (Olsen, 2001).
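The statement that open GABA and glycine receptor channels let chloride ions move toward an equilibrium that resists depolarization can be illustrated with the Nernst equation. The sketch below is only a worked example: the intracellular and extracellular chloride concentrations and the temperature are typical textbook-style values assumed for illustration, and the function name is hypothetical.

```python
import math

R = 8.314    # gas constant, J/(mol*K)
F = 96485.0  # Faraday constant, C/mol


def nernst_mV(conc_out_mM, conc_in_mM, valence, temp_C=37.0):
    """Equilibrium (Nernst) potential in millivolts for a single ion species."""
    temp_K = temp_C + 273.15
    return 1000.0 * (R * temp_K) / (valence * F) * math.log(conc_out_mM / conc_in_mM)


# Assumed illustrative chloride concentrations (mM); chloride has valence -1.
e_cl = nernst_mV(conc_out_mM=120.0, conc_in_mM=7.0, valence=-1)
print(f"Chloride equilibrium potential: {e_cl:.1f} mV")
```

With these assumed concentrations the chloride equilibrium potential lies close to a typical resting potential, so opening chloride channels tends to hold the membrane near rest and to oppose excitatory depolarization.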
ADENOSINE
Adenosine is probably produced by all neurons in proportion to their firing rate (Siegel et al., 2006). It is the enzymatic product of ATP metabolism within the synapse. ATP can be stored within the synaptic vesicles and released with the neurotransmitter in an activity-dependent manner. Adenosine, in contrast, is not considered a classical neurotransmitter because it is not stored in vesicles and not released in quantal fashion. Adenosine acts as a local vasodilator within the extracellular space of the brain, providing a direct link between neuronal activity and regional blood flow. Extracellular levels must therefore be carefully controlled; adenosine is rapidly removed by re-uptake into cells and degraded to inosine by adenosine deaminase. Four adenosine receptor subtypes have been characterized; they are typical seven transmembrane-spanning G-protein coupled receptors that have been highly conserved throughout evolution and primarily produce a postsynaptic inhibition. Following tissue injury or the presence of inflammatory proteins, the extracellular level of adenosine increases and likely plays an important role in neuroprotection. Stimulation of the A2A subtype has an anti-inflammatory effect. A1 and A2A adenosine receptors are the most common types found in the brain, and these receptors are blocked by caffeine at doses that are likely achieved by a person drinking a cup of coffee. Brain inflammation, hypoxia, and ischemia are associated with increased extracellular levels of adenosine. Adenosine may regulate aspects of the brain’s inflammatory processes, including the release of pro-inflammatory cytokines and modulation of microglial activation (Rosi, McGann, Hauss-Wegrzyniak, & Wenk, 2004) via A1 and
A2A receptors, which may provoke neuroprotection or injury, respectively (Dunwiddie & Masino, 2001; Mayne et al., 2001); not surprisingly, the blockade of A2A receptors is neuroprotective (Kalda, Yu, Oztas, & Chen, 2006). Caffeine is a nonselective A1 and A2A adenosine receptor antagonist. Chronic caffeine intake may delay the onset of Alzheimer’s disease (Maia & de Mendonca, 2002) and Parkinson’s disease (Schwarzschild & Chen, 2002). Adenosine may direct both pro-inflammatory and anti-inflammatory functions, depending on which subtype of adenosine receptor and which type of inflammatory cell are involved and the duration of the neuroinflammation (Dunwiddie & Masino, 2001; Farber & Kettenmann, 2006). The mechanism underlying the effect of adenosine A1 and A2A receptor antagonism on the level of microglial activation is unknown. A1 receptors on glutamatergic terminals form heteromeric complexes with A2A receptors; this A1–A2A receptor heteromer may provide a mechanism by which adenosine can dose-dependently control glutamate release (Sitkovsky & Ohta, 2005). A2A receptors are located on glutamatergic terminals; A2A receptor antagonists can attenuate the release of glutamate in the presence of ischemia (Marcoli et al., 2003), possibly by acting on A2A receptors on astrocytes (Pintor et al., 2004). The reduced release of glutamate and the subsequent reduction in activation of the NMDA channel on neurons may underlie these effects of caffeine. Consistent with this hypothesis, we have recently shown that selective antagonism of NMDA receptors can reduce microglial activation in the dentate gyrus (Rosi et al., 2003; Wenk, Parson, & Danysz, 2006), suggesting an influence of adenosine receptors on microglial activation that might be linked to the modulation of glutamate synaptic transmission or neuronal activity.
The reduction in glutamatergic signaling may also contribute to the neuroprotective effects of caffeine (Kalda et al., 2006) and its therapeutic potential for preventing neurodegenerative diseases associated with neuroinflammation (Wenk & Hauss-Wegrzyniak, 2003; Wenk et al., 2006).
NEUROPEPTIDES
The tissue concentration of neuropeptides is usually about three orders of magnitude lower than that of the classical neurotransmitters, such as acetylcholine and the catecholamines (Siegel et al., 2006). The synthesis, postproduction modifications, and inactivation of neuropeptides stand in contrast to the way the brain metabolizes the neurotransmitters discussed thus far. Neuropeptides are initially produced as large precursor molecules called preproproteins that are processed to proproteins that are then further
enzymatically processed to the final chemically stable polypeptide product that is stored in the axon terminal in large, dense-core vesicles until released. Once released, neuropeptides are hydrolyzed in the extracellular space to individual amino acids or small polypeptides; these are further catabolized rather than being recycled by re-uptake and then reused. Therefore, the life cycle of neuropeptides is quite costly to the cell because the process is energetically inefficient. Neuropeptides commonly coexist with classical neurotransmitters within, and are released by, the same neuron; however, the rules governing their activity-dependent release may be different. For example, the release of the neuropeptide is usually from a different part of the terminal and may require a burst of high-frequency stimulation and the subsequent influx of greater numbers of calcium ions. In addition, neuropeptides often coexist with other, structurally unrelated, neuropeptides, particularly within the hypothalamus and brain stem regions. Some small polypeptide products of neuropeptide catabolism may be able to influence the interaction of coexisting neurotransmitters. Many neuropeptides found in invertebrates are insulin-like peptides, which suggests a shared evolutionary history, particularly since insulin has been found in protozoa, bacteria, fungi, invertebrates, and vertebrates. The neuropeptides growth hormone and prolactin may have diverged from a common ancestor about 350 million years ago (Miller et al., 1983). The primitive multicellular Hydra has a nervous system that uses neuropeptides as neurotransmitters, suggesting that neuropeptides were the first signaling molecules used by primitive nervous systems (Grimmelikhuijzen, Leviev, & Carstensen, 1996). Neuropeptides in primitive animals are constructed and stored in a manner similar to that found in vertebrates (Westfall, Sayyar, Elliott, & Grimmelikhuijzen, 1995); they produce excitatory or inhibitory actions when tested on vertebrate tissues.
Postsynaptically, neuropeptides are more likely to produce a slow, longer-lasting change in membrane conductance than the classical amine transmitters. Pharmacological manipulation of neuropeptide systems has been most successful with the endogenous opiate neurotransmitter receptors, primarily because Mother Nature provided the first example, morphine, for the subsequent guidance of synthetic chemists. The number and variety of neuropeptide receptor agonists and antagonists for nonopiate systems remain comparatively limited. Pharmacological manipulation of neuropeptide receptors is also complicated by the fact that active neuropeptide receptors are not confined to the synapse, and the presence of a neuropeptide receptor does not always correspond directly to the presence of the neuropeptide itself.
SUMMARY

Given the limitations of space, interested readers may want to consult some excellent texts on brain chemistry and pharmacology to supplement their knowledge (Cooper et al., 1996; Meyer & Quenzer, 2005; Siegel et al., 2006; Wild & Benzel, 1994).
REFERENCES

Albin, R. L., & Greenamyre, J. T. (1992). Alternative excitotoxic hypothesis. Neurology, 42, 733–738. Baetge, E. E., Suh, Y. H., & Joh, T. H. (1986). Complete nucleotide and deduced amino acid sequence of bovine phenylethanolamine N-methyltransferase: Partial amino acid homology with rat tyrosine hydroxylase. Proceedings of the National Academy of Sciences, USA, 83, 5454–5458. Barnham, K. J., Masters, C. L., & Bush, A. I. (2004). Neurodegenerative diseases and oxidative stress. Nature Reviews Drug Discovery, 3, 205–214. Boadle-Biber, M. C. (1993). Regulation of serotonin synthesis. Progress in Biophysics and Molecular Biology, 60, 1–15. Borg, J., Andree, B., Soderstrom, H., & Farde, L. (2003). The serotonin system and spiritual experiences. American Journal of Psychiatry, 160, 1965–1969. Caliendo, G., Santagada, V., Perissutti, E., & Fiorino, F. (2005). Derivatives as 5HT(1A) receptor ligands: Past and present. Current Medicinal Chemistry, 12, 1721–1753. Castner, S. A., & Williams, G. V. (2007). Tuning the engine of cognition: A focus on NMDA/D1 receptor interactions in prefrontal cortex. Brain and Cognition, 63, 94–122. Cooper, J. R., Bloom, F. E., & Roth, R. H. (2002). The biochemical basis of neuropharmacology (8th ed.). New York: Oxford University Press. Daiello, L. A., Galvin, J. E., & Wenk, G. L. (2005). A case-based approach to management of Alzheimer’s disease across the disease continuum. U.S. Pharmacist, 30, 2–5. Danysz, W., & Parsons, C. G. (2003). The NMDA receptor antagonist memantine as a symptomatological and neuroprotective treatment for Alzheimer’s disease: Preclinical evidence. International Journal of Geriatric Psychiatry, 18, S23–S32. Davis, K. L., Mohs, R. C., Marin, D., Purohit, D. P., Perl, D. P., Lantz, M., et al. (1999). Cholinergic markers in elderly patients with early signs of Alzheimer disease. Journal of the American Medical Association, 281, 1401–1406. Dong, J. M., De Montigny, C., & Blier, P. (1999). 
Assessment of the serotonin re-uptake blocking property of YM992: Electrophysiological studies in the rat hippocampus and dorsal raphe. Synapse, 34, 277–289. Duchen, M. R. (2000). Mitochondria and calcium: From cell signalling to cell death. Journal of Physiology, 529, 27–68. Dunwiddie, T. V., & Masino, S. A. (2001). The role and regulation of adenosine in the central nervous system. Annual Review of Neuroscience, 24, 31–35. Emerit, J., Edeas, M., & Bricaire, F. (2004). Neurodegenerative diseases and oxidative stress. Biomedical Pharmacotherapy, 58, 39–46. Farber, K., & Kettenmann, H. (2006). Purinergic signaling and microglia. Pflugers Archives of European Journal of Physiology, 452, 615–621. Fleming, W. W. (1999). Cellular adaptation: Journey from smooth muscle cells to neurons. Journal of Pharmacology and Experimental Therapeutics, 291, 925–931.
Fowler, C. J., Holt, S., Nilsson, O., Jonsson, K.-O., Tiger, G., & Jacobsson, S. O. P. (2005). The endocannabinoid signaling system: Pharmacological and therapeutic aspects. Pharmacology, Biochemistry and Behavior, 81, 248–262.
Olariu, A., Cleaver, K. M., & Cameron, H. A. (2007). Decreased neurogenesis in aged rats results from loss of granule cell precursors without lengthening of the cell cycle. Journal of Comparative Neurology, 501, 659–667.
Fujii, K., & Takeda, N. (1988). Phylogenetic detection of serotonin immunoreactive cells in the central nervous system of invertebrates. Comparative Biochemistry and Physiology, 89C, 233–239.
Olsen, R. W. (2001). GABA. In K. L. Davis, D. Charney, J. T. Coyle, & C. Nemeroff (Eds.), Neuropsychopharmacology: Fifth generation of progress (pp. 000–000). Philadelphia: Lippincott, Williams, & Wilkins.
Galve-Roperh, I., Aguado, T., Palazuelos, J., & Guzmán, M. (2007). The endocannabinoid system and neurogenesis in health and disease. Neuroscientist, 13, 109–114. Gobbi, G. (2005). Serotonin firing activity as a marker for mood disorders: Lessons from knockout mice. International Review of Neurobiology, 65, 249–272. Gonzalez-Maeso, J., Weisstaub, N. V., Zhou, M. M., Chan, P., Ivic, L., Ang, R., et al. (2007). Hallucinogens recruit specific cortical 5-HT2A receptor-mediated signaling pathways to affect behavior. Neuron, 53, 439–452. Grima, B., Lamouroux, A., Blanot, F., Biguet, N. F., & Mallet, J. (1985). Complete coding sequence of rat tyrosine hydroxylase mRNA. Proceedings of the National Academy of Sciences, USA, 82, 617–621. Grimmelikhuijzen, C. J. P., Leviev, I., & Carstensen, K. (1996). Peptides in the nervous systems of cnidarians: Structure, function and biosynthesis. International Review of Cytology, 167, 37–89. Hajós, M., Hajós-Korcsok, É., & Sharp, T. (1999). Role of the medial prefrontal cortex in 5-HT1A receptor-induced inhibition of 5-HT neuronal activity in the rat. British Journal of Pharmacology, 126, 1741–1750. Hajós, N., Katona, I., Naiem, S. S., Mackie, K., Ledent, C., Mody, I., et al. (2000). Cannabinoids inhibit hippocampal GABAergic transmission and network oscillations. European Journal of Neuroscience, 12, 3239–3249.
Pintor, A., Galluzzo, M., Grieco, R., Pezzola, A., Reggio, R., & Popoli, P. (2004). Adenosine A2A receptor antagonists prevent the increase in striatal glutamate levels induced by glutamate uptake inhibitors. Journal of Neurochemistry, 89, 152–156. Power, A. E., Vazdarjanova, A., & McGaugh, J. L. (2003). Muscarinic cholinergic influences in memory consolidation. Neurobiology of Learning and Memory, 80, 178–193. Rapport, M. M., Green, A. A., & Page, I. H. (1948). Serum vasoconstrictor (serotonin): IV. Isolation and characterization. Journal of Biological Chemistry, 176, 1243–1251. Rogawski, M., & Wenk, G. L. (2003). The neuropharmacological basis for memantine in the treatment of Alzheimer’s disease. CNS Drug Reviews, 9, 275–308. Rosi, S., McGann, K., Hauss-Wegrzyniak, B., & Wenk, G. L. (2003). The influence of brain inflammation upon neuronal adenosine A2B receptors. Journal of Neurochemistry, 86, 220–227. Sang, N., Zhang, J., & Chen, C. (2007). COX-2 oxidative metabolite of endocannabinoid 2-AG enhances excitatory glutamatergic synaptic transmission and induces neurotoxicity. Journal of Neurochemistry.
Hajós, N., Ledent, C., & Freund, T. F. (2001). Novel cannabinoid-sensitive receptor mediates inhibition of glutamatergic synaptic transmission in the hippocampus. Neuroscience, 106, 1–4.
Santarelli, L., Saxe, M., Gross, C., Surget, A., Battaglia, F., Dulawa, S., et al. (2003). Requirement of hippocampal neurogenesis for the behavioral effects of antidepressants. Science, 301, 805–809.
Heck, D. A., & Bylund, D. B. (1998). Differential down-regulation of alpha-2 adrenergic receptor subtypes. Life Sciences, 62, 1467–1472.
Sarter, M., Gehring, W. J., & Kozak, R. (2006). More attention must be paid: The neurobiology of attentional effort. Brain Research Reviews, 51, 145–160.
Kalda, A., Yu, L., Oztas, E., & Chen, J.-F. (2006). Novel neuroprotection by caffeine and adenosine A2A receptor antagonists in animal models of Parkinson’s disease. Journal of Neurological Sciences, 248, 9–15. Lindamood, W. (2005). Thorazine. Chemical Engineering News, 83, 126. Mackie, K. (2006). Cannabinoid receptors as therapeutic targets. Annual Review of Pharmacology and Toxicology, 46, 101–122. Maia, L., & de Mendonca, A. (2002). Does caffeine intake protect from Alzheimer’s disease? European Journal of Neurology, 9, 377–382. Marchalant, Y., Cerbai, F., Brothers, H., & Wenk, G. L. (2007). Cannabinoid receptor stimulation is anti-inflammatory and improves memory in old rats. Neurobiology of Aging. Marchalant, Y., Cerbai, F., Brothers, H., & Wenk, G. L. (2008). Cannabinoid receptor stimulation partially restores age-associated decline in neurogenesis in the hippocampus. Neuroscience. Marcoli, M., Raiteri, L., Bonfanti, A., Monopoli, A., Ongini, E., Raiteri, M., et al. (2003). Sensitivity to selective adenosine A1 and A2A receptor antagonists of the release of glutamate induced by ischemia in rat cerebrocortical slices. Neuropharmacology, 45, 201–210. Martin, D. L., & Olsen, R. W. (2000). GABA in the nervous system: The view at 50 years. Philadelphia: Lippincott, Williams, & Wilkins. Meyer, J. S., & Quenzer, L. F. (2005). Psychopharmacology: Drugs, the brain and behavior. Boston: Sinauer Associates. Millan, M. J. (2005). Serotonin 5-HT2C receptors as a target for the treatment of depressive and anxious states: Focus on novel therapeutic strategies. Therapie, 60, 441–460. Molina-Holgado, F., Hider, R. C., Gaeta, A., Williams, R., & Francis, P. (2007). Metals ions and neurodegeneration. Biometals, 20, 639–654.
Olton, D. S., Wenk, G. L., Church, R. M., & Meck, W. H. (1988). Attention and the frontal cortical cortex as examined by simultaneous temporal processing. Neuropsychologia, 26, 307–318.
Schwarzschild, M. A., Chen, J. F., & Ascherio, A. (2002). Caffeinated clues and the promise of adenosine A(2A) antagonists in PD. Neurology, 58, 154–160. Siegel, G. J., Albers, R. W., Brady, S. T., & Price, D. L. (2006). Basic neurochemistry: Molecular, cellular and medical aspects (7th ed.). New York: Elsevier. Sitkovsky, M. V., & Ohta, A. (2005). The ‘danger’ sensors that STOP the immune response, the A2 adenosine receptors? Trends in Immunology, 26, 299–304. Spinella, M. (2001). The psychopharmacology of herbal medicines. Cambridge: MIT Press. Stanley, J. A., Pettegrew, J. W., & Keshavan, M. S. (2000). Magnetic resonance spectroscopy in schizophrenia: Methodological issues and findings: Part I. Biological Psychiatry, 48, 357–368. Sulzer, D., Sonders, M. S., Poulsen, N. W., & Galli, A. (2005). Mechanisms of neurotransmitter release by amphetamines: A review. Progress in Neurobiology, 75, 406–433. Taylor, C., Fricker, A. D., Devi, L. A., & Gomes, I. (2005). Mechanisms of action of antidepressants: From neurotransmitter systems to signaling pathways. Cellular Signalling, 17, 549–557. Tzavara, E. T., Wade, M., & Nomikos, G. G. (2003). Biphasic effects of cannabinoids on acetylcholine release in the hippocampus: Site and mechanism of action. Journal of Neuroscience, 23, 9374–9384. Venter, J. C., Di Porzio, U., Robinson, D. A., Shreeve, S. M., Lai, J., Kerlavage, A. R., et al. (1988). Evolution of neurotransmitter receptor systems. Progress in Neurobiology, 30, 105–169.
Wenk, G. L. (1997). The nucleus basalis magnocellularis cholinergic system: 100 years of progress. Neurobiology of Learning and Memory, 67, 85–95. Wenk, G. L. (2003). Neurotransmitters. In L. Nadel (Ed.), Encyclopedia of cognitive science (pp. 2414–2421). London: Nature Publishing, Macmillan Press. Wenk, G. L. (2006). Neuropathologic changes in Alzheimer’s disease: Potential targets for treatment. Journal of Clinical Psychiatry, 67, 3–7. Wenk, G. L., Danysz, W., & Roice, D. D. (1996). The effects of mitochondrial failure upon cholinergic and glutamatergic toxicity within the nucleus basalis. NeuroReport, 7, 1453–1456. Wenk, G. L., & Hauss-Wegrzyniak, B. (2001). Animal models of chronic neuroinflammation as a model of Alzheimer’s disease. In S. Bondy & A. Campbell (Eds.), Inflammatory events in neurodegeneration (pp. 83–87). Scottsdale: Prominent Press. Wenk, G. L., & Hauss-Wegrzyniak, B. (2003). Chronic intracerebral LPS as a model of neuroinflammation. In P. L. Wood (Ed.), Neuroinflammation: Mechanisms and management (2nd ed., pp. 137–150). Totowa, NJ: Humana Press. Wenk, G. L., Parsons, C., & Danysz, W. (2006). Potential role of NMDA receptors as executors of neurodegeneration resulting from diverse insults: Focus on memantine. Behavioral Pharmacology, 17, 411–424.
Westfall, J. A., Sayyar, K. L., Elliott, C. F., & Grimmelikhuijzen, C. J. P. (1995). Ultrastructural localization of Antho-RWamides I and II at neuromuscular synapses in the gastrodermis and oral sphincter muscle of the sea anemone Calliactis parasitica. Biological Bulletin, 189, 280–287. Whitehouse, P. J., Price, D. L., Clark, A. W., Coyle, J. T., & DeLong, M. R. (1981). Alzheimer disease: Evidence for selective loss of cholinergic neurons in the nucleus basalis. Annals of Neurology, 10, 122–126. Wild, G. C., & Benzel, E. C. (1994). Essentials of neurochemistry. Boston: Jones and Bartlett. Wilder, J. (1958). Modern psychophysiology and the law of initial value. American Journal of Psychotherapy, 12, 199–207. Wilson, R. I., & Nicoll, R. A. (2002). Endocannabinoid signaling in the brain. Science, 296, 678–682. Witkin, J., Tzavara, E. T., Davis, R. J., Li, X., & Nomikos, G. G. (2005). A therapeutic role for cannabinoid CB1 receptor antagonists in major depressive disorders. Trends in Pharmacological Sciences, 26, 609–617. Wurtman, R. J., & Wurtman, J. J. (1995). Brain serotonin, carbohydrate-craving, obesity and depression. Obesity Research, 3, S477–S480.
Chapter 6
Neuroendocrinology: Mechanisms by Which Hormones Affect Behaviors

DONALD W. PFAFF, MARC TETEL, AND JUSTINE SCHOBER
PRINCIPLE 1: IT IS POSSIBLE TO DISCERN MECHANISMS BY WHICH HORMONES AFFECT MAMMALIAN BEHAVIORS
The field of work that discovers how endocrine signaling agents affect the brain in order to regulate behavior has become highly developed during the past 50 years. As a result, it is possible to state principles of hormone/behavior relations and their mechanisms, rather than simply presenting a compendium of facts. For the latter, consult Hormones, Brain and Behavior, 5 volumes and 4,100 pages long, written by over 100 experts. Here we will try to make points that relate to the general features of neuroendocrine systems relevant for behavioral science. Most of our examples come from reproductive neuroendocrine mechanisms because progress has been most rapid in this area for four reasons: (1) Mating behaviors are specific, well-defined, and easily studied in the laboratory in their natural form; (2) the molecular biology of steroid sex hormones is best understood; (3) the stimuli eliciting these behaviors are relatively simple; and (4) the motoric responses themselves are stereotyped, do not need to be learned, and are of relatively simple topography. Another very well-developed area of work covers mechanisms of stress hormones. For that topic, see Chapter 62. We have organized several principles of neuroendocrine mechanisms related to behavior, stated in the most general form we feel is justifiable. While this chapter is restricted in scope in order to maintain coherence, the range of hormone/behavior phenomena is breathtaking. Hormones do not “cause” behaviors; they alter the probabilities of given responses to fixed stimuli. The probability of a given response may be raised or lowered by hormone treatment. Further, the behavioral response to a hormone can depend on stage of development and environmental context. That said, the general principles discussed next remain true under a very wide variety of circumstances.
The proof that it is possible to work out detailed neuroanatomical, neurophysiological, and genomic mechanisms in a neuroendocrine system that regulates behavior came in four steps.

1. The localization of hormone target neurons in the brain was determined, and estrogen-binding neurons in a limbic/hypothalamic system were discovered. The discovery was initially made in rat brain (Figure 6.1), but work on the CNS of species from fish through monkey showed it to be a general vertebrate system. The neuroanatomical research was followed up by histochemical findings that demonstrated consequences of hormone binding for electrophysiological activity and neuronal growth.
2. The neural circuit for a hormone-dependent vertebrate behavior (Figure 6.2), the estrogen-dependent lordosis behavior (Pfaff, 1980), was worked out.
3. We found hormone-dependent genes in the brain (Figure 6.3). Their induction by estrogenic hormones has temporal, spatial, and gender specificities appropriate to reproductive behavior.
4. In turn, the products of some of these hormone-dependent genes are required for hormone-dependent lordosis behavior (Pfaff, 1999).

Taken together, these four findings showed that specific neurochemical reactions in specific parts of the brain determine a specific mammalian behavior.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Figure 6.1 Two schematic sagittal sections of the rat brain, viewed from the left side. The black dots indicate the distribution of nerve cells expressing estrogen receptors. Note. From Drive: Neurobiological and Molecular Mechanisms of Sexual Motivation, by D. W. Pfaff, 1999, Cambridge, MA: MIT Press. Reprinted with permission.

PRINCIPLE 2: THESE MECHANISMS WORK AT SEVERAL OVERLAPPING LEVELS OF CELLULAR FUNCTION, COVERING A WIDE RANGE OF PHYSICAL DIMENSIONS

The four steps listed above take a problem from genomic transcriptional alterations to physiological alterations to mammalian behavior. Starting with mechanisms at the level of the very smallest physical dimensions, it is known that lordosis behavior depends on estrogen receptor-alpha, but not on its gene duplication product, estrogen receptor-beta. The differences between the ligand-binding lipophilic pits of these two hormone receptors are measured in angstrom units. Hormone-dependent currents in the ventromedial hypothalamic neurons at the top of the circuit for lordosis behavior (Figure 6.2) are carried by ions such as sodium and potassium. Hormone-dependent gene expression is understood down to the single DNA nucleotide level, where the estrogen receptors bind to specific nucleotide sequences, estrogen response elements, on the chromosome. At the biochemical level, neuroendocrinologists understand many of the cellular pathways involved in transducing effects of sex hormones into cellular changes that underlie sex behavior. For example, estrogens not only affect the gene for the neuropeptide oxytocin, but also increase transcription for the oxytocin receptor (Figure 6.3). Thus, the neuropeptide, in order to be behaviorally active, must leave the rough endoplasmic reticulum where it is synthesized, be directed toward the axon hillock, travel down the axon, be released from the presynaptic ending, and bind to its receptor. Likewise, molecular neuroendocrinologists have worked out some of the protein phosphorylation cascades that mediate membrane-initiated effects of steroid sex hormones (Rønnekleiv & Kelly, 2005). At the level of neuronal circuitry, the production of lordosis behavior is understood from the lumbosacral spinal cord up to the hypothalamus (Figure 6.2). The primary role of the telencephalon (e.g., the preoptic area, the septum, and the amygdala) is to inhibit lordosis behavior. What about the local environment of the female, and how might it affect lordosis? First and most obvious, signals from a reproductively competent conspecific are required. Among rodents, testosterone-dependent signals from a potential mating partner are likely to be carried by the olfactory or vomeronasal systems. Other animals might use other sensory systems, with the greatest variety and subtlety being used by the species with the highest capacity for information transfer, humans. In addition, however, the local environment has permissive and suppressive effects. The environment must afford the basis for
adequate nutrition of the female. Otherwise (a) she will not ovulate, and (b) electrical activity in her ventromedial hypothalamic neurons will not be high enough to trigger the rest of the lordosis behavior circuitry. The environment may also be a source of marked stress, which would inhibit lordosis behavior both directly and indirectly, the latter being shown by the female not leaving her home nest to engage in courtship behaviors that attract males. At the level of social behaviors, neuroendocrinologists have worked out how courtship behaviors by the female laboratory rat lead to successful mounting by the male and subsequent lordosis by the female. The female runs forward very rapidly and then, equally suddenly, brakes to a full stop. First, this braking tenses her leg and postural muscles so that she is braced and ready to support the weight of the much larger male. Second, that very high degree of muscular tension itself primes lordosis behavior circuitry. Third, that topography of movement by the female causes the male that is following to bump into the female in the correct position for mounting, so that even an inexperienced male will mount the braked female properly (Figure 6.4). Finally, reproduction in many animals is seasonal. It is important for offspring to be born in seasons that offer adequate food supplies. Since the distance to the sun is roughly 150 million kilometers (about eight light-minutes), this is the largest dimension bearing on neuroendocrine mechanisms regulating behavior. It is important to emphasize that all of these levels of mechanisms, from molecular reactions and ion flows measured in angstrom units to seasonal fluctuations dependent on day length, must work together for neuroendocrine regulation of reproduction to work in a biologically adaptive fashion; that is, for reproduction to occur successfully, but not to be attempted when it would be fruitless or dangerous.

Figure 6.2 Drawing of the basic working circuit for the production of lordosis behavior in female quadrupeds. Note. The circuitry is bilaterally symmetric and is plotted on just one side for visual clarity. From Drive: Neurobiological and Molecular Mechanisms of Sexual Motivation, by D. W. Pfaff, 1999, Cambridge, MA: MIT Press. Adapted with permission.
[Diagram labels: Hypothalamic Module (medial preoptic, medial anterior hypothalamus, ventromedial nucleus of hypothalamus; estradiol); Midbrain Module (midbrain central gray, midbrain reticular formation); Lower Brainstem Module (lateral vestibular nucleus, medullary reticular formation); Spinal Module (spinal cord; lateral vestibulospinal and reticulospinal tracts; dorsal roots L1, L2, L5, L6, S1; lateral longissimus and transversospinalis). Stimuli: pressure receptors; flanks; skin of rump, tailbase, perineum. Output: lordosis response.]

Figure 6.3 List of genes discovered to have two properties: that estrogens (E), having bound to estrogen receptors (ER), elevate their mRNA transcript levels; and that their gene products foster female reproductive behaviors. Note. The exception is prostaglandin D synthase, where hormones work to foster the behavior by disinhibition. From Drive: Neurobiological and Molecular Mechanisms of Sexual Motivation, by D. W. Pfaff, 1999, Cambridge, MA: MIT Press. Adapted with permission.
[Diagram labels: E binds ER-α and ER-β. Genes turned on (in hypothalamus): rRNA and growth; progesterone receptor; nitric oxide synthase; adrenergic α1 receptor; muscarinic receptors; enkephalin × opioid receptors; oxytocin × oxytocin receptor. (In preoptic area): GnRH × GnRH receptor; prostaglandin-D synthase. Outcome: female reproductive behaviors.]

Figure 6.4 Feed-forward mechanisms in reproductive behavior help to guarantee that reproductively competent conspecifics mate. Note. Signaling among hamsters over a long time period (A) and between rats over a short time period (B) has this character. In both cases, the “hormone-dependent behavioral funnels” help to move males and females into the same place at the same time so that they can mate. From Pfaff, Kow, Loose, and Flanagan-Cato (in preparation). Adapted with permission.
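The range of physical dimensions named in this principle can be made concrete with a little arithmetic. The short Python sketch below compares one angstrom with the mean distance from the earth to the sun; the constants are standard physical values rather than figures taken from this chapter, and the function name is only illustrative.

```python
import math

# Standard physical constants (not taken from the chapter itself).
ANGSTROM_M = 1e-10              # one angstrom, in meters
EARTH_SUN_M = 1.496e11          # mean earth-sun distance (about 1 AU), in meters
LIGHT_SPEED_M_S = 299_792_458   # speed of light, in meters per second


def orders_of_magnitude(small_m: float, large_m: float) -> float:
    """Return how many powers of ten separate two lengths given in meters."""
    return math.log10(large_m / small_m)


# Span between receptor-scale and solar-system-scale dimensions.
span = orders_of_magnitude(ANGSTROM_M, EARTH_SUN_M)

# Light travel time from the sun to the earth, expressed in minutes.
light_minutes = EARTH_SUN_M / LIGHT_SPEED_M_S / 60

print(f"Span: about {span:.1f} orders of magnitude")
print(f"Earth-sun distance: about {light_minutes:.1f} light-minutes")
```

On these assumptions the levels of mechanism discussed here span a little more than 21 orders of magnitude in length.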
PRINCIPLE 3: MOLECULAR ASPECTS OF HORMONE ACTION IN THE BRAIN USE THE SAME TYPES OF BIOCHEMISTRY AS IN OTHER HORMONE-DEPENDENT ORGANS

The basic features of steroid hormone actions on neuroendocrine cells are indistinguishable from those discovered in tissues outside the brain. Nuclear receptors represent a superfamily of transcriptional activators that can be divided into subfamilies based on phylogenetic analysis (Evans, 1988; Mangelsdorf et al., 1995; Tsai & O’Malley, 1994). The classic steroid receptors represent the type I subfamily and include receptors for estrogens, progestins, androgens, glucocorticoids, and mineralocorticoids. Receptors for thyroid hormone, vitamin D3, all-trans retinoic acid, and 9-cis retinoic acid comprise the type II receptors. The third subfamily includes the orphan nuclear receptors, which have no known ligands (Benoit et al., 2006). Although this discussion focuses on ligand-dependent genomic mechanisms of action of the type I steroid receptors, studies are revealing an increasing role for nongenomic mechanisms via membrane-associated receptors in steroid action (Lange, 2007; Vasudevan & Pfaff, 2007). Mutagenesis studies reveal that steroid receptors share a modular domain structure (Evans, 1988; Figure 6.5). The amino terminal domain is highly variable and contains an activation function (AF-1) that regulates the level and specificity of activation of target genes (Tora, Gronemeyer, Turcotte, Gaub, & Chambon, 1988). The centrally located and conserved DNA-binding domain (DBD) contains two zinc fingers to facilitate receptor binding to DNA (Freedman & Luisi, 1993). The flexible hinge region is important in dimerization (Tetel et al., 1997) and in some receptors contains nuclear localization sequences (Ylikomi, Bocquel, Berry, Gronemeyer, & Chambon, 1992). The carboxyl-terminal ligand-binding domain (LBD) contains another activation function (AF-2) and is essential for ligand-dependent activation, dimerization, and binding of many cofactors (Lees, Fawell, & Parker, 1989; Oñate, Tsai, Tsai, & O’Malley, 1995; Shiau et al., 1998; Tetel et al., 1997). Some of the steroid receptors exist in two forms. For example, ERα and ERβ are transcribed from different genes (Kuiper, Enmark, Pelto-Huikko, Nilsson, & Gustafsson, 1996), while the full-length PR-B and truncated PR-A are encoded by the same gene (Kastner et al., 1990). In both cases, the ER (Nomura et al., 2006) and PR
(Mani, Reyna, Chen, Mulac-Jericevic, & Conneely, 2006) subtypes have different biological functions.

Figure 6.5 Modular domain structure of steroid receptors. Note. AF = activation function; DBD = DNA-binding domain; h = hinge region; LBD = ligand-binding domain; NTD = N-terminal domain.
[Diagram labels, from N terminus to C terminus: NTD, DBD, h, LBD; activation domains AF-1 and AF-2.]

Since the discovery of estrogen receptors almost 4 decades ago (Gorski, Toft, Shyamala, Smith, & Notides, 1968; Jensen et al., 1968), a variety of in vitro and cell culture studies have elucidated much about the molecular mechanisms of steroid receptor action. The classic, ligand-dependent, genomic mechanism of action of steroid receptors is shown in Figure 6.6. In the absence of ligands, inactive steroid receptors are bound to heat-shock proteins and other immunophilins (Pratt, Galigniana, Morishima, & Murphy, 2004). Upon binding hormone, steroid receptors undergo a conformational change that causes dissociation of these heat-shock proteins and immunophilins, which allows receptors to dimerize (DeMarzo, Beck, Oñate, & Edwards, 1991). Activated receptor dimers bind preferentially to steroid response elements (SREs) in the promoter region of steroid-responsive target genes (Beato & Sánchez-Pacheco, 1996; Evans, 1988). These SREs consist of partial palindromic hexanucleotide sequences that are separated by an invariant three-nucleotide spacer (Beato & Sánchez-Pacheco, 1996). Binding of receptors to DNA increases or decreases gene transcription by altering the rate of recruitment of general transcription factors and influencing the recruitment of RNA polymerase II to the initiation site (Kininis et al., 2007; Klein-Hitpass et al., 1990). It is thought that steroids elicit many of their biological effects in the brain by acting through their respective receptors to alter neuronal gene transcription, via mechanisms similar to those described previously, and thereby cause changes in hormone-dependent behavior and physiology (Blaustein & Mani, 2006; Pfaff, 1997). For example, estrogens elevate transcription from the genes that encode the progesterone
receptor and the opioid peptide enkephalin in a manner that shows the neuroanatomical specificity, the temporal features, and the sex dimorphism that permit those gene products to foster the female sex behavior lordosis in the female but not the male (Pfaff, 1999).

Figure 6.6 Ligand-dependent genomic mechanism of action of steroid receptors. Note. CBP = CREB-binding protein; hsp = heat-shock proteins; p/CAF = p300/CBP-associated factor; Pol II = RNA polymerase II; SR = steroid receptor; SRCs = steroid receptor coactivator family (p160s); SRE = steroid response element.
[Diagram labels: steroid; inactive steroid receptor with hsp; active receptor dimers; steroid response element; SRCs, CBP, p/CAF; Pol II; expression of steroid-induced genes; cytoplasm; nucleus.]

Coactivators of Steroid Receptors

In contrast to the historical perspective of nuclear receptors acting alone by binding to their appropriate response elements on DNA, contemporary research reveals a wide set of nuclear proteins, called coactivators, that modulate behavioral and other physiological actions of hormones according to context. A critical component of efficient steroid receptor-mediated transcription is the recruitment of nuclear receptor coactivators, which dramatically enhance transcriptional activity (Lonard & O’Malley, 2006; O’Malley, 2006; Rosenfeld, Lunyak, & Glass, 2006). Under most conditions, steroid receptors interact with coactivators in the presence of an agonist, but not in the absence of ligands or in the presence of an antagonist (McInerney, Tsai, O’Malley, & Katzenellenbogen, 1996; Oñate et al., 1995; Shiau et al., 1998; Tanenbaum, Wang, Williams, & Sigler, 1998; but see also Dutertre & Smith, 2003; Oñate et al., 1998; Webb et al., 1998). It has been proposed that nuclear receptor coactivators influence receptor-mediated transcription through a variety of mechanisms, including acetylation of histones, methylation, phosphorylation, and chromatin remodeling (Lonard & O’Malley, 2006; Rosenfeld et al., 2006). The first steroid receptor coactivator to be cloned was steroid receptor coactivator-1 (SRC-1/NCoA-1; Oñate et al., 1995), which was later found to be a member of a larger family of p160 proteins that includes SRC-2 (also known as GRIP1, TIF2, and NCoA-2; Voegel, Heine, Zechel, Chambon, & Gronemeyer, 1996) and SRC-3 (AIB1, TRAM-1, p/CIP, ACTR, RAC3; Anzick et al., 1997).
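The response-element layout described under Principle 3 (two hexanucleotide half-sites forming a partial palindrome around a three-nucleotide spacer) can be illustrated with a short Python sketch. The half-site AGGTCA used here is the commonly cited consensus for the estrogen response element; it is an assumption brought in for illustration, not a sequence given in this chapter, and the function names are hypothetical. The scan reads one strand only and tolerates a chosen number of mismatches, in the spirit of "partial" palindromes; it is not a substitute for a proper motif-finding tool.

```python
# Illustrative sketch: scan a DNA string for sites shaped like
# half-site + 3-nt spacer + reverse complement of the half-site.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def find_response_elements(seq, half_site="AGGTCA", spacer=3, max_mismatches=0):
    """Return (start, site, mismatches) for each window that fits the layout.

    half_site is an assumed consensus (AGGTCA); any three bases are
    accepted in the spacer.
    """
    seq = seq.upper()
    right_half = reverse_complement(half_site)
    n = len(half_site)
    width = 2 * n + spacer
    hits = []
    for start in range(len(seq) - width + 1):
        left = seq[start:start + n]
        right = seq[start + n + spacer:start + width]
        mismatches = (count_mismatches(left, half_site)
                      + count_mismatches(right, right_half))
        if mismatches <= max_mismatches:
            hits.append((start, seq[start:start + width], mismatches))
    return hits


# A perfect site embedded in flanking sequence, and an imperfect one.
perfect = find_response_elements("TTTAGGTCACAGTGACCTAAA")
imperfect = find_response_elements("AGGTCACAGTGACCC", max_mismatches=1)
print(perfect)
print(imperfect)
```

Raising `max_mismatches` is one simple way to admit imperfect sites; how many mismatches a real receptor tolerates is an empirical question that this sketch does not address.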
The SRC family of coactivators physically interacts with steroid receptors, including ER and PR, in a ligand-dependent manner (Oñate et al., 1995; Lonard & O’Malley, 2006; Rosenfeld et al., 2006). In cell culture, hormone-induced transactivation of PR is reduced by coexpression of ER, presumably due to squelching or sequestering of shared coactivators. This squelching effect can be reversed by over-expression of SRC-1, suggesting that coactivators are a limiting factor necessary for full transcriptional activation of receptors (Oñate et al., 1995). In further support of this concept, over-expression of SRC-1 relieves thyroid hormone receptor inhibition of ER-mediated transcription in a neuroendocrine model (Vasudevan et al., 2001). It has
been suggested that the SRC family of coactivators acts as a platform to allow the recruitment of other coactivators, including CREB-binding protein (CBP) and p300/CBP-associated factor (p/CAF), that possess histone acetyltransferase activity and aid in chromatin remodeling (Kamei et al., 1996; McKenna, Nawaz, Tsai, Tsai, & O’Malley, 1998; Smith, Oñate, Tsai, & O’Malley, 1996; Figure 6.2). Finally, cell culture studies provide evidence that steroid receptor-mediated recruitment of distinct coactivators (e.g., SRC-1 versus SRC-2) can result in different chromatin modifications and modulate the transcription of steroid-responsive genes (Li, Wong, Tsai, & O’Malley, 2003).
Steroid Receptor Coactivator Function in the Brain
While much is known about the molecular mechanisms of nuclear receptor coactivators from a variety of cell culture studies (Lonard & O’Malley, 2006; O’Malley, 2006; Rosenfeld et al., 2006), we are just beginning to understand their role in hormone action in brain (Molenda, Kilts, Allen, & Tetel, 2003). SRC-1 mRNA and protein are expressed at high levels in the cortex, hypothalamus, and hippocampus of rodents (Auger, Tetel, & McCarthy, 2000; Martinez de Arrieta, Koibuchi, & Chin, 2000; Meijer, Steenbergen, & de Kloet, 2000; Misiti, Schomburg, Yen, & Chin, 1998; Molenda, Griffin, Auger, McCarthy, & Tetel, 2002; Ogawa, Nishi, & Kawata, 2001; Shearman, Zylka, Reppert, & Weaver, 1999) and birds (Charlier, Lakaye, Ball, & Balthazart, 2002). In order for coactivators to function with steroid receptors, they must be expressed in the same cells. Indeed, SRC-1 is expressed in the majority of estrogen-induced PR cells in reproductively relevant brain regions, including the VMN, medial preoptic area, and arcuate nucleus (Tetel, Siegal, & Murphy, 2007).
The expression of the SRC family of coactivators in brain appears to be regulated by a variety of factors, including hormones (Camacho-Arroyo, Neri-Gomez, Gonzalez-Arenas, & Guerra-Araiza, 2005; Charlier, Ball, & Balthazart, 2006; Iannacone, Yan, Gauger, Dowling, & Zoeller, 2002; Maerkel, Durrer, Henseler, Schlumpf, & Lichtensteiger, 2007; McGinnis, Lumia, Tetel, Molenda-Figueira, & Possidente, 2007; Mitev, Wolf, Almeida, & Patchev, 2003; Ramos & Weiss, 2006), day length (Tetel, Ungar, Hassan, & Bittman, 2004), and stress (Bousios, Karandrea, Kittas, & Kitraki, 2001; Charlier et al., 2006; Meijer, van der Laan, Lachize, Steenbergen, & de Kloet, 2006). More recently, the function of nuclear receptor coactivators in hormone action in brain and behavior has been
investigated. One clever set of experiments used “antisense” DNA oligomers. These are short sequences of DNA, usually 15 to 30 nucleotide bases, that are complementary to a chosen sequence in the messenger RNA being targeted. They ruin that messenger RNA’s function by two mechanisms: by rendering the messenger RNA too bulky to fit into the ribosome to be translated, and by rendering the messenger RNA susceptible to breakdown by a degrading enzyme. In the developing rodent brain, antisense to SRC-1 reduced masculinization of the sexually dimorphic nucleus, indicating that SRC-1 is involved in hormone-mediated sexual differentiation of the brain (Auger et al., 2000). This is important because this nucleus is one of a set of structures that show hormonally and developmentally dependent sexual dimorphisms and may play an important role in the sexual differentiation and physiological regulation of brain and behavior. In the adult brain, SRC-1 and SRC-2 function in the VMN to modulate ER-mediated transactivation of the behaviorally relevant PR gene (Apostolakis, Ramamurphy, Zhou, Oñate, & O’Malley, 2002; Molenda et al., 2002). In addition, SRC-1 acts in the VMN to regulate both ER- and PR-dependent aspects of female sexual behavior in rats (Molenda-Figueira et al., 2006). In the adult quail brain, SRC-1 modulates hormone-dependent gene expression, brain plasticity, and behavior (Charlier, Ball, & Balthazart, 2005). Finally, the p160 coactivators function in glucocorticoid receptor action in glial cells (Grenier et al., 2005). Taken together, these findings indicate that the SRC family of coactivators has profound effects on hormone action in brain and the regulation of behavior. The mechanisms by which steroids act in a tissue-specific manner constitute a fundamental issue in steroid hormone action. The field of neuroendocrinology is poised to make dramatic gains in understanding how steroids regulate gene expression in brain.
Recent investigations indicate that, in addition to the bioavailability of hormone and receptor levels, nuclear receptor coactivators are critical regulatory molecules in hormone-dependent activation of genes in the brain and the regulation of behavior.
PRINCIPLE 4: GENES CODING FOR HORMONE RECEPTORS DO NOT DRIVE BEHAVIOR DIRECTLY; THEY ARE MODULATED BY SEVERAL TYPES OF ORGANISMIC AND ENVIRONMENTAL FACTORS
For sex behavior and for aggressive behaviors, factors such as the gender of the animal, the age of the animal, and the nature of the aggressive encounter may alter, and even reverse(!), the nature of the causal relation between a hormone-dependent gene and the behavior (Ogawa, Choleris, & Pfaff, 2004). Here are three examples:
1. Gender: Deleting the ER-α gene permanently in a male mouse abolishes aggression (Ogawa, Washburn, Taylor, Lubahn, Korach, & Pfaff, 1998). However, in a female mouse, the same type of mutation increases aggression (Ogawa, Eng, et al., 1998). The results for the two sexes are opposite. Now consider deletion of the gene coding for ER-β. In males, such a gene knockout can increase aggression, but in female mice, knocking out ER-β reduces testosterone-facilitated aggression. From these two sets of contrasts between male and female mice, we infer that the effect of a given gene on aggression depends on the gender in which that gene is expressed.
2. Age: With respect to aggressive behaviors, the magnitude of the phenotype in the ER-β knockout male mouse declines with age (Nomura et al., 2002). The strongest increase in aggression consequent to an ER-β gene deletion is just after puberty. Although the mechanism for this is still obscure, it is clear that the effect of a specific gene on aggressive behaviors can depend on the age of the animals at which the behavioral assay is conducted.
3. Nature of the aggression: During the resident-intruder paradigm for testing aggression, ER-α knockout female mice display high levels of aggression toward female intruders. Their aggression persists well beyond that shown by their wildtype littermate controls, whether or not the intruder is a female mouse treated with estrogens and progestins. In contrast, if the intruder is an olfactory bulbectomized male mouse, the alpha-ERKO female’s aggression is at a very low level, no different from the wildtype female mouse’s. This comparison shows that the effect of a specific gene on aggressive behavior can depend on the nature of the opponent.
Female mice tested for their vigor in nest defense showed a marked effect of the null deletion of ER-β. At the beginning of each test, beta-ERKOs showed a much greater number of attacks than wildtype controls. On the other hand, when tested for testosterone-facilitated aggression, beta-ERKOs responded with significantly less frequent aggression for a given dose of testosterone. This comparison suggests that the effect of a specific gene on aggressive behavior can depend on the type of aggression tested.
PRINCIPLE 5: BEHAVIORALLY RELEVANT HORMONE-SENSITIVE GENES GOVERN FUNCTIONAL MODULES FOR THE BIOLOGICALLY ADAPTIVE REGULATION OF BEHAVIOR
Because the genes induced by sex hormones and important for lordosis behavior are not all of the same type (they include growth-related genes, a transcription factor, genes for neurotransmitter receptors, and neuropeptides and their receptors; Figure 6.3), a different kind of organizing principle had to be envisioned. Mong and Pfaff (2004) have conceived of functional modules governed by estrogens acting in the brain. The easiest to understand are the direct effects, from gene induction to neural circuit to behavioral change. Hormone effects on neurotransmitter receptors in ventromedial hypothalamic neurons directly trigger the rest of the lordosis circuit to operate, using the following modules. Noradrenergic α-1b receptors, associated with generalized CNS arousal, are induced by estrogen treatment in ventromedial hypothalamic (VMH) cells, which govern the rest of the lordosis behavior circuit. Noradrenergic (NA) ascending afferents come into the VMH from the ventral noradrenergic bundle, which originates in arousal-related neuronal groups A1 and A2, and signal heightened arousal upon stimulation from the male. In biophysical studies, directly applied NA increases the electrical activity of VMH neurons. These VMH neurons are at the top of the lordosis behavior circuit (Figure 6.2), thus fostering reproductive behavior. Muscarinic cholinergic receptors responding to the neurotransmitter acetylcholine are also found on VMH neurons. Estrogen treatment increases their activities as well. Inputs to the VMH come from, among other places, the lateral dorsal nucleus of the tegmentum. Neurons there are part of the ascending arousal pathways, and would signal stimulation from the male upon mounting the female. In any case, inducing muscarinic receptors increases the VMH electrophysiological response to acetylcholine.
The enhanced VMH output primes lower pathways in the circuit for lordosis behavior. There are also indirect effects, from gene induction to downstream genes to behavioral change. Some hormone effects occur early, long before the onset of reproductive behaviors, and set the stage for later developments.
Neuronal Growth
Growth promotion by estrogens in VMH neurons follows from the stimulation of synthesis of ribosomal RNA, which precedes the elaboration of dendrites and synapses on VMH neurons observed after hormonal treatment. The earliest
estrogen effect is the increase of transcription of ribosomal RNA, followed rapidly by morphological effects, including those in the nucleolus itself and a striking elaboration of rough endoplasmic reticulum in the cytoplasm. Wooley and Cohen (2002) have shown, probably consequent to the phenomena described previously, a stimulatory effect of estrogen treatment on dendritic growth. In the hypothalamus, others have reported that estrogens foster dendritic growth and an increased number of synapses. Therefore, in VMH cells that control lordosis behavior circuitry, estrogens provide the structural basis for increased synaptic activity and, therefore, greater sex-behavior-facilitating output.
Amplification by Progesterone
Administration of progesterone 24 or 48 hours after estrogen priming greatly amplifies the effect of estradiol on mating behavior. This effect requires the nuclear progesterone receptor (PR), as mating behavior disappears after antisense DNA against PR mRNA has been administered in the VMH. This behavior also disappears in PR knockout mice. Importantly, PR itself is a transcription factor, so it will be possible to explore downstream progesterone-sensitive genes.
GnRH
The physiological importance of estrogenic elevation of gonadotropin-releasing hormone (GnRH, LHRH) mRNA levels under positive feedback conditions, as well as elevation of the receptor mRNA for GnRH, must be to synchronize reproductive behavior with the ovulatory surge of luteinizing hormone (LH). The same GnRH decapeptide that stimulates the ovulatory release of gonadotropins also facilitates mating behavior. In many small animals, synchrony of sex behavior with ovulation would be biologically adaptive because it eliminates unnecessary exposure to predation. In this respect, the behavioral effect of this neuropeptide is consonant with its peripheral physiological action. The case of GnRH also brings up a rare, unambiguous proof of an individual gene causally related to a human social behavior. During development in vertebrates ranging from fish to humans, GnRH neurons migrate from their birthplace in the olfactory placode into the brain. A human with damage at the Kallmann’s syndrome locus on the X chromosome did not fail to express the GnRH gene in the appropriate neurons. Instead, the neurons failed to migrate out of the olfactory placode (Schwanzel-Fukuda & Pfaff, 1989; Schwanzel-Fukuda, Bick, & Pfaff, 1989). A single gene for the Kall protein accounts for the deficit.
It codes for an extracellular matrix protein that is necessary for GnRH neuronal migration and that, in fact, decorates the migration route. A striking feature of the phenotype in men is important to note. They have no libido. Here is the causal route. The men have no sexual drive (1) because they have little testosterone, (2) because they have little LH and FSH circulating from the pituitary gland, (3) because no GnRH is coming down the portal circulation to the pituitary from the hypothalamus, (4) because there is no GnRH in the hypothalamus, (5) because the GnRH neurons did not migrate during development into the brain, and (6) because of a mutation in the gene for the Kall protein. Therefore, we can causally connect, step-by-step, an individual gene to an important human social behavior, but at least six causal links are required. This causal route illustrates the complexity of gene/behavior relationships in humans. Some of the indirect effects have a causal route from gene induction to intermediate behaviors (Mong & Pfaff, 2004). That is, some of the genes affected by estrogens work by altering other behaviors, which then prepare the animal for the behavior in question, in this case mating.
Analgesia
The enkephalin gene is turned on rapidly by estrogens, within about 30 minutes, and this has been shown to reflect hormone-facilitated transcription. The route of action of the enkephalin gene product on lordosis is indirect, through other behaviors. That is, we propose that, through the reduction of pain, enkephalins help to allow the female to engage in mating behavior despite the mauling she receives from the male. The strong somatosensory and interoceptive stimuli that ordinarily would be treated by the female as noxious are now tolerable and allow successful mating to proceed.
Anxiety Reduction
The oxytocin gene and the gene for its receptor are both expressed by hypothalamic neurons at higher levels in the presence of estrogens. The indirect route of action of this multiplicative set of gene inductions on mating behavior is likely through a behavioral link: anxiety reduction allows courtship and mating. This proposal is consistent with previous formulations: oxytocin has been conceived as protecting instinctive behaviors connected with reproduction, maternity, and other social behaviors from the disruptive effects of stress. Indeed, oxytocin has an anxiolytic action in the presence of estrogens (which presumably elevate the oxytocin receptor gene product; McCarthy, McDonald, Brooks, & Goldman, 1996).
[Diagram labels: individual-specific olfactory cues (nonvolatile, volatile); vomeronasal organ; main olfactory bulb/system; accessory olfactory bulb/system; hypothalamus (PVN and SON); OT; ER β; blood stream; OTR; amygdala; ER α; E; social recognition; ovaries: estrogen (E) production.]
Figure 6.7 Estrogens increasing oxytocin transcription in the hypothalamus and increasing oxytocin receptor transcription in the amygdala help to foster social recognition in mice. Note. Thus, four genes (estrogen receptor-alpha, estrogen receptor-beta, oxytocin, and oxytocin receptor) and their products form a functional module that supports social recognition. From “An Estrogen-Dependent Four-Gene Micronet Regulating Social Recognition: A Study with Oxytocin and Estrogen Receptor-α and -β Knockout Mice,” by E. Choleris, Gustafsson, et al., 2003, Proceedings of the National Academy of Sciences, p. 6196. Adapted with permission.
Social Recognition
The induction of the oxytocin gene by estrogens is an ER-β-dependent, behaviorally significant phenomenon, which is reassuring because only ER-β gene expression is found in oxytocinergic cells. In turn, oxytocinergic projections to the amygdala, where oxytocin receptor gene transcription is under the control of ER-alpha, are thought to be important for social recognition in mice, which helps to prevent aggression. Altogether, these data invoke the idea of a four-gene micronet (Choleris, Gustafsson, et al., 2003; Choleris, Little, et al., 2007) important for social behaviors (Figure 6.7). All of these modules support reproductive behaviors from the earliest estrogenic actions on the brain to the occurrences of mating behaviors themselves.
PRINCIPLE 6: TEMPORAL ASPECTS OF HORMONE ACTION IN THE BRAIN ARE IMPORTANT
By temporal aspects, we mean questions of both duration and order of events. For some steroid sex hormone effects on a wide variety of behaviors, longer durations of hormone administration make
for greater behavioral increases. A case in point is the effect of testosterone on male-typical sex behaviors in male rodents. Typically, for increased mounting behavior, and especially for the pelvic thrusting, penile insertions, and ejaculation, many days or several weeks of continuous testosterone treatment would be required for high levels of response. Hormone effects on female sex behaviors are more diverse. Consider first the role of estrogenic hormones. A long priming period is useful for high levels of lordosis behavior. If the female is cycling normally or has been exposed recently to substantial levels of estrogens, then 48 hours is enough. The longer the female is without ovarian estrogens, the longer the priming period required for female sex behavior. A molecular interpretation of the mechanism operating in this case is the following: prolonged absence of estrogens allows the decline of nuclear coactivator protein levels (see Chapter 18), which are needed to transduce the nuclear binding of ERs into behaviorally relevant transcriptional facilitations. A surprising development with respect to long priming actions of estrogens came from biochemical work in the uterus. A priming period of 24 hours could be substituted for by two brief exposures to estrogen, suitably timed. We followed this up for the CNS and behavior. Again, an estrogen priming of 24 hours could be replaced by two 1-hour exposures, the first from hour 0 to 1 and the second beginning between 4 and 13 hours after the first. This pulse schedule was effective for inducing PR as well as for lordosis behavior.
Questions of Order of Administration
In females, estrogen priming must be followed by progesterone for optimal behavioral facilitation. The progesterone, if it is timed correctly, amplifies the estrogenic effect.
The required temporal parameters for progesterone are much different from those for estradiol—there is a biphasic action of progesterone both on pituitary release of LH and on behavior. In female rats, for example, 2 to 5 hours after progesterone injection, LH release and lordosis behavior are facilitated mightily. But the continued presence of progesterone, several hours later, actually inhibits both the endocrine output (LH) and the behavioral output (lordosis). In other neuroendocrine cases, brevity of hormone action is not only effective but also is actually required. Gonadotropin-releasing hormone (GnRH, also known as LHRH) is well known to show a pulsatile pattern of release. It is fascinating that populations of identical GnRH neurons can manage pulsatile output. In fact, pulsatile outputs of GnRH are absolutely necessary for the pituitary to respond with substantial gonadotropin release into the
blood. A steady, high level of administration of GnRH actually turns off the gonadotropin system and thus can be used either (a) as a birth control device or (b) to ramp down the system in case of gonadal cancers. Some of these phenomena are beginning to be understood at the molecular level. The GnRH promoter is activated in an episodic fashion in GnRH neuronal cultures. A specific 410 base region of the GnRH promoter is required for pulsatile GnRH promoter activity. Within that region, a small five-base site that represents a binding site for the transcription factor Oct-1 is likewise required. In order to achieve fertility, LH has to be given in a pulsatile fashion.
PRINCIPLE 7: IN SOME CASES, A HORMONE REQUIRES METABOLISM OR NEEDS COMBINATIONS WITH OTHER HORMONE(S) TO BE EFFECTIVE BEHAVIORALLY
First consider the case of testosterone as a “prohormone,” a steroid whose enzymatically regulated metabolism produces other steroids that are active in actually exerting the behavioral effect. Testosterone, produced mainly by the testis in men and the adrenals in women, is a potent androgen (male sex hormone) that influences many tissues throughout the body, including the CNS, in which testosterone receptors are present. Metabolites of testosterone, however, also are important hormones and have specific effects that are different from the effects of testosterone itself. Two important testosterone metabolites are dihydrotestosterone (DHT) and estradiol (E2), which are male and female sex steroids, respectively. It is the conversion of testosterone to these two active hormonal metabolites that is critical for male sex behavior. There are specific receptors for each; for DHT these are in peripheral tissues related to the development of secondary sex characteristics, and for E2 these are in both peripheral tissues and the CNS. The major structural changes inherent in the enzymatic conversions of testosterone to DHT and E2 confer receptor specificity to these steroid compounds. The conversion of testosterone to E2 within the CNS, by the sequence of three enzymatic steps carried out by aromatase, is considered to be important for the early development of the masculine brain. The aromatase enzyme is concentrated in areas of the brain related to sexual differentiation of the CNS, such as the hypothalamus. In the male, exposure of the developing brain to high concentrations of testosterone, and therefore to high concentrations of E2 converted from testosterone in specific regions, leads to, for example, the tonic secretion of gonadotropins in the adult male versus the cyclic secretion of these hormones in the adult
female. Many differences in behavior, especially in aggressive behavior, also result from the differential exposure of the developing male and female CNS to testosterone and E2. While testosterone has many androgenic effects throughout the body and is responsible for virilization of internal structures, for example, testicular development, DHT is required for virilization of the external genitalia, that is, penile growth and scrotal development. If the enzyme converting testosterone to DHT, 5-alpha reductase, is genetically deficient, the external genitalia of the newborn infant appear ambiguous (male pseudohermaphroditism). Figure 6.4B schematically illustrates the genitalia of a normal man (left) and the genitalia of a 5-alpha reductase-deficient prepubertal boy. The female-appearing external genitalia belie the presence of an XY sex chromosome complement, functioning testes, although undescended, and a masculinized brain secondary to fetal testosterone exposure. The behavioral consequences of this anatomical alteration are obvious, and they also illustrate how hormone metabolites can influence behaviors through indirect routes. Before it was recognized that this syndrome is inherited and therefore is concentrated in certain families, affected infants were raised as girls. On reaching puberty, however, the increased secretion of testosterone from the pubertal testis led to some DHT being produced, so that many prepubertal “girls” developed a male phallus with erections, scrotal testes, male hair distribution, deepened voice, and male body habitus and psychological characteristics, to the initial consternation of parents and family members. However, the syndrome was quickly recognized to occur in affected families in several parts of the world, for example, in the Dominican Republic, so that newborns with ambiguous genitalia in these families were not forced to grow up as girls. 
Indeed, some individuals have entered into heterosexual relationships and have been able to function physically and emotionally as men. Others have led isolated lives or retained some female gender identity. The outward switch of sex and gender identity from female to male at puberty in 5-alpha reductase-deficient individuals was initially interpreted as the primacy of nature over nurture; that is, sex hormones being more influential than psychosocial factors imposed since infancy. However, the early recognition that newborns from affected families and with ambiguous genitalia might have masculine pubertal development led to these infants’ being raised either as boys or ambiguously as girls. The nature-versus-nurture dichotomy, therefore, is not as straightforward as some would believe. As already mentioned, for some hormone-dependent behaviors, one hormone is not enough: a combination is required. And for hormone combinations, order of administration can be crucial. Consider the combination of estrogens
and progestins acting to facilitate reproductive and maternal behaviors. Even for the same pair of hormones, different temporal patterns of combinatorial action will be important for different behaviors. For example, in many female laboratory animals, a long priming treatment of estrogens (48 hours or more) followed by a brief progesterone exposure will promote female-typical sexual behaviors. The neural circuit and some of the genes supporting the mechanisms for this type of behavior have been worked out (see earlier discussion). In a biologically adaptive fashion, the requirement of E followed by P for female sex behavior synchronizes it with ovulation, which has the same combinatorial effects. However, following the estrogen priming, if the progesterone is allowed to stay around for a long time, the opposite effect is seen on sexual behavior. Finally, in a pregnant animal, estrogen levels are high and remain high during parturition. In contrast, progesterone levels start high but decline around the time of giving birth. High levels of estrogens and a decline in progestins make the optimal combination for maternal behavior (Numan & Insel, 2003). A different kind of combination of hormone actions underlies social recognition, motivation, and memory in female laboratory animals. Several labs have shown the powerful effects of oxytocin, and in some species vasopressin, on this cluster of behaviors. In addition, estrogens have an overriding effect in two ways, as noted in Figure 6.7. Working through ER-β, estrogens turn on the oxytocin gene; and working through ER-α, they turn on the oxytocin receptor gene. Therefore, in this combination of hormones, estradiol and oxytocin, one has a superordinate relation to the other. Estrogens have to come first, to turn on the gene for the oxytocin receptor, for instance, and oxytocin later.
PRINCIPLE 8: SOME OF THE INFLUENCES ON SPECIFIC HORMONE-BEHAVIOR SYSTEMS ARE REMARKABLY NONSPECIFIC
Decades of work have gone into the elucidation of the neuroanatomical pathways and neurophysiological mechanisms related to arousal. Mechanisms for generalized arousal of the CNS (Pfaff, 2006) are fairly well known at the neuroanatomical level. Their most important features emphasize multiplicity and redundancy of ascending arousal pathways in such a way as to prevent failure. Five major neurochemically distinct systems work together to increase arousal. They use norepinephrine, dopamine, serotonin, acetylcholine, and histamine as transmitters. They all begin in the brain stem and converge in the thalamus or in the basal forebrain (Figure 6.8). They overlap and cooperate. Their very multiplicity ensures against failure.
[Diagram labels: A (anterior); P (posterior); frontal cortex (directed motor acts); posterior neocortex (sensory alertness, attention); limbic cortex (emotional reactivity, autonomic arousal); striatum; basal forebrain; hypothalamus; ACh; HA; 5HT; DA; NE; ascending activating systems.]
Figure 6.8 Classical ascending systems elevate CNS arousal. Note. From Brain Arousal and Information Theory, by D. Pfaff, 2006, Cambridge, MA: Harvard University Press. Adapted with permission.
Four sensory systems feed ascending arousal pathways in a straightforward fashion. These clearly show how vestibular stimuli, somatosensory, auditory, and taste stimuli on the tongue could arouse an animal or human being. Pain mechanisms further dramatize how a vastly amplified somatosensory signal from the skin or the viscera can wake up and alert an individual. Moreover, pain pathways and sexual cutaneous signals overlap, and share the ability to cause states of high arousal (Figure 6.9). In contrast, electrical impulses triggered by odor stimuli enter the brain through tracts in the basal forebrain, and project to the amygdala, a primary receiving zone which itself is connected with high degrees of arousal, during both sex and fear. Visual stimuli impact CNS arousal pathways both through the outer layers of the superior colliculus and through the reticular and medial cell groups of the thalamus. An important point to reiterate is that these various arousal-related transmitter systems and sensory signals converge. Whether in the basal forebrain or in the medial thalamus, a strong signal for cortical arousal is generated and must be distributed broadly in the cerebral cortex to command the attention of a wide variety of higher level perceptual processors and motor control cell groups. In terms of the general principles illustrated by CNS arousal pathways, it is eminently clear that arousal mechanisms impact neuroendocrine processes by their projections into the hypothalamus. First, they are bilateral. Unilateral damage in the animal brain or human brain has little effect on
[Diagram labels: somatosensory stimuli (painful; sexually relevant); shared ascending pathways in A-L columns of spinal cord; heightened arousal; altered motivational state; stimulus-produced analgesia.]
Figure 6.9 Painful and sexually relevant cutaneous stimuli both increase CNS arousal but have different behavioral consequences.
generalized arousal or consciousness. Second, they are bidirectional. In addition to the classical aminergic ascending pathways just mentioned, there are crucial descending pathways (e.g., vasopressin, histamine, orexin). Third, these pathways have been conserved across a variety of species, including humans. Finally, these pathways always potentiate an animal’s or human’s behavioral responsivity.
8/17/09 2:00:41 PM
Principle 8
These responses may be active approach responses, but in the case of fearful or stressful inputs, they are avoidance responses. It has been hypothesized that some of the most important cells stimulating CNS arousal are medullary gigantocellular neurons that have bifurcating axons, both ascending and descending in the brain stem (Figure 6.10), and that likely contribute to cortical and to autonomic arousal, respectively. Electrical stimulation of these neurons in the rat medullary reticular formation stimulates EEG arousal, decreasing spectral power in the low-frequency delta range and increasing power in the high-frequency gamma range (Wu, Stavarache, Pfaff, & Kow, 2007). Even with these data in hand, further work will be necessary to sort out contradictions in the literature and to specify the exact molecular characteristics of these “master cells” for arousal and the exact mechanisms by which they influence CNS arousal. In neurophysiological terms, cells involved in generalized arousal of the CNS would be expected to respond to a variety of stimuli in several sensory modalities. During electrophysiological recordings from reticular and raphe neurons in the medulla, such neurons have been found (Hubscher & Johnson, 2002; Leung & Mason, 1998, 1999; Martin, Pavlides, & Pfaff, unpublished work). Moving anterior in the brain stem, certain ponto-medullary reticular neurons (Peterson, Anderson, & Filion, 1974), as well as the omnipause neurons of Phillips, Ling, and Fuchs (1999) recorded in the pons, also fit the requirement that cells be multimodal in their range of sensitivities and have firing rates correlated with arousal and visual attention. In the midbrain, the work of Horvitz, Stewart, and Jacobs (1997), recording from dopaminergic neurons, revealed responses that were correlated with the activation of behavior, that is, the initiation of motor responses directed toward salient stimuli. These and many other reports supply the neurophysiological basis of generalized arousal responses.

Figure 6.10 CNS arousal depends on descending systems as well as ascending systems. [Diagram labels: POA; PVNp (OT, AVP); SCN; TMN (HA); LHA (orexin/hypocretin); brainstem; autonomic controls.] Note. From Brain Arousal and Information Theory, by D. Pfaff, 2006, Cambridge, MA: Harvard University Press. Adapted with permission.

Functional Genomics

Data are accumulating rapidly with respect to neurochemical and genomic mechanisms for both arousal and stress. A large number of genes, more than 120, participate in regulating generalized CNS arousal. The large number is due to the inclusion of genes encoding synthetic enzymes, receptors (for serotonin alone, there are 14), transporters, and catabolic enzymes for both the relevant neurotransmitters and neuropeptides, including both those increasing and those decreasing arousal (Pfaff, 2006). As might be expected, sex hormones are involved in CNS arousal. Disruption of the gene encoding estrogen receptor alpha severely reduced arousal measures in female mice, compared to their wildtype littermate controls (Figure 6.11). Interestingly, disruption of the gene for estrogen receptor beta, a likely gene duplication product, had no significant effect (Garey et al., 2003). There are additional implications of having so many genes controlling arousal mechanisms. The heterogeneity among the genes involved presumably provides for great flexibility of response. The very multiplicity yields the possibility of large numbers of meaningful patterns of gene expression. In a neuroendocrine context, we have shown that one never could understand gene/behavior relations on a one-by-one basis.
Moving beyond Beadle and Tatum’s classical “one gene/one enzyme” concept, derived from their work with the fungus Neurospora, we reached the conclusion that different patterns of gene expression yield different patterns of sociosexual behaviors (Pfaff, Ogawa, Kia, Frohlich, & Kow, 2002). Finally, how do these generalized arousal forces impact specific neuroendocrine mechanisms? The answer lies in the exact neuroanatomical localizations of the receptors for arousal-related transmitters and neuropeptides in brain regions controlling the pituitary and hormone-dependent behaviors. For female sexual behavior, two arousal-enhancing transmitters, norepinephrine and histamine, can influence sexual arousal and lordosis behavior by their excitatory effects on the electrical activity of ventromedial hypothalamic neurons (see earlier discussion). Conversely, mu-receptor opioid agonists (Devidze, Lee, Martin, & Pfaff, submitted) and prostaglandin D (Mong et al., 2003) reduce generalized arousal and, as a consequence, reduce lordosis behavior. The logical relations among generalized arousal, sexual arousal, and sexual behavior are shown in Table 6.1. More detail on CNS arousal and the activation of behavior can be found in Chapter 23.

Figure 6.11 Estrogen receptor alpha knockout (ERKO-α) mice show decreased arousal responses to stimuli in several sensory modalities, compared to their wildtype (WT) littermate controls. [Panels: total movement, peak movement, movement number, and movement time; each plots response by stimulus (olfactory, auditory, vestibular, tactile) for knockout and WT mice.] Note. There is no statistically significant effect of ERKO-β gene deletion, compared to their respective WT controls. From “Genetic Contributions to Generalized Arousal of Brain and Behavior,” by Garey et al., 2003, Proceedings of the National Academy of Sciences, 100, 11019–11022. Adapted with permission.
TABLE 6.1 Relations among arousal mechanisms and sex behaviors (male and female).

                               Requires         CNS features   Mechanisms require
                               Ag    As    SB   Bilateral      VMN/POA      Sex        Genes for
                                                               neuronal     hormones   nonspecific
                                                               excitation              ascending systems
Generalized CNS arousal (Ag)   -     No    No   Yes            No           No         Yes
Sexual arousal (As)            Yes   -     No   Yes            Yes          Yes        Yes
Sex behavior (SB)              Yes   Yes   -    Yes            Yes          Yes        Yes

PRINCIPLE 9: NEUROENDOCRINE MECHANISMS HAVE BEEN CONSERVED AND CONTINUE TO PROVIDE ADAPTIVE COORDINATION OF BRAIN WITH BODY TO REGULATE BEHAVIOR

A modern neuroendocrinologist may not try to explain the psychological side of human mentation, that is, the full range of mental, artistic, self-conscious expressions of the person in states of emotion that involve hormonal changes. However, a tremendous number of neuroanatomical, neurophysiological, genetic, and endocrine mechanisms related to sex behavior have been conserved from the animal brain into the human brain (Figure 6.12). To give just a few examples, the steroid hormones are the same in the human brain as in lower mammals. Steroid receptor chemistry, neuroanatomy, and molecular mechanisms are also conserved. Neuroendocrine system anatomy is much more highly conserved than other parts of the forebrain, and the tendency of hormone-responsive neurons to project to other hormone-responsive neurons is also preserved. The neuroendocrine neuropeptide par excellence is gonadotropin-releasing hormone (GnRH, also called LHRH). Its chemistry in the human brain is identical to that in lower animals, and its physiology, its release mechanisms, and its receptor physiology (e.g., in the anterior pituitary gland) are likewise conserved. Most strikingly, the unique migration during development of GnRH neurons from the olfactory pit to their final functional positions in the basal forebrain, discovered in mice (Schwanzel-Fukuda & Pfaff, 1989), also occurs in humans. In fact, as stated earlier, a failure of GnRH neuronal migration in men accounts for the loss of libido in X-chromosome-linked Kallmann’s disease (Schwanzel-Fukuda et al., 1989).

Basic mechanisms have remained the same. Transcriptional biology, molecular biology of neurons, chemistries of a variety of neurotransmitters and neuropeptides, electrophysiological mechanisms, and many facets of cellular neuropharmacological effects all link human neuroendocrine systems to those of laboratory mammals. Finally, in terms of the neuroendocrinology of reproduction, the chromosomal biology of sex differences, the mechanisms of sex differentiation of the brain, and the basic requirements of behaviors necessary for sperm to meet egg and fertilize have remained very similar as we move in our thinking from lower mammals to humans. Likewise, sex differences in aggression provide the same types of statistics among humans as in many lower mammals. Therefore, unless nature, having evolved a full set of working mechanisms for mammalian hormone-dependent behaviors, started an entirely new set for humans, we understand well the most primitive neuroendocrine mechanisms that drive hormone-influenced emotional expressions and behaviors in humans.

Figure 6.12 Hormone-dependent mechanisms conserved from animal to human brains (implications for understanding sexual arousal). [Diagram labels: genes in CNS; molecular biology of neurons; hypothalamic neuroanatomy; neuronal connections; hormone receptors (chemistry, neuroanatomy); GnRH neuron migration; GnRH physiology; neurotransmitters; neuropeptides; neurochemistry; neuroanatomy; cellular neuroanatomy; cellular neurophysiology; cellular neuropharmacology; biology of sex differences; requirements for fertilization; steroid hormones.] Note. Since a remarkable number of neuroendocrine mechanisms have been conserved from animal brains into the human brain, modern neuroendocrine research contributes to the understanding of human neuroendocrine functions and their maladies. All of the structures and functions portrayed in this figure are identical or very similar in the human brain compared to a variety of laboratory animal mammalian subjects. From Drive: Neurobiological and Molecular Mechanisms of Sexual Motivation by D. W. Pfaff, 1999, Cambridge, MA: MIT Press.
PRINCIPLE 10: HORMONE EFFECTS ON BRAIN ARE RELEVANT FOR HUMAN BEHAVIORAL PATHOLOGY

Examples of disorders in neuroendocrine mechanisms that lead to serious medical conditions of behavioral pathology include hyperthyroid patients who are extremely jittery and nervous, hypothyroid patients who are sluggish and dull, and patients with overexpression of CRF who “burn out” in a mixture of anxiety and depression. Many more examples could be given, but we choose to emphasize a neuroendocrine/behavior example that is so extreme that it constitutes a criminal condition. In the following example, the etiology of the disease is complex but its solution represents an application of neuroendocrine engineering. Pedophilia is classified as a psychosexual disorder. Strong obsession and compulsion components of pedophilia make incorporation of cognitive behavioral skills difficult. Medications that lower sex drive may enhance voluntary control (Berlin, 1983). Testosterone-lowering agents, serotonin re-uptake inhibitors, surgical castration, and stereotaxic neurosurgery have been used to reduce libido, deviant sexual arousal and fantasy, and the frequency of deviant sexual behavior. Pharmacotherapy has included testosterone-lowering agents such as medroxyprogesterone acetate (MPA), cyproterone acetate (CPA), luteinizing hormone-releasing hormone (LHRH, also known as GnRH) inhibitors, and GnRH agonists, as well as selective serotonin re-uptake inhibitors (SSRIs).
Leuprolide acetate (LA) has been used for this type of therapy. LA is one of several synthesized agonist analogs of LHRH (also known as GnRH), the hypothalamic factor that stimulates gonadotropin release from the pituitary (Vance & Smith, 1984). It produces a paradoxical effect on the pituitary, with initial stimulation of the release of luteinizing hormone (LH) and follicle-stimulating hormone (FSH), followed by inhibition after repeated administration (Belchetz, Plant, Nakai, Keogh, & Knobil, 1978; Bergquist, Nillius, & Wide, 1979; Evans, Doelle, Alexander, Uderman, & Rabin, 1984; Vilchez-Martinez et al., 1974). LA causes a reduction in sex hormone
release (a decrease in testicular steroidogenesis) that is probably secondary to a primary reduction in LH levels (Vance & Smith, 1984). Agonist analogs have also been shown to decrease the number of LH receptors in the Leydig cells of hypophysectomized rats (Bambino, Schreiber, & Hsueh, 1980; Vance & Smith, 1984). Testosterone levels attained with continued administration of LA were lower than those attained with other medications, which may make LA a more potent inhibitor of erectile responses than other agents. Patients generally become sexually impotent when plasma testosterone levels are less than one quarter of their initial value. A placebo-controlled, blinded, multidisciplinary study of LA in pedophiles detailed the objective effects of this drug on measurable aspects of the arousal response (Schober et al., 2005). The study, 1 year of therapy on LA followed by 1 year on a saline placebo, detailed testosterone levels during polygraph testing, plethysmography (PPG) with audio and visual stimuli, viewing time for visual stimuli, and self-report of urges and masturbatory frequency toward children. Testosterone levels were reduced to castrate levels (less than 50 ng/dL), or one tenth of the mean level. By the second week of treatment, a profound suppression of testosterone, FSH, and LH was established, and it was sustained for the duration of treatment with LA. During the initial testosterone rise, no patient reported an increase in pedophilic urges, sex drive, or masturbation. Subsequently, levels fell to a mean of 11.6 ng/dL 1 month after the initial injection and remained low until LA was withdrawn. After testosterone fell to castrate levels, the study subjects reported decreased sex drive, decreased pedophilic sexual urges, and decreased masturbation frequency. On saline placebo, testosterone levels rose slowly over a 3-month period to approach baseline.
When LA was replaced with saline placebo, all subjects initially reported no increase in sex drive, pedophilic sexual urges, or masturbation frequency. After 3 months on placebo, testosterone averaged 195 ng/dL. Some subjects continued to report no increase in sex drive, pedophilic sexual urges, or masturbation frequency, but others expressed great distress that the medication was losing effectiveness and feared that they would reoffend. At this time the placebo was revealed to the subjects. All chose to return to LA therapy. Throughout the study, modified visual reaction time results detected the subjects’ self-reported choice of interest/preference and indicated no consistent change in pedophilic interest preference. Overall, interest preference as measured by PPG likewise indicated no consistent change with LA therapy. However, the degree to which subjects responded decreased significantly, demonstrating that LA significantly decreased arousal. Penile plethysmography verified self-reported claims of lowered libido, in that LA therapy
caused significant reductions in the magnitude of the subjects’ sexual arousal pattern. LA significantly reduced pedophilic fantasies, urges, and masturbation on self-report. On all polygraphic assessments at baseline, almost all responses were classified as deceptive. Deceptive responses about masturbation frequency, urges to initiate sexual contact, and sexual thoughts decreased dramatically with LA therapy and increased dramatically on placebo. A direct and readily apparent correlation existed between deceptive responses and LA injection. When urges and masturbation frequency decreased on LA, the polygraph indicated almost no deceptive responses. Even with profound testosterone suppression, a complete suppression of arousal did not occur. Low levels of arousal and erectile ability persisted, with sufficient tumescence to generate detectable levels on PPG. While on LA, subjects indicated they were better able to focus on employment, relaxation activities, and life planning without continual interruptions by sexual thoughts. All subjects noted a decrease in anxiety, better ability to regulate or control their actions, and increased motivation for work/school activities.
SUMMARY

Considerable progress has been made in understanding the neuroanatomical, neurophysiological, and genomic mechanisms of neuroendocrine systems that regulate behavior. We expect that as our knowledge of neuroendocrine mechanisms deepens, it will be even easier to see how their functions contribute to behavioral regulation and their disorders lead to behavioral pathologies.
REFERENCES

Anzick, S. L., Kononen, J., Walker, R. L., Azorsa, D. O., Tanner, M. M., Guan, X. Y., et al. (1997). AIB1, a steroid receptor coactivator amplified in breast and ovarian cancer. Science, 277, 965–968.
Apostolakis, E. M., Ramamurphy, M., Zhou, D., Oñate, S., & O’Malley, B. (2002). Acute disruption of select steroid receptor coactivators prevents reproductive behavior in rats and unmasks genetic adaptation in knockout mice. Molecular Endocrinology, 16, 1511–1523.
Auger, A. P., Tetel, M. J., & McCarthy, M. M. (2000). Steroid receptor coactivator-1 mediates the development of sex specific brain morphology and behavior. Proceedings of the National Academy of Sciences, USA, 97, 7551–7555.
Bambino, T. H., Schreiber, J. R., & Hsueh, A. J. (1980). Gonadotropin-releasing hormone and its agonist inhibit testicular luteinizing hormone receptor and steroidogenesis in immature and adult hypophysectomized rats. Endocrinology, 107, 908–917.
Beato, M., & Sánchez-Pacheco, A. (1996). Interaction of steroid hormone receptors with the transcription initiation complex. Endocrine Reviews, 17, 587–609.
Belchetz, P. E., Plant, T. M., Nakai, Y., Keogh, E. J., & Knobil, E. (1978). Hypophysial responses to continuous and intermittent delivery of hypothalamic gonadotropin-releasing hormone. Science, 202, 631–633.
Benoit, G., Cooney, A., Giguere, V., Ingraham, H., Lazar, M., Muscat, G., et al. (2006). International union of pharmacology. LXVI. Orphan nuclear receptors. Pharmacological Reviews, 58, 798–836.
Bergquist, C., Nillius, S. J., & Wide, L. (1979). Intranasal gonadotropin-releasing hormone agonist as a contraceptive agent. Lancet, 2, 215–217.
Berlin, F. S. (1983). Sex offenders: A biomedical perspective and a status report on biomedical treatment. In J. C. Greer & I. R. Stuart (Eds.), The sexual aggressor: Current perspectives on treatment (pp. 82–123). New York: Van Nostrand-Reinhold.
Blaustein, J. D., & Mani, S. K. (2006). Feminine sexual behavior from neuroendocrine and molecular neurobiological perspectives. In J. D. Blaustein (Ed.), Handbook of neurochemistry and molecular neurobiology (pp. 95–150). New York: Springer.
Bousios, S., Karandrea, D., Kittas, C., & Kitraki, E. (2001). Effects of gender and stress on the regulation of steroid receptor coactivator-1 expression in the rat brain and pituitary. Journal of Steroid Biochemistry and Molecular Biology, 78, 401–407.
Camacho-Arroyo, I., Neri-Gomez, T., Gonzalez-Arenas, A., & Guerra-Araiza, C. (2005). Changes in the content of steroid receptor coactivator-1 and silencing mediator for retinoid and thyroid hormone receptors in the rat brain during the estrous cycle. Journal of Steroid Biochemistry and Molecular Biology, 94, 267–272.
Charlier, T. D., Ball, G. F., & Balthazart, J. (2005). Inhibition of steroid receptor coactivator-1 blocks estrogen and androgen action on male sex behavior and associated brain plasticity. Journal of Neuroscience, 25, 906–913.
Charlier, T. D., Ball, G. F., & Balthazart, J. (2006). Plasticity in the expression of the steroid receptor coactivator-1 in the Japanese quail brain: Effect of sex, testosterone, stress and time of the day. Neuroscience, 172, 333–343.
Charlier, T. D., Lakaye, B., Ball, G. F., & Balthazart, J. (2002). Steroid receptor coactivator SRC-1 exhibits high expression in steroid-sensitive brain areas regulating reproductive behaviors in the quail brain. Neuroendocrinology, 76, 297–315.
Choleris, E., Gustafsson, J.-Å., Korach, K. S., Muglia, L. J., Pfaff, D. W., & Ogawa, S. (2003). An estrogen-dependent four-gene micronet regulating social recognition: A study with oxytocin and estrogen receptor-α and -β knockout mice. Proceedings of the National Academy of Sciences, 100, 6192–6197.
Choleris, E., Little, S. R., Mong, J. A., Puram, S. V., Langer, R., & Pfaff, D. W. (2007). Microparticle-based delivery of oxytocin receptor antisense DNA in the medial amygdala blocks social recognition in female mice. Proceedings of the National Academy of Sciences, 104, 4670–4675.
DeMarzo, A., Beck, C. A., Oñate, S. A., & Edwards, D. P. (1991). Dimerization of mammalian progesterone receptors occurs in the absence of DNA and is related to the release of the 90-kDa heat shock protein. Proceedings of the National Academy of Sciences, 88, 72–76.
Devidze, N., Lee, A., Martin, E., & Pfaff, D. (Submitted). Molecular properties of medullary reticular neurons.
Dutertre, M., & Smith, C. L. (2003). Ligand-independent interactions of p160/steroid receptor coactivators and CREB-binding protein (CBP) with estrogen receptor-alpha: Regulation by phosphorylation sites in the A/B region depends on other receptor domains. Molecular Endocrinology, 17, 1296–1314.
Evans, R. M. (1988). The steroid and thyroid hormone receptor superfamily. Science, 240, 889–895.
Evans, R. M., Doelle, G. C., Alexander, A. N., Uderman, H. D., & Rabin, D. (1984, May). Gonadotropin and steroid secretory patterns during chronic treatment with a luteinizing hormone-releasing hormone agonist analog in men. Journal of Clinical Endocrinology and Metabolism, 58, 862–867.
Freedman, L. P., & Luisi, B. F. (1993). On the mechanism of DNA binding by nuclear hormone receptors: A structural and functional perspective. Journal of Cellular Biochemistry, 51, 140–150.
Garey, J., Goodwillie, A., Frohlich, J., Morgan, M., Gustafsson, J.-Å., Smithies, O., et al. (2003). Genetic contributions to generalized arousal of brain and behavior. Proceedings of the National Academy of Sciences, 100, 11019–11022.
Gorski, J., Toft, D., Shyamala, G., Smith, D., & Notides, A. (1968). Hormone receptors: Studies on the interaction of estrogen with the uterus. Recent Progress in Hormone Research, 24, 45–80.
Grenier, J., Trousson, A., Chauchereau, A., Cartaud, J., Schumacher, M., & Massaad, C. (2005). Differential recruitment of p160 coactivators by glucocorticoid receptor between Schwann cells and astrocytes. Molecular Endocrinology, 20, 254–267.
Horvitz, J., Stewart, T., & Jacobs, B. (1997). Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat. Brain Research, 759, 251–258.
Hubscher, C., & Johnson, R. (2002). Inputs from spinal and vagal sources converge on individual medullary reticular neurons. Society for Neuroscience, 271, 2.
Iannacone, E. A., Yan, A. W., Gauger, K. J., Dowling, A. L. S., & Zoeller, R. T. (2002). Thyroid hormone exerts site-specific effects on SRC-1 and NCoR expression selectively in the neonatal rat brain. Molecular and Cellular Endocrinology, 186, 49–59.
Jensen, E. V., Suzuki, T., Kawasima, T., Stumpf, W. E., Jungblut, P. W., & de Sombre, E. R. (1968). A two-step mechanism for the interaction of estradiol with rat uterus. Proceedings of the National Academy of Sciences, USA, 59, 632–638.
Kamei, Y., Xu, L., Heinzel, T., Torchia, J., Kurokawa, R., Gloss, B., et al. (1996). A CBP integrator complex mediates transcriptional activation and AP-1 inhibition by nuclear receptors. Cell, 85, 403–414.
Kastner, P., Krust, A., Turcotte, B., Stropp, U., Tora, L., Gronemeyer, H., et al. (1990). Two distinct estrogen-regulated promoters generate transcripts encoding the two functionally different human progesterone receptor forms, A & B. EMBO Journal, 9, 1603–1614.
Kininis, M., Chen, B. S., Diehl, A. G., Isaacs, G. D., Zhang, T., Siepel, A. C., et al. (2007). Genomic analyses of transcription factor binding, histone acetylation, and gene expression reveal mechanistically distinct classes of estrogen-regulated promoters. Molecular and Cellular Biology, 27, 5090–5104.
Klein-Hitpass, L., Tsai, S. Y., Weigel, N. L., Allan, G. F., Riley, D., Rodriguez, R., et al. (1990). The progesterone receptor stimulates cell-free transcription by enhancing the formation of a stable preinitiation complex. Cell, 60, 247–257.
Kuiper, G. G. J. M., Enmark, E., Pelto-Huikko, M., Nilsson, S., & Gustafsson, J.-Å. (1996). Cloning of a novel estrogen receptor expressed in rat prostate and ovary. Proceedings of the National Academy of Sciences, 93, 5925–5930.
Lange, C. A. (2007). Editorial: Membrane and nuclear steroid hormone receptors: Two integrated arms of the same signaling pathway? Steroids, 72, 105–106.
Lees, J. A., Fawell, S. E., & Parker, M. G. (1989). Identification of two transactivation domains in the mouse oestrogen receptor. Nucleic Acids Research, 17, 5477–5487.
Leung, C. G., & Mason, P. P. (1998). Physiological survey of medullary raphe and magnocellular reticular neurons in the anesthetized rat. Journal of Neurophysiology, 80, 1630–1646.
Leung, C. G., & Mason, P. P. (1999). Physiological properties of raphe magnus neurons during sleep and waking. Journal of Neurophysiology, 81(2), 584–595.
Li, X., Wong, J., Tsai, M. J., & O’Malley, B. (2003). Progesterone and glucocorticoid receptors recruit distinct coactivator complexes and promote distinct patterns of local chromatin modification. Molecular and Cellular Biology, 23, 3763–3773.
Lonard, D. M., & O’Malley, B. W. (2006). The expanding cosmos of nuclear receptor coactivators. Cell, 125, 411–414.
Maerkel, K., Durrer, S., Henseler, M., Schlumpf, M., & Lichtensteiger, W. (2007). Sexually dimorphic gene regulation in brain as a target for endocrine disrupters: Developmental exposure of rats to 4-methylbenzylidene camphor. Toxicology and Applied Pharmacology, 218, 152–165.
Mangelsdorf, D. J., Thummel, C., Beato, M., Herrlich, P., Schütz, G., Umesono, K., et al. (1995). The nuclear receptor superfamily: The second decade. Cell, 83, 835–839.
Mani, S. K., Reyna, A. M., Chen, J. Z., Mulac-Jericevic, B., & Conneely, O. M. (2006). Differential response of progesterone receptor isoforms in hormone-dependent and -independent facilitation of female sexual receptivity. Molecular Endocrinology, 20, 1322–1332.
Martinez de Arrieta, C., Koibuchi, N., & Chin, W. W. (2000). Coactivator and corepressor gene expression in rat cerebellum during postnatal development and the effect of altered thyroid status. Endocrinology, 141, 1693–1698.
McCarthy, M. M., McDonald, C. H., Brooks, P. J., & Goldman, D. (1996). An anxiolytic action of oxytocin is enhanced by estrogen in the mouse. Physiology and Behavior, 60, 1209–1215.
McGinnis, M. Y., Lumia, A. R., Tetel, M. J., Molenda-Figueira, H. A., & Possidente, B. (2007). Effects of androgenic steroids on the development and expression of running wheel activity and circadian rhythms in male rats. Physiology and Behavior, 92, 1010–1018.
McInerney, E. M., Tsai, M. J., O’Malley, B. W., & Katzenellenbogen, B. S. (1996). Analysis of estrogen receptor transcriptional enhancement by a nuclear hormone receptor coactivator. Proceedings of the National Academy of Sciences, 93, 10069–10073.
McKenna, N. J., Nawaz, Z., Tsai, S. Y., Tsai, M. J., & O’Malley, B. W. (1998). Distinct steady-state nuclear receptor coregulator complexes exist in vivo. Proceedings of the National Academy of Sciences, 95, 11697–11702.
Meijer, O. C., Steenbergen, P. J., & de Kloet, E. R. (2000). Differential expression and regional distribution of steroid receptor coactivators SRC-1 and SRC-2 in brain and pituitary. Endocrinology, 141, 2192–2199.
Meijer, O. C., van der Laan, S., Lachize, S., Steenbergen, P. J., & de Kloet, E. R. (2006). Steroid receptor coregulator diversity: What can it mean for the stressed brain? Neuroscience, 138, 891–899.
Misiti, S., Schomburg, L., Yen, P. M., & Chin, W. W. (1998). Expression and hormonal regulation of coactivator and corepressor genes. Endocrinology, 139, 2493–2500.
Mitev, Y. A., Wolf, S. S., Almeida, O. F., & Patchev, V. K. (2003). Developmental expression profiles and distinct regional estrogen responsiveness suggest a novel role for the steroid receptor coactivator SRC-1 as a discriminative amplifier of estrogen signaling in the rat brain. Federation of American Societies for Experimental Biology Journal, 17, 518–519.
Molenda, H. A., Griffin, A. L., Auger, A. P., McCarthy, M. M., & Tetel, M. J. (2002). Nuclear receptor coactivators modulate hormone-dependent gene expression in brain and female reproductive behavior in rats. Endocrinology, 143, 436–444.
Molenda, H. A., Kilts, C., Allen, R. L., & Tetel, M. J. (2003). Nuclear receptor coactivator function in reproductive physiology and behavior. Biology of Reproduction, 69, 1449–1457.
Molenda-Figueira, H. A., Williams, C. A., Griffin, A. L., Rutledge, E. M., Blaustein, J. D., & Tetel, M. J. (2006). Nuclear receptor coactivators function in estrogen receptor- and progestin receptor-dependent aspects of sexual behavior in female rats. Hormones and Behavior, 50, 383–392.
Mong, J. A., Devidze, N., Frail, D. E., O’Connor, L. T., Samuel, M., Choleris, E., et al. (2003). Estradiol differentially regulates lipocalin-type prostaglandin D synthase transcript levels in the rodent brain: Evidence from high-density oligonucleotide arrays and in situ hybridization. Proceedings of the National Academy of Sciences, 100, 318–323.
Mong, J. A., & Pfaff, D. W. (2004). Hormonal symphony: Steroid orchestration of gene modules for sociosexual behaviours. Molecular Psychiatry, 9, 550–556.
Nomura, M., Andersson, S., Korach, K. S., Gustafsson, J.-Å., Pfaff, D. W., & Ogawa, S. (2006). Estrogen receptor-beta gene disruption potentiates estrogen-inducible aggression but not sexual behaviour in male mice. European Journal of Neuroscience, 23, 1860–1868.
Nomura, M., Durbak, L., Chan, J., Smithies, O., Gustafsson, J.-Å., Korach, K. S., et al. (2002). Genotype/age interactions on aggressive behavior in gonadally intact estrogen receptor beta knockout (betaERKO) male mice. Hormones and Behavior, 41, 288–296.
Numan, M., & Insel, T. (2003). The neurobiology of parental behavior. Heidelberg: Springer Verlag.
Ogawa, H., Nishi, M., & Kawata, M. (2001). Localization of nuclear coactivators p300 and steroid receptor coactivator 1 in the rat hippocampus. Brain Research, 890, 197–202.
Ogawa, S., Choleris, E., & Pfaff, D. W. (2004). Genetic influences on aggressive behaviors and arousability in animals. Annals, New York Academy of Sciences, 1036, 257–266.
Ogawa, S., Eng, V., Taylor, J., Lubahn, D., Korach, K., & Pfaff, D. (1998). Roles of estrogen receptor-alpha gene expression in reproduction-related behaviors in female mice. Endocrinology, 139, 5070–5081.
Ogawa, S., Washburn, T., Taylor, J., Lubahn, D., Korach, K., & Pfaff, D. (1998). Modifications of testosterone-dependent behaviors by estrogen receptor-alpha gene disruption in male mice. Endocrinology, 139, 5058–5069.
O’Malley, B. W. (2006). Molecular biology. Little molecules with big goals. Science, 313, 1749–1750.
Oñate, S. A., Boonyaratanakornkit, V., Spencer, T. E., Tsai, S. Y., Tsai, M. J., Edwards, D. P., et al. (1998). The steroid receptor coactivator-1 contains multiple receptor interacting and activation domains that cooperatively enhance the activation function 1 (AF1) and AF2 domains of steroid receptors. Journal of Biological Chemistry, 273, 12101–12108.
Oñate, S. A., Tsai, S. Y., Tsai, M. J., & O’Malley, B. W. (1995). Sequence and characterization of a coactivator for the steroid hormone receptor superfamily. Science, 270, 1354–1357.
Peterson, B. W., Anderson, M. E., & Filion, M. (1974). Responses of pontomedullary reticular neurons to cortical, tectal and cutaneous stimuli. Experimental Brain Research, 21, 19–44.
Pfaff, D. (1980). Estrogens and brain function: Neural analysis of a hormone-controlled mammalian reproductive behavior. New York: Springer-Verlag.
Pfaff, D. (1997). Hormones, genes, and behavior. Proceedings of the National Academy of Sciences, 94, 14213–14216.
Pfaff, D. (1999). Drive: Neurobiological and molecular mechanisms of sexual motivation. Cambridge, MA: MIT Press.
Pfaff, D. (2006). Brain arousal and information theory. Cambridge, MA: Harvard University Press.
Pfaff, D., Ogawa, S., Kia, K., Frohlich, J., & Kow, L.-M. (2002). Genetic mechanisms in controls over female reproductive behaviors. In D. Pfaff, A. Arnold, S. Fahrbach, A. Etgen, & R. Rubin (Eds.), Hormones, brain, and behavior (pp. 441–510). San Diego, CA: Academic Press/Elsevier.
Phillips, J., Ling, L., & Fuchs, A. (1999). Action of the brainstem saccade generator during horizontal gaze shifts: Pt. I. Discharge patterns of omnidirectional pause neurons. Journal of Neurophysiology, 81, 1284–1295.
Pratt, W. B., Galigniana, M. D., Morishima, Y., & Murphy, P. J. (2004). Role of molecular chaperones in steroid receptor action. Essays in Biochemistry, 40, 41–58.
Ramos, H. E., & Weiss, R. E. (2006). Regulation of nuclear coactivator and corepressor expression in mouse cerebellum by thyroid hormone. Thyroid, 16, 211–216.
Rønnekleiv, O. K., & Kelly, M. J. (2005). Diversity of ovarian steroid signaling in the hypothalamus. Frontiers in Neuroendocrinology, 26, 65–84.
Rosenfeld, M. G., Lunyak, V. V., & Glass, C. K. (2006). Sensors and signals: A coactivator/corepressor/epigenetic code for integrating signal-dependent programs of transcriptional response. Genes and Development, 20, 1405–1428.
Schober, J. M., Kuhn, P. J., Kovacs, P. G., Earle, J. H., Byrne, P. M., & Fries, R. A. (2005). Leuprolide acetate suppresses pedophilic urges and arousability. Archives of Sexual Behavior, 34, 691–705.
Schwanzel-Fukuda, M., Bick, D., & Pfaff, D. W. (1989). Luteinizing hormone-releasing hormone (LHRH)-expressing cells do not migrate normally in an inherited hypogonadal (Kallmann) syndrome. Brain Research: Molecular Brain Research, 6, 311–326.
Schwanzel-Fukuda, M., & Pfaff, D. W. (1989). Origin of luteinizing hormone-releasing hormone neurons. Nature, 338, 161–164.
Shearman, L. P., Zylka, M. J., Reppert, S. M., & Weaver, D. R. (1999). Expression of basic helix-loop-helix/PAS genes in the mouse suprachiasmatic nucleus. Neuroscience, 89, 387–397.
Shiau, A. K., Barstad, D., Loria, P. M., Cheng, L., Kushner, P. J., Agard, D. A., et al. (1998). The structural basis of estrogen receptor/coactivator recognition and the antagonism of this interaction by tamoxifen. Cell, 95, 927–937.
Smith, C. L., Oñate, S. A., Tsai, M. J., & O’Malley, B. W. (1996). CREB binding protein acts synergistically with steroid receptor coactivator-1 to enhance steroid receptor-dependent transcription. Proceedings of the National Academy of Sciences, 93, 8884–8888.
Tanenbaum, D. M., Wang, Y., Williams, S. P., & Sigler, P. B. (1998). Crystallographic comparison of the estrogen and progesterone receptor’s ligand binding domains. Proceedings of the National Academy of Sciences, 95, 5998–6003.
Tetel, M. J., Jung, S., Carbajo, P., Ladtkow, T., Skafar, D. F., & Edwards, D. P. (1997).
Hinge and amino-terminal sequences contribute to solution dimerization of human progesterone receptor. Molecular Endocrinology, 11, 1114–1128. Tetel, M. J., Siegal, N. K., & Murphy, S. D. (2007). Cells in behaviourally relevant brain regions coexpress nuclear receptor coactivators and ovarian steroid receptors. Journal of Neuroendocrinology, 19, 262–271. Tetel, M. J., Ungar, T. C., Hassan, B., & Bittman, E. L. (2004). Photoperiodic regulation of androgen receptor and steroid receptor coactivator-1 in Siberian hamster brain. Molecular Brain Research, 131, 79–87. Tora, L., Gronemeyer, H., Turcotte, B., Gaub, M. P., & Chambon, P. (1988). The N-terminal region of the chicken progesterone receptor specifies target gene activation. Nature, 333, 185–188. Tsai, M. J., & O’Malley, B. W. (1994). Molecular mechanisms of action of steroid/thyroid receptor superfamily members. Annual Review of Biochemistry, 63, 451–486. Vance, M. A., & Smith, J. A. (1984). Endocrine and clinical effects of leuprolide in prostatic cancer. Clinical Pharmacology and Therapeutics, 36, 350–354. Vasudevan, N., & Pfaff, D. W. (2007). Membrane-initiated actions of estrogens in neuroendocrinology: Emerging principles. Endocrine Reviews, 28, 1–19. Vasudevan, N., Zhu, Y. S., Daniel, S., Koibuchi, N., Chin, W. W., & Pfaff, D. (2001). Crosstalk between oestrogen receptors and thyroid hormone receptor isoforms results in differential regulation of the preproenkephalin gene. Journal of Neuroendocrinology, 13, 779–790. Vilchez-Martinez, J. A., Coy, D. H., Arimura, A., Coy, E. J., Hirotsu, Y., & Schally, A. V. (1974). Synthesis and biological properties of (Leu-6)-LHRH and (D-Leu-6,desGly-NH210)-LH-RH ethylamide. Biochemical and Biophysical Research Communications, 59, 1226–1232. Voegel, J. J., Heine, M. J. S., Zechel, C., Chambon, P., & Gronemeyer, H. (1996). TIF2, a 160 kDa transcriptional mediator for the liganddependent activation function AF-2 of nuclear receptors. EMBO Journal, 15, 3667–3675.
8/17/09 2:00:44 PM
118
Neuroendocrinology: Mechanisms by Which Hormones Affect Behaviors
Webb, P., Nguyen, P., Shinsako, J., Anderson, C., Feng, W., Nguyen, M. P., et al. (1998). Estrogen receptor activation function 1 works by binding p160 coactivator proteins. Molecular Endocrinology, 12, 1605–1618. Wooley, C., & Cohen, R. (2002). Sex steroids and neuronal growth in adulthood. In D. Pfaff, A. Arnold, A. Etgen, S. Fahrbach, & R. Rubin (Eds.), Hormones brain and behavior (Vol. 4, pp. 717–778). San Diego, CA: Academic Press/Elsevier.
c06.indd 118
Wu, H. Stavarache, M., Pfaff, D.W., & Kow, L. (2007). Arousal of cerebral cortex electroencephalogram consequent to high-frequency stimulation of ventral medullary reticular formation. Proceedings of the National Academy of Sciences, 104, 18292–18296. Ylikomi, T., Bocquel, M. T., Berry, M., Gronemeyer, H., & Chambon, P. (1992). Cooperation of protosignals for nuclear accumulation of estrogen and progesterone receptors. EMBO Journal, 11, 3681–3694.
8/17/09 2:00:45 PM
Chapter 7
Neuroimmunology
STEVEN F. MAIER AND LINDA R. WATKINS
Disciplines within the biological and biomedical sciences have often been defined by organ systems: cardiology, audiology, and endocrinology are but a few examples. This has naturally led investigators to view the system that is the focus of their study as being disconnected from, and operating independently of, other systems. The tendency to think of systems as disconnected has nowhere been more apparent than in the fields of immunology and neuroscience. The immune system and the nervous system have been traditionally thought to operate independently of one another, and even after the research of the past several decades that has documented the close interplay and interaction between these two systems, the most recent texts in immunology do not even contain a reference to the nervous system in their indices, and conversely neuroscience texts typically do not contain even a reference to the immune system. However, living organisms are integrated wholes, and parts do not function in isolation. Because the nervous system is the command and integrative center of the organism, it would be most unlikely for important functions to operate without neural regulation. Functions such as digestion, excretion, the production of energy, and so on are all regulated by the nervous system. For example, the sympathetic nervous system regulates the rate of digestion, and adrenal hormones whose production is under the ultimate control of the hypothalamus regulate energy balance. Host defense against infection and injury is a key function needed for survival and involves widespread mechanisms throughout the body; therefore, it would be unlikely to operate without control from the nervous system. For the nervous system to exercise such control, it must “know” about the status and functioning of the immune system. Thus, it would be unlikely for neural activity to proceed independently of events in the immune system.
These two systems are indeed in close communication and potently modulate each other’s activities. The early research examining the possibility that the nervous and immune systems interact focused primarily on neural regulation of immunity. This emphasis was fueled by the discoveries that the activity of immune processes can be classically conditioned (Ader & Cohen, 1975) and that exposure to stressors can alter immune responding (Solomon, 1969). Since learning and stress are “in the brain,” these discoveries suggested that the brain must be able to regulate immune processes. For this reason, this area of investigation has become known as psychoneuroimmunology. These seminal findings were soon followed by work indicating that immune organs are innervated by the sympathetic nervous system (Felten et al., 1987) and that immune cells express receptors for a variety of hormones controlled by the brain (Fahey, Gure, & Munck, 1981), thereby providing pathways by which the brain can contact immune organs and cells and mediate the effects of conditioning and stressors. More recently, it has become clear that communication between the brain and immune systems is bidirectional, with products of the immune system potently regulating the nervous system and phenomena that are the result of neural activity. We focus on this arm of the brain-immune interaction in this chapter because it is immune-to-nervous system communication that has the greatest implications for understanding behavior.
THE IMMUNE SYSTEM
Defense against infection by microorganisms has been crucial for survival since the earliest periods of evolution. As a result, organisms have developed a complex array of defensive mechanisms to fight infection, and relatedly, to promote tissue repair after injury. If a microorganism succeeds in evading a set of passive bodily defenses (e.g., the epithelial surfaces of the body) it must first be recognized as “nonself” and/or “dangerous.” You are most likely familiar with what is called “adaptive” or “acquired” immunity. Here, several types of leukocytes (T-cells and B-cells) recognize specific molecular sites on foreign invaders called antigenic sites (an antigen is defined as a molecule that can
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
lead to the generation of antibodies). These leukocytes can do so because they express very specific antigen receptors on their surface. Because there are an enormous number of different antigens (roughly 10^6 for humans), there can be only a small number of T- or B-cells that express the receptor for any particular antigen. This small number of cells would not be sufficient for immune defense, and so the first step in adaptive immunity after antigen presentation to the T-cells is a clonal expansion of cells that express the antigen receptor in question. It takes 3 or 4 days to go through a large enough number of cell cycles to generate a sufficient number of T-cells that recognize the invading microbe, and only then can the effector processes that rid the organism of the invader develop.
Innate Immunity
Because it takes 3 or 4 days to even begin to generate the effectors of the adaptive immune system (cytotoxic T-cells, antibody, etc.) that can attack the bacteria, virus, and so on, the adaptive immune system cannot be the first line of immune defense. A more rapid defense is provided by the innate immune system. If a microorganism crosses an epithelial barrier, it rapidly encounters mononuclear phagocytes called macrophages (literally “big eaters”) that reside in submucosal tissues (see Figure 7.1). Macrophages mature continuously from monocytes that leave the circulation. As do T-cells and B-cells, they recognize nonself via surface receptors, but the macrophage receptors are quite different from those on the cells of the adaptive immune system. Instead of recognizing highly specific antigens, they recognize very general features of microorganisms called pathogen-associated molecular patterns (PAMPs). These PAMPs are general molecular motifs that are present on many microorganisms, but are not present in host tissue.
For example, viruses almost always express double-stranded RNA, and so macrophages have a receptor (TLR-3) that is ligated by double-stranded RNA. There are only roughly 20 different receptors on macrophages that are used to recognize PAMPs and discriminate self from nonself, rather than the 10^6 on cells of the adaptive immune system. Thus, all macrophages express all of the receptors, and so clonal expansion to generate more macrophages is not necessary. That is, macrophages are ready right away to engage in immune defense. The binding of microorganisms by the macrophage receptors initiates a number of responses. First, some of the receptors lead the macrophage to engulf the microorganism, leading to its destruction. A variety of toxic substances are also produced (e.g., nitric oxide) that can kill microbes that are engulfed and that are near the macrophage. This process of phagocytosis is very rapid, beginning almost
Figure 7.1 A macrophage extending toward a bacterium.
immediately after contact with an invader. Second, other receptors initiate signaling cascades that produce substances that initiate and maintain an inflammatory response at the site of the infection. Inflammation plays three roles: it (1) delivers additional cells (e.g., neutrophils) that can aid in killing, (2) produces a physical barrier that prevents the spread of infection to the bloodstream, and (3) aids in tissue repair. The most important of the substances that are produced are the pro-inflammatory cytokines, chemokines, and prostaglandins. These alter the local blood vessels and cooperate to lead a number of cell types to migrate to the site of infection, producing the redness, swelling, heat, and pain characteristic of local inflammation. Cytokines are small soluble proteins, first discovered to be released by immune cells, which diffuse away from the producing immune cell and serve as communication molecules between cells. The most important cytokines for producing local inflammation are interleukin-1 (interleukin stands for “between leukocytes”), interleukin-6, and tumor-necrosis factor alpha (IL-1, IL-6, TNF). Chemokines are chemoattractant proteins that stimulate the migration and activation of cells. There are many excellent textbooks you can consult to learn more about the immune system (e.g., Janeway, Travers, Walport, & Shlomchik, 2005). To this point, only local responses at the site of infection have been described. However, the very same cytokines that are critical for local innate immune reactions orchestrate a complex set of changes throughout the body that aid
in protecting the host. This widespread set of changes is often called the acute-phase response (APR; Baumann & Gauldie, 1994). The number of circulating leukocytes increases (leukocytosis), plasma ion levels change (e.g., iron), the liver shifts from producing the proteins that it normally makes (e.g., carrier proteins) to proteins that aid in fighting infection (e.g., haptoglobin), and so on. These changes are all adaptive. For example, plasma iron is reduced, which deprives bacteria of the iron they need in order to replicate. Again, these changes all occur within a few hours of infection.
ROLE OF THE BRAIN IN HOST DEFENSE
The APR and Sickness
Up to this point, none of the host responses to infection described have involved the brain. However, the fever that occurs during infection has long been considered to be part of the APR. Fever is an adaptive defensive mechanism because many microorganisms replicate poorly at elevated core body temperatures, and a number of enzymatic processes involved in the killing and removal of invading pathogens increase in activity at elevated temperatures (Kluger, Kozak, Conn, Leon, & Soszynski, 1996). The important point here, though, is that fever is mediated by the brain. Fever occurs because the set point of temperature-sensitive cells in the hypothalamus is raised, thereby leading to behaviors designed to drive up core body temperature (huddling to conserve heat, shivering to produce heat, etc.; Boulant, 2000). A number of other brain-mediated changes have come to be recognized as being part of the host response to infection. These include alterations in sleep patterns, reductions in activity, a loss of interest in social and sexual activities, increased sensitivity to pain, reduced food and water intake, and altered cognitive functions such as impaired hippocampal-dependent memory formation, among others (Dantzer, 2004). This pattern, which should be familiar to anyone who has had the flu, has been called “sickness.” In addition, there are several brain-mediated changes that are less obvious. The most important are activation of the hypothalamo-pituitary-adrenal (HPA) axis and the sympathetic nervous system (Berkenbosch, de Goeij, Rey, & Besedovsky, 1989). That is, there is a classic physiological “stress response” during sickness.
Adaptiveness of Sickness
It is important to understand that these sickness responses are not pathological or a reflection of weakness or debilitation produced by infection, but rather represent an adaptive evolved strategy to help combat the infection (Hart, 1988).
For example, fever is quite energy intensive, requiring an extra 10% to 13% of metabolism for each degree rise in core body temperature. Thus, it would make sense for the sick organism to use its available energy stores to fight the infection. Reduced activity, reduced social behavior, and so on can be viewed as changes that reduce the energy used by behavior. Foraging for food would also use considerable energy and would expose the organism to an increased risk of predation during the illness, a time of decreased defensive abilities. Finally, the stress response that occurs can be understood in this context. The peripheral endpoints of the HPA axis and the sympathetic nervous system, cortisol and catecholamines, respectively, function to liberate energy from bodily stores, converting glycogen to glucose, and so forth. See Maier and Watkins (1998) for a more extended form of this argument.
The manner in which interfering with hippocampal-dependent memory formation could be adaptive is not obvious. It may not be adaptive per se. However, a plausible argument can be made. Some forms of memory depend critically on the hippocampus, and some do not. In particular, the hippocampus is involved in the formation of memories involving places or contexts, as opposed to discrete cues such as a tone or a light (Squire, 2004). Thus, for example, if a footshock is paired with a tone in some apparatus, fear will become conditioned to both the tone and the context of the apparatus so that fear responses will occur later to either the tone or the apparatus context. However, the memory of fear to the context requires the hippocampus, but memory of fear to the tone does not. It is thought that this is because the hippocampus is the part of the brain that forms representations of places and contexts (Rudy & Sutherland, 1995). Thus, animals and humans learn to be fearful of and, consequently, to avoid places where aversive events such as footshocks have occurred.
This makes good adaptive sense. If a predator has been encountered or detected in a particular place, it is likely that the predator will be there again, and so the place is a good one to be wary about and perhaps avoid altogether. If an organism becomes ill or sick rather than being externally threatened, the symptoms will still appear while the individual is in some context or place, perhaps while it is foraging away from its home territory. However, it is likely that this place had nothing to do with producing the sickness or illness, particularly since the encounter with the microbe or poison will have been several hours before symptom manifestation. Thus, it would not be adaptive for the organism to become fearful of, or to avoid the place where sickness symptoms occurred. Indeed, organisms do not readily learn to avoid places where they became ill, but they readily learn to avoid tastes that precede illness (Garcia, Brett, & Rusiniak, 1989). This pattern
would be accomplished if the hippocampus were taken “off line” during sickness.
Motivational System
Each of these aspects of sickness could be an independent brain-mediated response to microbial invasion of the body. However, it has been convincingly argued (Dantzer et al., 2005) that instead, they are the reflection of a common motivational change. That is, sickness is a motivational state (a central state that organizes the actions of the individual) that then drives these behaviors. That the innate immune response to infection produces brain-mediated changes that involve a shift in motivational state has a number of important implications. First, there must be mechanisms by which the innate immune system can signal the brain, informing the brain that viruses, bacteria, and so on, have entered the body. If there were not pathways by which the immune system could communicate to the brain, the brain would have no way to respond to peripheral infection. Second, discrete areas of the brain must change activity during sickness, otherwise the behavioral and other alterations would not occur. Regions of the brain almost never regulate only a single activity or process, and thus other behavioral and psychological activities not generally thought to be part of sickness may well be profoundly affected during sickness. A number of poorly understood phenomena may be understandable in this way (see the following discussion). Moreover, motivational states are often thought to compete with each other in a hierarchy. For example, fear competes with hunger: a person does not want to forage for food while a threat is present. Since fighting infection and promoting repair is a life and death issue, sickness should command the motivational stage and alter the expression of many other motivational systems. The next several sections examine these issues.
Immune-to-Brain Communication Pathways
As noted previously, the fact that a number of components of sickness are mediated by the brain implies that the immune system must have a way to “talk” to the brain. From this perspective, in addition to its other well-known functions, the immune system is a diffuse sense organ, scattered throughout the body, whose job it is to inform the nervous system about peripheral infection and injury (Blalock, 1984). This raises two key questions. First, what are the signals that arise from the immune system that serve to communicate to the CNS, and second, by what mechanism(s) do they communicate? The very same pro-inflammatory cytokines that participate in the orchestration of the peripheral inflammatory response also initiate immune communication to the CNS.
This conclusion comes from findings indicating that (a) the peripheral administration of IL-1 in the absence of infection produces many of the brain-mediated sickness responses (Dinarello, 1991) and (b) peripheral blockade of pro-inflammatory cytokine receptors and the peripheral immunoneutralization and/or inhibition of synthesis of these cytokines reduce or eliminate brain-mediated sickness responses as well as the activation of brain structures during infection (Bluthe, Dantzer, & Kelley, 1992). That is, the CNS then does not “know” that an infection is present. Of the cytokines, IL-1 appears to play the most prominent role. However, the manipulations just discussed are not always effective (Dunn & Swiergiel, 1998), so other molecules may also be involved. This leaves the question of exactly how cytokines such as IL-1 signal the CNS. When considering this issue, a number of facts need to be kept in mind. First, mRNAs for IL-1 and other cytokine receptors are widely distributed in the brain (Ericsson, Liu, Hart, & Sawchenko, 1995) and receptors are present on both neurons and glial cells. Second, activation of the innate immune response leads to the de novo synthesis of IL-1 within the brain. That is, cells within the brain make (Buttini & Boddeke, 1995) and release (Ma, Chen, Oliver, Horvath, & Phelps, 2000) IL-1 in response to peripheral infection. Furthermore, simply injecting IL-1 peripherally leads cells within the brain to synthesize IL-1 (Hansen, Taishi, Chen, & Krueger, 1998). This brain IL-1 production is largely by microglial cells (Van Dam, Bauer, Tilders, & Berkenbosch, 1995; see discussion that follows). The nature of the signal(s) within the brain that leads microglia to become active in response to peripheral infection is still being explored, but adenosine triphosphate (ATP) and heat shock proteins (HSPs) are likely candidates (Mingam et al., 2007).
Because IL-1 and other cytokine receptors are expressed on cells within the brain, the most obvious idea would be that cytokines accumulate in the bloodstream during infection, travel to the brain in the blood, then enter the brain parenchyma from the cerebral vasculature and then bind to their receptors. The difficulty is that cytokines are fairly large (IL-1 is roughly 15 kDa) and hydrophilic, and so would not be expected to be able to passively diffuse across the tight junctions involved in the cells of the cerebral blood vessels (the blood-brain barrier). However, a number of other mechanisms allow bloodborne cytokines to signal the brain: (a) Cytokines enter the brain from the blood at the circumventricular organs (Blatteis, 1990), such as the organum vasculosum lamina terminalis (OVLT), where the blood-brain barrier is absent or weak. Since cytokines cannot diffuse very far, they bind to receptors on astrocytes and other cells resident in the region, leading to the production of prostaglandins that can then travel to neural targets (Lin & Lin, 1996). Circulating
pathogens themselves can bind to PAMP-recognizing receptors on macrophage-like cells residing in these structures (Lacroix, Feinstein, & Rivest, 1998). (b) Even though cytokines cannot passively diffuse across the blood-brain barrier, there are active transport mechanisms that carry cytokines across the barrier (Banks, 2005) where they can bind with receptors on cells near the blood vessels, such as perivascular macrophages. These macrophages then produce products, such as prostaglandins and nitric oxide, that can activate neurons. (c) Cytokines such as IL-1 bind to receptors on the luminal side of cerebral blood vessels, inducing the production of soluble mediators within the vessels, such as prostaglandins, which then diffuse into the brain parenchyma (Konsman, Vigues, Mackerlova, Bristow, & Blomqvist, 2004). Although the evidence for each of these blood-borne routes of communication is convincing, there are aspects of immune-to-brain signaling that these mechanisms cannot readily explain. First, the brain’s initial response to peripheral immune activation is rapid, with activation of brain structures. For example, Ericsson, Kovacs, and Sawchenko (1994) observed nuclear Fos protein (the gene c-fos is an immediate-early gene that is used as a neuronal activation marker) at 1 hour, the first time point tested. Since it takes considerable time for the Fos protein to be produced after the c-fos gene is activated, communication to the brain must have occurred within minutes. The blood-borne mechanisms almost certainly take longer than this to operate. Second, brain-mediated responses occur after the peripheral injection of quantities of immune activators that are too small to produce increases in blood levels of cytokines or other measured mediators. For example, the injection of very small quantities of lipopolysaccharide (LPS) that do not elevate peripheral circulating cytokines nevertheless produces fever (Hansen, O’Connor, Goehler, Watkins, & Maier, 2001).
LPS is a constituent of the cell walls of gram-negative bacteria and is recognized by one of the PAMP-recognizing receptors (TLR-4). Third, it is difficult to understand how blood-borne cytokines can provide the brain with detailed information, such as the site of infection. All of these suggest that there ought to be neural as well as blood-borne mechanisms of immune-to-brain communication. Indeed, if the idea that the immune system functions as a sense organ is more than whimsical, then there ought to be communication from the immune system to the CNS over peripheral nerves—sense organs communicate to the CNS via nerves. The search for such a peripheral nerve can begin by inquiring as to whether there is a nerve that innervates sites in the body where immune responses happen (e.g., lymph nodes) and that sends afferent fibers to the brain. The vagus nerve is prominent in this regard
because it innervates sites where pathogens enter the body (e.g., the lungs, the peritoneal cavity) and organs important for innate immunity (e.g., the liver). Indeed, peripheral immune activation, such as is produced by an intraperitoneal injection of LPS, activates afferent vagal nerves (Goehler, Gaykema, Hammack, Maier, & Watkins, 1998). Sensory structures associated with afferent vagal terminals (paraganglia) express IL-1 receptors, providing a mechanism by which peripheral cytokines can activate the vagus (Goehler et al., 1997). The vagus terminates in the nucleus tractus solitarius (NTS) in the brain stem, and during infection a neural cascade of activation spreads from the NTS to regions to which it projects (e.g., the parabrachial nucleus; Ericsson et al., 1994). Furthermore, electrical stimulation of afferent vagal fibers produces neural changes characteristic of sickness (Roosevelt, Smith, Clough, Jensen, & Browning, 2006). Supporting the importance of vagal signaling, severing the vagus often reduces or eliminates the brain activation (Wan, Wetmore, Sorensen, Greenberg, & Nance, 1994) and brain-mediated sickness responses (Watkins et al., 1994) produced by injection of LPS, IL-1, and so on. However, vagotomy does not always do so. For example, vagotomy blocks the fever produced by low, but not high, doses of LPS (Hansen et al., 2001). The most reasonable conclusion to date is that the vagus is an especially important signaling route early during infection, before high blood levels of cytokines have developed. Once blood levels are high, blood-borne routes are especially important. There also appears to be specificity of signaling to different regions of the brain, so that some behavioral endpoints lean more heavily on vagal signaling than do others. For example, Konsman, Luheshi, Bluthe, and Dantzer (2000) reported that severing the vagus blocks the reduction in social exploration, but not the fever produced by the very same dose of peripheral LPS.
Finally, it should be noted that the vagus is not the only peripheral nerve that can signal the occurrence of immune activation. The vagus does not innervate regions such as the skin and oral cavity, and other nerves that innervate these regions may function to communicate immune activation to the CNS (Romeo, Tio, & Taylor, 2003). In sum, the immune system communicates to the CNS over multiple pathways. This multiplicity of communication routes likely functions to provide the CNS with detailed information concerning the immune status of the body, rather than simply that an infection is, or is not, present. Indeed, the pattern of brain activation differs after the injection of different immune activators. For example, Serrats and Sawchenko (2006) have reported that the peripheral injection of the bacterial superantigen staphylococcal enterotoxin B (SEB), which produces a peripheral cytokine pattern different from LPS, produces a different
pattern of brain activation than does LPS. Thus, it may be that the brain is able to construct an “image” of immune patterns in the periphery. The bidirectional pathways between the brain and immune system are depicted in Figure 7.2.
Figure 7.2 Bidirectional pathways between the brain and the immune system. Note. A and B schematize the major outflow pathways from the brain to the immune system, the autonomic nervous and hypothalamo-pituitary-adrenal systems, respectively. C schematizes communication from the immune system to the brain. [Figure labels: CRH; pituitary; ACTH; β-endorphin, prolactin, GH, and other hormones; adrenal; CORT; autonomic nervous system; NE, Epi, enkephalins, SP, NPY; spleen, thymus, and other immune organs; immune cells (B and T cells, Mφ, etc.); pro-inflammatory cytokines (IL-1, TNF, IL-6).]
Brain IL-1
Microglia
As already noted, immune signals to the CNS lead cells in the brain to synthesize IL-1. That is, peripheral IL-1 “begets” central IL-1. This is a most unusual arrangement that has important implications. The IL-1 induction occurs largely in glial cells, with the initial response being most prominent in microglia (Van Dam et al., 1995). Because brain IL-1 is critical in immune-brain interactions (see the discussion that follows), microglia are key cells in these processes. This is entirely fitting because microglia can be considered to be the brain’s resident immune cells, or at least “immune-like.” Microglia constitute 6% to 12% of all of the cells in the CNS and are of hematopoietic origin (Cuadros & Navascues, 2001), as are peripheral immune cells. In the adult, microglia arise from two sources: resident microglia continue to divide throughout life, and peripheral blood monocytes migrate into the CNS and mature into microglia. Microglia express many or all of the same receptors as do peripheral macrophages, and thus can recognize bacteria, viruses, and so on. They constantly monitor the CNS environment and are responsible for immune surveillance within the CNS (Nimmerjahn, Kirchhoff, & Helmchen, 2005). When they detect an invader, signs of cell damage via heat shock protein receptors, or other changes in the brain microenvironment, they change their morphology and function. When microglia are not stimulated, they have a highly ramified stellate morphology with little or no expression of activation markers such as CR3. They have often been described as appearing to be “damped” macrophages, held in check by the CNS microenvironment (Kreutzberg, 1996). When microglia detect danger, they change morphology, upregulate a variety of cell surface receptors, and synthesize and release a variety of pro-inflammatory mediators such as IL-1, prostaglandins, reactive oxygen species, and chemokines. That is, they initiate an inflammatory response in the CNS. Although it is common to think of microglia as being in either an inactivated or an activated state, microglia can actually occupy a continuum of activational states. Of particular interest in the following discussion, microglia can be in a state that has been called “primed” (Perry, Newman, & Cunningham, 2003). In this state, they have a quiescent morphology and do not express upregulated activation markers. However, if stimulated, the microglia overproduce inflammatory products such as IL-1. Microglia can remain in this primed or sensitized state for prolonged periods of time (Felton & Perry, 2005).
Brain IL-1 as a Mediator
The fact that IL-1 is made in the CNS in response to peripheral immune activation is more than a curiosity. Importantly, the IL-1 that is synthesized in the brain is a key mediator of brain-mediated sickness responses. Many sickness responses can be prevented by injecting an IL-1 receptor antagonist (IL-1ra) into the brain (Maier, Watkins, & Nance, 2001). However, because IL-1 can enter the brain in small amounts, this finding could also be explained if IL-1ra prevents sickness responses by blocking the binding of IL-1 that has entered from outside the brain. It is also the case that injecting IL-1 into the brain produces most of the brain-mediated sickness responses (Maier et al., 2001), but this also does not conclusively
implicate brain-made IL-1. IL-1, like many peptides, is synthesized as a larger prohormone. Mature IL-1 is formed by cleavage of the prohormone by an enzyme called the IL-1 converting enzyme (ICE). Thus, inhibiting the actions of ICE would prevent new IL-1 from being manufactured. The injection of ICE inhibitors into the brain would prevent the brain from making new IL-1, but would have no effect on IL-1 that has entered from outside the brain. In the few experiments that have been conducted, intracerebral injection of ICE inhibitors has blocked or reduced brain-mediated sickness responses (Imeri, Bianchi, & Opp, 2006). More remarkably, IL-1 within the brain is also involved in the mediation of peripheral innate immune responses. For example, the blockade of IL-1 receptors in the brain reduces the gastrointestinal hypomotility produced by peripheral LPS (Plaza, Fioramonti, & Bueno, 1997). Conversely, elevating IL-1 in the brain can stimulate peripheral immune changes. Injecting IL-1 into the striatum leads the liver to manufacture acute phase proteins (Campbell et al., 2005). Since peripheral IL-6 is critical in regulating the liver APR, it would then be expected that brain IL-1 would lead to increases in blood levels of IL-6. Indeed, the intracerebral administration of IL-1 produces large increases in blood levels of IL-6, as well as IL-1. This increase in plasma IL-6 is not mediated by the injected IL-1 leaking to the periphery because a peripheral injection of the same small amount of IL-1 did not increase plasma IL-6. Furthermore, intracerebral administration of IL-1ra blocked the effects of intracerebral IL-1 (De Simoni et al., 1993), indicating that brain IL-1 receptors are the site at which the increase in blood IL-6 (as well as IL-1) is mediated. Thus, brain cytokines “beget” peripheral cytokines, as well as the other way around. 
Figure 7.3 The complete brain-immune loop, with IL-1 in the brain regulating behavior and initiating an outflow from brain to the periphery. [Figure labels: LPS; macrophage; IL-1; liver; paraganglia; vagus; NTS; Hypo; sickness behaviors; fever; HPA activation; hyperalgesia.]
Although the precise pathways are not well understood, it is clear that IL-1 in the brain initiates a communication process to the periphery
that signals the immune system. This arrangement is schematized in Figure 7.3. There is also an anti-inflammatory pathway from the brain to the immune system. The activation of efferent vagal fibers originating in the dorsal motor nucleus of the vagus can inhibit the release of peripheral pro-inflammatory cytokines via a cholinergic mechanism (Tracey, 2007). Finally, microglia can also release anti-inflammatory products (Rasley, Tranguch, Rati, & Marriott, 2006). Thus, there is truly bidirectional communication between the CNS and the innate immune system, involving both pro- and anti-inflammatory influences, with the balance between them doubtless critical.
Downstream Mediators
The fact that IL-1 and other cytokines within the brain are critical to the mediation of a variety of behavioral phenomena does not mean that IL-1 exerts these effects through a direct action on neuronal activity. Consider the memory impairments that are produced during peripheral infection. It is known that IL-1 induced in the hippocampus mediates the memory impairment because blocking IL-1 receptors in the hippocampus prevents the memory impairment from occurring (Pugh et al., 1999) and the microinjection of IL-1 into the hippocampus produces memory impairment (Barrientos et al., 2002). However, IL-1 may alter memory not because increases in IL-1 directly interfere with memory consolidation, but because IL-1 influences other molecules that are critical in memory formation. For example, the neurotrophin brain-derived neurotrophic factor (BDNF) within the hippocampus is critical in forming long-term hippocampal memories. Experiences that engage the hippocampus, such as contextual fear conditioning, induce the production of BDNF within specific layers of the hippocampus (Hall, Thomas, & Everitt, 2000). Moreover, this BDNF increase is necessary to form long-term hippocampally based memories (Alonso et al., 2002).
This is noted because high levels of IL-1 in the hippocampus prevent the learning experience from increasing BDNF (Barrientos et al., 2004), and this may be the basis for the memory impairment, not the increased IL-1 per se. It is not known how IL-1 interferes with BDNF induction. However, the prostaglandins provide an intriguing possibility. The binding of IL-1 to its receptor activates the nuclear factor-kappa B (NF-kappaB) intracellular signaling pathway. NF-kappaB is a transcription factor that, among other things, increases the expression of the COX-2 enzyme. COX-2 induction, in turn, leads to the production of prostaglandins such as PGE2 (Hoozemans, Veerhuis, Janssen, Rozemuller, & Eikelenboom, 2001). There are four known subtypes of prostaglandin receptors (EP1–EP4), and one, the EP3 receptor, is primarily localized to neurons and is abundantly expressed in the hippocampus (Sugimoto & Narumiya, 2007).
The EP3 receptor inhibits cAMP production, and BDNF gene transcription is dependent on the cAMP-CREB pathway. Thus, PGE2 may decrease BDNF levels by binding to the EP3 receptor and reducing intracellular cAMP. The scenario would then be one in which IL-1 interferes with memory because it interferes with experience-dependent BDNF induction in the hippocampus, and IL-1 interferes with BDNF because it induces prostaglandins. Indeed, as this hypothesis would predict, infusion of PGE2 into the hippocampus after a learning experience interferes with both the induction of BDNF and the formation of memory, and the pharmacological blockade of prostaglandin production in the hippocampus prevents the memory impairment normally produced by intra-hippocampal injection of IL-1 (Hein et al., 2007). This discussion illustrates the complexity of the cascades involved. Indeed, the previous discussion is likely a gross oversimplification because IL-1 also activates mitogen-activated protein kinase (MAP kinase) pathways, and these are also involved in the effects of IL-1 on plasticity (Kelly et al., 2003). There are doubtless many pathways and interactions initiated by IL-1 that mediate the behavioral changes that occur. It is likely that different behavioral endpoints (e.g., reduced food and water intake instead of memory formation) are mediated by different molecular cascades of IL-1-initiated events. However, it is interesting to note that the reduction in sexual behavior that is produced by immune activation is also blocked by prostaglandin inhibitors (Avitsur, Weidenfeld, & Yirmiya, 1999).
Basal versus Elevated IL-1
The discussion to this point has considered the impact of levels of IL-1 in the brain that are elevated above basal levels. However, basal IL-1 function may be necessary for the proper function of some of the very processes that are impaired by high levels.
For example, memory is actually impaired if basal IL-1 function is removed, either by genetic deletion or by receptor blockade in the absence of manipulations that increase IL-1 above normal (Goshen & Yirmiya, 2005). Thus, the relationship between IL-1 and memory really follows an inverted U, with memory impaired when IL-1 function is either below or above basal levels, and this may be the case for endpoints other than memory.
Relationship between Peripheral and Brain IL-1
Before turning to a consideration of what all of this implies for phenomena of interest to behavioral scientists, it is useful to speculate on why there is this peculiar relationship between peripheral and brain IL-1, because doing so helps rationalize what is to follow. Elements of the innate immune system are phylogenetically quite old, whereas the adaptive immune system is a more recent evolutionary development. Even organisms as primitive as sponges (the most primitive multicellular
organisms) can distinguish self from nonself, and contain phagocytic cells that can defend against invading microorganisms. Cells called amoebocytes in these invertebrates accomplish phagocytosis and PAMP recognition. These cells migrate to sites of infection in organisms such as mollusks and contain enzymes that are very similar to those in vertebrate macrophages. Importantly for the present discussion, amoebocytes synthesize and release cytokines such as IL-1 (Beck et al., 1993), and in invertebrates, they orchestrate innate immune defense (Clatworthy, 1998). In these organisms, the amoebocyte-produced cytokines communicate with and regulate neural processes, altering neural excitability (Clatworthy & Grose, 1999). This cytokine-to-neural tissue communication is involved not only in defense against pathogens, but also in defense against predators or external threats. For example, organisms such as mollusks identify predators or external threats via contact with their body surface. This body contact elicits withdrawal reflexes and locomotion, moving the body surface away from the threat. The cytokines released by amoebocytes can sensitize these withdrawal reflexes by increasing the excitability of the neurons involved (Clatworthy, 1998). The amoebocytes, in turn, receive signals from the neural tissue, providing bidirectional immune-neural communication even in these primitive organisms. Cells within the mollusk neural structures express cytokine-like molecules, and so cytokines might communicate in both directions. It is important to understand that organisms such as mollusks have a series of separate ganglia rather than a discrete brain, and there is no blood-brain or blood-ganglia barrier. That is, immune-derived cytokines can communicate directly to neural tissue in the service of host defense in these organisms. With the development of a blood-brain barrier in vertebrates, IL-1 released by immune cells could no longer communicate directly with neural tissue.
This may help to explain the very peculiar arrangement in vertebrates in which peripheral IL-1 induces the production of the very same molecule in the brain. Under this arrangement, when immune cells release IL-1, IL-1 still makes contact with neural tissue, but it is IL-1 that has been induced within the brain. One final point with regard to the role of cytokines in organisms such as mollusks: As previously noted, host defense requires the production of energy. In mollusks, the amoebocyte and cytokines are critical to this process. In mammals, the hypothalamus is key, with the hypothalamus being activated during infection by IL-1 (Schiltz & Sawchenko, 2007). The hypothalamus (a) releases corticotropin-releasing hormone (CRH) into the portal blood; the CRH travels to the pituitary gland, where it stimulates the release of adrenocorticotropic hormone (ACTH), which in turn stimulates the adrenal glands to release glucocorticoids into the bloodstream, the glucocorticoids then leading to energy production, and
(b) initiates sympathetic activation that releases catecholamines into the blood, then leading to energy production. Mollusks do not have hypothalami, pituitaries, adrenals, or a sympathetic nervous system. However, all of these molecules are contained within the amoebocyte, and the amoebocyte releases glucocorticoids and catecholamines under the control of IL-1 (Ottaviani & Franchini, 1995).
IMPLICATIONS OF IMMUNE-BRAIN RELATIONSHIPS FOR BEHAVIOR
In considering the implications of immune-brain relationships for behavior, we begin by discussing behavioral changes that occur during infection. We then move to a discussion of whether what has been described has any implications for circumstances other than infection.
Infection-Induced Changes
As noted, microbial stimulation of innate immune cells initiates a cascade that communicates the presence of infection to the brain, with the brain then orchestrating a coordinated set of sickness responses that include increased sleep (particularly NREM sleep), social withdrawal, decreased food and water intake, impaired cognitive function, and so on. It is interesting that in other research domains many of these behaviors would be viewed as indicative of either depressed mood or anxiety. For example, the tendency to engage in social interaction with a conspecific is a well-validated “animal model of anxiety” (File & Seth, 2003). Manipulations that decrease anxiety in humans (e.g., anxiolytic drugs such as benzodiazepines) increase interaction, and manipulations that increase anxiety (e.g., anxiogenic drugs such as beta-carbolines) decrease interaction. Peripheral immune activation, as mentioned, decreases social interaction (Yirmiya, 1996). For these reasons, a number of investigators have explored whether peripheral immune activation might also produce other behavioral changes that are taken to indicate the presence of anxiety or depressed mood. Anxiety is often defined as fear in the absence of clearly threatening stimuli. Thus, assessments of anxiety typically involve the presentation of ambiguous or potentially threatening situations that are not overtly dangerous, with the behavioral measure being the animal’s tendency to avoid the potentially threatening circumstance.
For example, in the elevated plus maze (EPM) there are four arms at right angles from a central area, with two being enclosed and two open (no sides or top). The entire maze is raised above the floor, so there is a potential drop from the open arms. Being in the open puts
animals at greater risk, but animals also have a tendency to explore, so a typical subject (rat or mouse) spends considerable time in both open and closed arms when first placed in the apparatus. Manipulations that increase anxiety decrease the time that subjects spend in the open arms, and manipulations that decrease anxiety do the reverse (Carobrez & Bertoglio, 2005). The peripheral injection of immune activators such as LPS, or the administration of IL-1 itself, decreases time spent in the open arms (Swiergiel & Dunn, 2007). As for depressed mood, a cardinal symptom of depression is anhedonia, a loss of interest in normal pleasures. Yirmiya and his colleagues have conducted an extensive series of studies exploring whether peripheral immune stimulation leads to behavioral changes that indicate a loss of pleasure. These involve an examination of sexual behavior and preference for sweet tastes. The peripheral administration of LPS and/or IL-1 reduces sexual behavior in females (Avitsur & Yirmiya, 1999) and the rat’s normal preference for sweet solutions (Yirmiya, 1996). Immune activation does reduce motor activity and fluid intake in general. However, a simple reduction in motor activity does not explain the reduced sexual behavior, because immune activation alters quite specific aspects of sexual behavior. Nor can a reduction in drinking explain the sweet solution data, because in these experiments the rats have two drinking tubes, one containing water and one the sweet solution, and immune stimulation reduces the percentage of total fluid intake that is the sweet solution. It can also be noted that these effects of immune activation can be prevented by chronic treatment with antidepressant drugs such as the SSRIs (Yirmiya et al., 2001).
Nevertheless, the difficulty of disentangling effects that reflect anxiety or depressed mood from general reductions in motor activity and reduced food and water intake has been noted (Swiergiel & Dunn, 2007), and caution should be exercised. However, it has been demonstrated that low-level infection of mice with the bacterium Campylobacter jejuni, which does not produce overt signs of sickness, nevertheless leads to anxious behavior (Goehler, Lyte, & Gaykema, 2007). Research in this area with humans has the great advantage that anxiety and depressed mood can be assessed in ways that are not confounded with the behavioral changes, such as reduced motor activity, that are characteristic of sickness. Very low doses of LPS, so low that humans cannot even discriminate whether they have been administered LPS or the vehicle, produce self-reports of increased anxiety (Krabbe et al., 2005). The most extensive studies here have been conducted in the context of interferon-alpha (IFN-α) administration. IFN-α has antiviral and antitumor properties and is used in the treatment of hepatitis C infection and cancer.
IFN-α is itself a cytokine with properties that overlap those of the classic pro-inflammatory cytokines, and in addition, IFN-α induces the production of IL-1 (Taylor & Grossberg, 1998). Chronic IFN-α administration in these patients produces depressed mood that is so severe that roughly 50% of the patients come to meet the diagnostic criteria for major depression (Musselman et al., 2001). Importantly, prior administration of antidepressant drugs can prevent the depressogenic effect of the IFN-α treatment (Raison et al., 2007). Anxiety and depressed mood are mediated not in the periphery, but in the brain. This means that immune activation produces these phenomena because it leads to neural changes. Because peripheral cytokines induce the production of cytokines within the brain, it is natural to inquire whether it is these cytokines induced in the brain that produce anxious and depressed behavior. IL-1 in the brain leads to alterations in serotonergic, noradrenergic, and dopaminergic function (Dunn, Wang, & Ando, 1999), thereby providing a tie to neurotransmitter systems traditionally thought to be involved in these processes. The research to date directed at determining the role of cytokines within the brain in anxiety- and depression-related behaviors produced by infection reveals a complex picture. For example, the reduction in social interaction produced by peripheral immune activation is prevented by blockade of IL-1 receptors in the brain, but the reduction in food and water intake is only partially blunted (Kent et al., 1992). The effects of brain IL-1 receptor blockade may also depend on the particular immune stimulus (Bluthe et al., 1992). This issue is the topic of current investigation and will be further discussed in the context of stress-induced, rather than infection-induced, behavioral change.
Findings that peripheral immune activation in animals and humans leads to depressed mood and anxiety lead naturally to an inquiry into whether peripheral immune activation and its consequent induction of brain cytokines might be involved in mood and anxiety disorders. This issue is beyond the scope of this chapter and has been the subject of numerous reviews (e.g., Anisman, Merali, Poulter, & Hayley, 2005).
Microglial Priming
The research reviewed previously indicates that peripheral inflammatory events initiate a process that ultimately alters many aspects of behavior, mood, and cognition, and that the induction of cytokines within the brain is critical to the neural cascade that mediates these changes. Any organismic or other variable that would magnify this process would amplify the impact of infection or injury on these behavioral, emotional, and cognitive processes. It may
well be that exaggerated immune-CNS interactions are involved in the mediation of a number of poorly understood phenomena of societal importance. It will be recalled that cytokines within the CNS are manufactured primarily by glial cells, with microglia being more reactive than astrocytes. As described previously, microglia can be in a continuum of activation states, passing from inactivated, to primed or sensitized, to fully activated. In the fully activated state, microglia synthesize and secrete high levels of pro-inflammatory molecules such as IL-1 on a continuous basis, whereas in the primed state they do not do so, but will produce exaggerated levels if they receive stimulating input.
Aging
Many of the changes that occur with aging are poorly understood. There has been considerable interest in the role of brain cytokines in the cognitive declines that occur with aging, largely because elevated levels of cytokines are associated with neurodegenerative disorders such as Alzheimer’s disease (Cacquevel, Lebeurrier, Cheenne, & Vivien, 2004). However, with improvements in health care, many individuals are undergoing “normal healthy aging.” By the year 2030, roughly 20% of the population will be over 65 years of age. As life expectancy continues to increase, it is important to understand the factors underlying the decline in memory and cognition that occurs with normal aging, in addition to the processes associated with the more devastating pathological neurodegenerative disorders. Although there is disagreement about the extent to which memory and other cognitive functions decline during normal aging, there is agreement that the variability in individual performance increases with age, with some individuals suffering large declines (Laursen, 1997). In addition, vulnerability to cognitive declines associated with a variety of challenges, such as surgery and heart attacks, increases as people age.
It is commonplace for an aging individual to be functioning at a high level, but then to display poor cognitive function after surgery, for example, hip replacement, that is seemingly unrelated to cognition or the brain. Indeed, the term postoperative cognitive dysfunction (POCD) has been used to identify this phenomenon (Bekker & Weeks, 2003). These precipitous declines in mental function have been difficult to understand. However, consider the possibility that glial cells are primed by normal healthy aging. Events such as surgery are inflammatory and produce high levels of circulating cytokines such as IL-1 (Carter & Whelan, 2001). The research reviewed previously indicates that these peripheral cytokines will communicate to the brain, leading to a neural cascade that includes the activation of glial cells. But if the glial cells are primed by aging,
then the result should be exaggerated production of pro-inflammatory cytokines such as IL-1. This exaggerated production of brain IL-1 after peripheral inflammatory events such as surgery could then trigger the memory and other cognitive impairments that follow. So, do glial cells change with age? With normal aging, there are indeed characteristic changes in microglial and astrocytic morphology, expression of activation markers, and function. For example, aging increases the microglial expression of complement type 3 receptor (CD11b), major histocompatibility complex (MHC) Class II, leukocyte common antigen (LCA; CD45), and CD4 (Perry et al., 2003). Functional changes such as increased phagocytic activity have also been reported (Sheng, Mrak, & Griffin, 1998). In general terms, this pattern can be described as a loss of the normal suppression that is produced by the CNS microenvironment. There are numerous possible sources for this loss of suppression, such as neuroendocrine dysregulation, cumulative oxidative stress, and mitochondrial changes. Neurons inhibit glial function via cell-to-cell contact (Neumann, 2001), and neuronal loss or reduced function may, itself, increase glial activation. The contact suppression of microglial activation by neurons is accomplished by a number of proteins that are expressed on the surface of neurons and that bind to specific receptors for those proteins on microglia. For example, neurons express a glycoprotein identified as CD200 ligand, and it binds to a receptor, CD200R, that, within the brain, is expressed only by microglia (Hoek et al., 2000). Frank et al. (2006) reported that the expression of CD200 by neurons decreases in aging animals, providing a mechanism by which microglia are released from their normal inhibitory restraints. Whether CD200 is reduced because of neuronal loss or because of reduced neuronal function is not known. It should be noted, however, that there are likely a number of mechanisms involved in age-related glial changes.
For example, IL-10, an anti-inflammatory cytokine, decreases with age (Ye & Johnson, 2001), and IL-10 functions to inhibit glia. The important point is that glia do take on an activated morphology with age. This leaves the issue of whether the glial cells are fully activated or primed in animals that are aging, but not senescent. The results from studies that have examined pro-inflammatory cytokine expression with aging are somewhat inconsistent, but the available data suggest that pro-inflammatory cytokine expression is at most elevated to only a minor degree in aged humans (Sheng et al., 1998) or rodents (Kyrkanides, O’Banion, Whiteley, Daeschner, & Olschowka, 2001). IL-6 may be an exception because about a 20% increase in basal protein expression has been reported in older animals (Ye & Johnson, 2001). However, even 20% is a small fraction of the increases in brain IL-6 that occur after peripheral immune stimulation.
The morphological changes in glia with age described here do not, in themselves, indicate that there would be an exaggerated pro-inflammatory response to peripheral immune stimulation. To examine this issue, Barrientos et al. (2006) injected both aging and young rats with live, replicating E. coli bacteria. This bacterium causes only a mild infection, and E. coli is cleared in both old and young rats in less than 24 hours. Importantly, E. coli did not produce a larger peripheral increase in pro-inflammatory cytokines in aging than in young rats, so any differences in brain responses could not be attributed to an augmented peripheral response. Protein levels of IL-1 within the hippocampus increased rapidly after infection in both young and old animals, but in the young rats levels returned to baseline between 4 and 24 hours after infection. In aging rats, IL-1 levels were still elevated at 10, but not 14, days. This is an enormous difference in the duration of hippocampal IL-1 increase produced by peripheral infection. It is always possible that the IL-1 measured by Barrientos et al. (2006) did not derive from microglia. To further explore this issue, Frank et al. (2006) isolated microglia from the hippocampi of aging and young rats and studied them in vitro. The microglia from aging rats exhibited features of activation (e.g., increased expression of MHCII mRNA), but did not secrete more IL-1 than did microglia from young rats. However, when LPS was added to the culture to stimulate the microglia, the microglia from the aging animals produced much more IL-1 than did the microglia from the young subjects. Because there were no other cells in the culture, the microglia had to be the source of the IL-1, and because there were the same number of cells in culture for both groups, it had to be that individual microglia from aging rats secreted more IL-1 when stimulated. The foregoing makes clear that microglia can be primed or sensitized by aging.
If the sensitized IL-1 response that results from primed glia produces cognitive impairments such as memory deficits, then memory should be impaired in aging animals following infection or other inflammatory events for roughly the same duration as the IL-1 increase. To test this idea, aging and young rats of exactly the same ages and strain as used in the previous experiments were given either a peripheral infection with E. coli or vehicle injection. Learning tasks were conducted from 4 to 14 days later, with memory being tested 48 hours after learning. In young animals, exposure to E. coli had no effect at any of the E. coli-to-learning intervals. This was expected because the mild E. coli infection only increases hippocampal IL-1 for between 4 and 24 hours in young rats. In the aging animals, hippocampal memory formation was very poor when learning was 4 to 10 days after E. coli exposure. However, if 14 days intervened, memory was normal. Thus, the
time course of interference with the formation of long-term memory mirrored the time course of hippocampal IL-1 elevation. Importantly, memory for fear of the tone from the fear-conditioning task was not impaired, suggesting selectivity to the hippocampus. To determine whether these effects in the aging animals represented impairments of long-term memory formation, as opposed to deficits in learning or the processing of the information during the fear-conditioning task, short-term memory 1 hour after training was also assessed. Short-term memory was unaffected by prior E. coli exposure in the aging subjects. That is, fear of the context was perfectly intact 1 hour after conditioning, supporting the idea that it is the consolidation of long-term memories, a process that requires the hippocampus, that is impaired. Finally, inhibition of prostaglandin synthesis within the hippocampus has been reported to block the effects of E. coli exposure on memory in aging animals (Hein et al., 2007), supporting the conclusion that the same cascade as described earlier is involved in the effects of peripheral immune stimulation on memory in aging. The effects of infection on memory and hippocampal IL-1 levels in aging animals persisted for only 10 days in the E. coli experiments. However, E. coli exposure in rats produces a very mild infection, with fever persisting in the old animals for only 2 days. It may well be that a more powerful inflammatory stimulus would produce a much more prolonged increase in brain IL-1 in aging animals. There is also some evidence that inflammatory challenges may cumulate in aging animals (a “multiple hit” hypothesis). For example, surgery occurring 2 weeks after infection produces memory impairments for several months (Barrientos et al., unpublished data). 
Before leaving the discussion of aging, it should be noted that there is no reason to expect that the impact of glial priming should be restricted to the effects of infection and inflammation on cognitive processes. Indeed, Huang, Henry, Dantzer, Johnson, and Godbout (2007) have reported that a central administration of LPS produces exaggerated sickness behavior in old animals. The only cells in the brain that express the receptor for LPS (TLR-4 receptor) are microglial cells and perivascular macrophages, supporting the priming hypothesis. It should also be noted that any factors other than peripheral inflammation that activate glia should also have a potentiated effect in aged individuals. As discussed next, stressors may activate microglia, and brain IL-1 may mediate some of the effects of exposure to stressors. Thus, the experience of stressors would be expected to have an exaggerated impact on the aging individual.
Early Infection
Aging may be but one of many factors that prime glial cells. Maternal infection during the third trimester in
humans has been associated with a number of neuropsychological difficulties in adulthood. Brain development in the rat and human is quite different, and the rat is born at a much more immature stage. At roughly postnatal day 4, the rat is considered to be equivalent in development to the third trimester in humans. Bilbo, Biedenkapp, et al. (2005) infected rats with E. coli on postnatal day 4 and examined their behavior in adulthood. The rats appeared normal until challenged with a peripheral injection of LPS under conditions that had no effect on control subjects. Now, the early-infected rats showed hippocampal-dependent memory impairments identical to those that occur in aging rats. An examination of both glial morphology and IL-1 in the brain indicated that the early infection did prime glia into adulthood, and inhibition of IL-1 synthesis by an ICE inhibitor in the adult subjects blocked the deleterious effects of peripheral LPS on memory (Bilbo, Levkoff, et al., 2005). Clearly, the existence of glial priming would predict that a broad range of behaviors would be altered by early infection. Whether this is so is under investigation.
Stress
It is not obvious how stress would be related to the processes so far discussed in this chapter. Several factors led a number of investigators to explore whether cytokines within the brain might be involved in mediating some of the consequences of exposure to stressors. First, some of the behavioral consequences of stressor exposure appear to be quite similar to those of infection (Maier & Watkins, 1998). Second, both infection and stressors activate the HPA axis and sympathetic nervous system. This can be rationalized by considering that stressors or external threats evoke fight/flight, and fight/flight requires energy production, as does host defense against infection. As discussed previously, even the most primitive organisms engage in host defense against infection, and this involves the production of energy. 
Fight/flight evolved later, as it requires an organism complex enough to detect predators or threats at a distance, direct motor responses in complex ways, and integrate the two. Since evolution often works by cooptation of existing solutions to solve related problems (Gould, 1982), it may be that as the fight/flight response evolved it used the mechanism that was already present to produce energy—the sickness machinery. These considerations led a number of investigators to determine whether (a) stressors induce the production of cytokines in the brain as does infection, and (b) blockade of IL-1 or other cytokines in the brain would block the behavioral or endocrine effects of exposure to stressors. The literature is not extensive, but a variety of stressors do
lead to IL-1 increases in specific regions of the brain (O’Connor et al., 2003). This suggests that stressors activate glial cells, and Nair and Bonneau (2006) found that restraint increased microglial proliferation. The literature on the blockade of IL-1 in the brain is even less extensive, but intracerebral injection of IL-1ra has been reported to reduce both the endocrine (Shintani et al., 1995) and behavioral (Maier & Watkins, 1995) effects of exposure to stressors. Stressors only induce IL-1 increases for a brief period of time (Nguyen et al., 2000). However, stressors might be able to prime microglia for a longer period of time. Microglia isolated from the hippocampi of rats stressed 24 hours earlier do not produce more IL-1 than do microglia from nonstressed controls, but produce exaggerated levels of IL-1 when LPS is added as a stimulus (Frank, Baratta, Sprunger, Watkins, & Maier, 2007). That is, stress produces the same pattern of microglial changes as does aging. The duration of this glial priming is not known.
Cross-Sensitization
The existence of stress-induced glial activation and priming has a number of implications (see Figure 7.4). First, if both stressors and peripheral inflammatory events activate and prime microglia, then there should be cross-sensitization between stress and infection. That is, individuals that have experienced a stressor in the recent past should over-respond to infectious agents, and conversely, with the duration of such effects depending on the persistence of glial priming. Indeed, exposure to a stressor (inescapable tail shock in rats) has been shown to potentiate a variety of sickness-related responses to peripheral LPS (Johnson, O’Connor, Hansen, Watkins, & Maier, 2003), with the sensitization persisting for 4 to 10 days after only a single stressor session. For example, the HPA response to LPS is potentiated for 4, but not 10, days following tail shock (Johnson, O’Connor, Deak, Spencer, et al., 2002). 
Conversely, immune activation also sensitizes responses to stressors. For example, a single peripheral administration of IL-1 sensitizes the HPA response to novelty 22, but not 42, days later (Schmidt, Aguilera, Binnekade, & Tilders, 2003). The existence of cross-sensitization between stressors and immune stimuli does not indicate that glial priming and/or brain cytokines are key mediators of the process. There has been only a small amount of research directed at this issue. However, three findings do support this possibility. First, exposure to tail shock exaggerates the increase in brain IL-1 that is produced later by LPS (Johnson, O’Connor, Deak, Stark, et al., 2002). Second, the intracerebral administration of the IL-1 receptor antagonist blocks the sensitizing effect of tail shock to later LPS (Johnson, O’Connor, Watkins, & Maier, 2004), and third, the intracerebral injection of IL-1 produces sensitization to subsequent LPS (Johnson et al., 2004). Thus, IL-1 in the brain is both necessary and sufficient to produce cross-sensitization, but the role of glia has not been explored.
Figure 7.4 A variety of conditions can prime the glial response to normal input, thereby producing exaggerated cytokine responses and the products of brain cytokines. (Diagram labels: Signal: ATP, HSPs, etc.; Cytokines; Behavior, mood, and cognition; Aging, early infection, stress.)
Stress and Peripheral Responses
The idea that stressors and stimuli to the immune system ultimately act on overlapping neural circuitry involving IL-1 has another implication. This implication is that stressors should produce some of the same peripheral changes as occur during infection because the induction of IL-1 in the brain leads to signals to the periphery that affect peripheral immune function (see previous discussion). Indeed, stressors as mild as exposure to a novel environment increase plasma IL-6 (LeMay, Vander, & Kluger, 1990), and more potent stressors lead to increases in plasma IL-1 as well (Johnson et al., 2003). As previously noted, peripheral IL-6 is a primary mediator of the acute phase response during infection. Thus, stressors, which induce IL-1 in the brain, should produce a peripheral acute phase response similar to that produced by infection. 
Not much research has been directed at this question, but it can be noted that exposure to a single session of tail shock produces various aspects of the APR including fever and shifts in liver function characteristic of infection: a reduction in the production of carrier proteins and an increase in production of acute phase proteins (haptoglobin and alpha 1-acid glycoprotein; Deak et al., 1997). This stressor-induced peripheral APR might have adaptive consequences. The period after a fight/flight emergency is likely a high probability period for infection from an injury, and exposure to tail shock has been shown to lead to enhanced recovery from a subcutaneous bacterial infection, as might occur during an injury (Deak, Nguyen, Fleshner, Watkins, & Maier, 1999). Thus, a number of hard-to-understand consequences of exposure to stressors may be understandable from the present perspective.
SUMMARY
In this chapter, we described the existence of a bidirectional immune-brain network with extensive communication pathways going in each direction. We focused on immune-to-brain communication because immune modulation of the brain has the most important implications for behavior. We summarized a broad array of research indicating that processes within the immune system, via communication to the brain and consequent modulation of neural activity, potently influence and direct aspects of behavior, mood, and cognition. We concentrated on acute rather than chronic immune activation and related phenomena such as stress in order to highlight the adaptive nature of the processes involved; it is most likely with regard to acute events that these processes are adaptive or beneficial. During evolution, organisms that experienced chronic infection or chronic stress would seem to have been less likely to reproduce or survive. The mechanisms that we described may not be beneficial when infection or stress becomes chronic. It is here that physiology may shade into pathology, with outcomes such as neurodegeneration and clinical depression. The existing experimental research has tended to employ immune activators that are quite potent, but it should be recognized that nonhuman and human animals frequently encounter molecules that are recognized as nonself by the immune system, but that are not infectious and of which the organism is unaware. These molecules might also initiate immune-to-brain signaling, and the types of behavioral changes described here. For example, Besedovsky, Sorkin, Keller, and Muller (1975) exposed subjects to sheep red blood cells (SRBC). SRBCs are not bacteria, viruses, or the like, but they are foreign proteins and thus will activate immune responses. At the peak of the immune response to the SRBCs, the subjects exhibited increased HPA responding, just as if there were a stressor present. 
Thus, the SRBCs initiated signaling to the brain, and the brain altered its pattern of activity. There is much unexplained variability across time in an individual’s behavior, mood, and cognition, and immune-to-brain signaling initiated by encounters with foreign proteins that are not overtly infectious could be an important source of such variation. Moreover, consider the implications of the cross-sensitization phenomenon described earlier combined with immune-to-brain signaling initiated simply by substances that are foreign. Individuals who have experienced stress in the recent past might now undergo a large change in behavior, mood, or cognition after exposure to a benign but foreign protein. Conversely, individuals who have recently been exposed to a foreign substance of which they are unaware might show exaggerated reactions to a stressful
event. There are many poorly understood real-life phenomena to which these scenarios might apply. It is now commonplace to acknowledge hormonal influences on the brain, and immune modulation of the brain may be equally widespread.
REFERENCES
Ader, R., & Cohen, N. (1975). Behaviorally conditioned immunosuppression. Psychosomatic Medicine, 37, 333–340.
Alonso, M., Vianna, M. R., Depino, A. M., Mello e Souza, T., Pereira, P., Szapiro, G., et al. (2002). BDNF-triggered events in the rat hippocampus are required for both short- and long-term memory formation. Hippocampus, 12(4), 551–560.
Anisman, H., Merali, Z., Poulter, M. O., & Hayley, S. (2005). Cytokines as a precipitant of depressive illness: Animal and human studies. Current Pharmaceutical Design, 11, 963–972.
Avitsur, R., Weidenfeld, J., & Yirmiya, R. (1999). Cytokines inhibit sexual behavior in female rats: II. Prostaglandins mediate the suppressive effects of interleukin-1beta. Brain Behavior and Immunity, 13, 33–45.
Avitsur, R., & Yirmiya, R. (1999). The immunobiology of sexual behavior: Gender differences in the suppression of sexual activity during illness. Pharmacology Biochemistry and Behavior, 64, 787–796.
Banks, W. A. (2005). Blood-brain barrier transport of cytokines: A mechanism for neuropathology. Current Pharmaceutical Design, 11, 973–984.
Barrientos, R. M., Higgins, E. A., Biedenkapp, J. C., Sprunger, D. B., Wright-Hardesty, K. J., Watkins, L. R., et al. (2006). Peripheral infection and aging interact to impair hippocampal memory consolidation. Neurobiology of Aging, 27, 723–732.
Barrientos, R. M., Higgins, E. A., Sprunger, D. B., Watkins, L. R., Rudy, J. W., & Maier, S. F. (2002). Memory for context is impaired by a post context exposure injection of interleukin-1 beta into dorsal hippocampus. Behavioral Brain Research, 134(1/2), 291–298.
Barrientos, R. M., Sprunger, D. B., Campeau, S., Watkins, L. R., Rudy, J. W., & Maier, S. F. (2004). 
BDNF mRNA expression in rat hippocampus following contextual learning is blocked by intrahippocampal IL-1beta administration. Journal of Neuroimmunology, 155(1/2), 119–126. Baumann, H., & Gauldie, J. (1994). The acute phase response. Immunology Today, 15(2), 74–80. Beck, G., O’Brien, R. F., Habicht, G. S., Stillman, D. L., Cooper, E. L., & Raftos, D. A. (1993). Invertebrate cytokines. III: Invertebrate interleukin-1-like molecules stimulate phagocytosis by tunicate and echinoderm cells. Cellular Immunology, 146, 284–299. Bekker, A. Y., & Weeks, E. J. (2003). Cognitive function after anaesthesia in the elderly. Best Practice and Research: Clinical Anaesthesiology, 17, 259–272. Berkenbosch, F., de Goeij, D. E., Rey, A. D., & Besedovsky, H. O. (1989). Neuroendocrine, sympathetic and metabolic responses induced by interleukin-1. Neuroendocrinology, 50, 570–576. Besedovsky, H., Sorkin, E., Keller, M., & Muller, J. (1975). Changes in blood hormone levels during the immune response. Proceedings of the Society for Experimental Biology and Medicine, 150, 466–470. Bilbo, S. D., Biedenkapp, J. C., Der-Avakian, A., Watkins, L. R., Rudy, J. W., & Maier, S. F. (2005). Neonatal infection-induced memory impairment after lipopolysaccharide in adulthood is prevented via caspase-1 inhibition. Journal of Neuroscience, 25, 8000–8009. Bilbo, S. D., Levkoff, L. H., Mahoney, J. H., Watkins, L. R., Rudy, J. W., & Maier, S. F. (2005). Neonatal infection induces memory impairments following an immune challenge in adulthood. Behavioral Neuroscience, 119, 293–301.
Blalock, J. E. (1984). The immune system as a sensory organ. Journal of Immunology, 132, 1067–1070.
Blatteis, C. M. (1990). Neuromodulative actions of cytokines. Yale Journal of Biological Medicine, 63, 133–146.
Bluthe, R. M., Dantzer, R., & Kelley, K. W. (1992). Effects of interleukin-1 receptor antagonist on the behavioral effects of lipopolysaccharide in rat. Brain Research, 573, 318–320.
Boulant, J. A. (2000). Role of the preoptic-anterior hypothalamus in thermoregulation and fever. Clinical Infectious Diseases, 31(Suppl. 5), S157–S161.
Buttini, M., & Boddeke, H. (1995). Peripheral lipopolysaccharide stimulation induces interleukin-1 beta messenger RNA in rat brain microglial cells. Neuroscience, 65, 523–530.
Cacquevel, M., Lebeurrier, N., Cheenne, S., & Vivien, D. (2004). Cytokines in neuroinflammation and Alzheimer’s disease. Current Drug Targets, 5, 529–534.
Campbell, S. J., Perry, V. H., Pitossi, F. J., Butchart, A. G., Chertoff, M., Waters, S., et al. (2005). Central nervous system injury triggers hepatic, CC, and CXC chemokine expression that is associated with leukocyte mobilization and recruitment to both the central nervous system and the liver. American Journal of Pathology, 166, 1487–1497.
Carobrez, A. P., & Bertoglio, L. J. (2005). Ethological and temporal analyses of anxiety-like behavior: The elevated plus-maze model 20 years on. Neuroscience Biobehavioral Review, 29, 1193–1205.
Carter, J. J., & Whelan, R. L. (2001). The immunologic consequences of laparoscopy in oncology. Surgical Oncology Clinics of North America, 10, 655–677.
Clatworthy, A. L. (1998). Neural-immune interactions: An evolutionary perspective. Neuroimmunomodulation, 5(3/4), 136–142.
Clatworthy, A. L., & Grose, E. (1999). Immune-mediated alterations in nociceptive sensory function in Aplysia californica. Journal of Experimental Biology, 202(Pt. 5), 623–630.
Cuadros, M. A., & Navascues, J. (2001). Early origin and colonization of the developing central nervous system by microglial precursors. Progress in Brain Research, 132, 51–59.
Dantzer, R. (2004). Cytokine-induced sickness behaviour: A neuroimmune response to activation of innate immunity. European Journal of Pharmacology, 500(1/3), 399–411.
Dantzer, R., Bluthe, R., Castanon, M., Kelley, K. W., Konsman, J., Laye, S., et al. (2005). Cytokines, sickness behavior, and depression. In R. Ader (Ed.), Psychoneuroimmunology (4th ed., pp. 281–317). San Diego, CA: Academic Press.
Deak, T., Meriwether, J. L., Fleshner, M., Spencer, R. L., Abouhamze, A., Moldawer, L. L., et al. (1997). Evidence that brief stress may induce the acute phase response in rats. American Journal of Physiology, 273(6, Pt. 2), R1998–R2004.
Deak, T., Nguyen, K. T., Fleshner, M., Watkins, L. R., & Maier, S. F. (1999). Acute stress may facilitate recovery from a subcutaneous bacterial challenge. Neuroimmunomodulation, 6(5), 344–354.
De Simoni, M. G., De Luigi, A., Gemma, L., Sironi, M., Manfridi, A., & Ghezzi, P. (1993). Modulation of systemic interleukin-6 induction by central interleukin-1. American Journal of Physiology, 265(4, Pt. 2), R739–R742.
Dinarello, C. A. (1991). Interleukin-1 and interleukin-1 antagonism. Blood, 77(8), 1627–1652.
Dunn, A. J., & Swiergiel, A. H. (1998). The role of cytokines in infection-related behavior. Annals of the New York Academy of Science, 840, 577–585.
Dunn, A. J., Wang, J., & Ando, T. (1999). Effects of cytokines on cerebral neurotransmission. Comparison with the effects of stress. Advances in Experimental Medicine and Biology, 461, 117–127.
Ericsson, A., Kovacs, K. J., & Sawchenko, P. E. (1994). A functional anatomical analysis of central pathways subserving the effects of interleukin-1 on stress-related neuroendocrine neurons. Journal of Neuroscience, 14, 897–913.
Ericsson, A., Liu, C., Hart, R. P., & Sawchenko, P. E. (1995). Type 1 interleukin-1 receptor in the rat brain: Distribution, regulation, and relationship to sites of IL-1-induced cellular activation. Journal of Comparative Neurology, 361(4), 681–698.
Fahey, J. V., Gure, P. M., & Munck, A. (1981). Mechanisms of anti-inflammatory actions of glucocorticoids. Inflammation Research, 2, 21–51.
Felten, D. L., Felten, S. Y., Bellinger, D. L., Carlson, S. L., Ackerman, K. D., Madden, K. S., et al. (1987). Noradrenergic sympathetic neural interactions with the immune system: Structure and function. Immunology Review, 100, 225–260.
Felton, L. M., & Perry, H. V. (2005). The interaction between brain inflammation and systemic infection. In R. Ader (Ed.), Psychoneuroimmunology (4th ed., pp. 429–449). San Diego, CA: Academic Press.
File, S. E., & Seth, P. (2003). A review of 25 years of the social interaction test. European Journal of Pharmacology, 463(1/3), 35–53.
Frank, M. G., Baratta, M. V., Sprunger, D. B., Watkins, L. R., & Maier, S. F. (2007). Microglia serve as a neuroimmune substrate for stress-induced potentiation of CNS pro-inflammatory cytokine responses. Brain Behavioral Immunity, 21, 47–59.
Frank, M. G., Barrientos, R. M., Biedenkapp, J. C., Rudy, J. W., Watkins, L. R., & Maier, S. F. (2006). mRNA up-regulation of MHC II and pivotal pro-inflammatory genes in normal brain aging. Neurobiology of Aging, 27, 717–722.
Garcia, J., Brett, L. P., & Rusiniak, K. W. (1989). Limits of Darwinian conditioning. In S. B. Klein & R. J. Mowrer (Eds.), Contemporary learning theories: Instrumental conditioning and the impact of biological constraints on learning (pp. 181–203). Hillsdale, NJ: Erlbaum.
Goehler, L. E., Gaykema, R. P., Hammack, S. E., Maier, S. F., & Watkins, L. R. (1998). Interleukin-1 induces c-Fos immunoreactivity in primary afferent neurons of the vagus nerve. Brain Research, 804, 306–310.
Goehler, L. E., Lyte, M., & Gaykema, R. P. (2007). Infection-induced viscerosensory signals from the gut enhance anxiety: Implications for psychoneuroimmunology. Brain Behavioral Immunity, 21, 721–726.
Goehler, L. E., Relton, J. K., Dripps, D., Kiechle, R., Tartaglia, N., Maier, S. F., et al. (1997). 
Vagal paraganglia bind biotinylated interleukin-1 receptor antagonist: A possible mechanism for immune-to-brain communication. Brain Research Bulletin, 43, 357–364. Goshen, I., & Yirmiya, R. (2005). The role of pro-inflammatory cytokines in memory processes and neural plasticity. In R. Ader (Ed.), Psychoneuroimmunology (4th ed., pp. 337–370). San Diego, CA: Academic Press. Gould, S. J. (1982). Darwinism and the expansion of evolutionary theory. Science, 216, 380–387. Hall, J., Thomas, K. L., & Everitt, B. J. (2000). Rapid and selective induction of BDNF expression in the hippocampus during contextual learning. National Neuroscience, 3, 533–535. Hansen, M. K., O’Connor, K. A., Goehler, L. E., Watkins, L. R., & Maier, S. F. (2001). The contribution of the vagus nerve in interleukin-1beta-induced fever is dependent on dose. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 280, R929–R934. Hansen, M. K., Taishi, P., Chen, Z., & Krueger, J. M. (1998). Vagotomy blocks the induction of interleukin-1beta (IL-1beta) mRNA in the brain of rats in response to systemic IL-1beta. Journal of Neuroscience, 18, 2247–2253. Hart, B. L. (1988). Biological basis of the behavior of sick animals. Neuroscience Biobehavioral Review, 12, 123–137. Hein, A. M., Stutzman, D., Barrientos, R. M., Watkins, L. R., Rudy, J. W., et al. (2007). Prostaglandins are necessary and sufficient to induce contextual fear learning impairments after interleukin-1 beta injections into the dorsal hippocampus. Manuscript submitted for publication.
Hoek, R. M., Ruuls, S. R., Murphy, C. A., Wright, G. J., Goddard, R., Zurawski, S. M., et al. (2000). Down-regulation of the macrophage lineage through interaction with OX2 (CD200). Science, 290, 1768–1771.
Hoozemans, J. J., Veerhuis, R., Janssen, I., Rozemuller, A. J., & Eikelenboom, P. (2001). Interleukin-1beta induced cyclooxygenase 2 expression and prostaglandin E2 secretion by human neuroblastoma cells: Implications for Alzheimer’s disease. Experimental Gerontology, 36, 559–570.
Huang, Y., Henry, C. J., Dantzer, R., Johnson, R. W., & Godbout, J. P. (2007). Exaggerated sickness behavior and brain proinflammatory cytokine expression in aged mice in response to intracerebroventricular lipopolysaccharide. Neurobiology of Aging, 29, 1744–1753.
Imeri, L., Bianchi, S., & Opp, M. R. (2006). Inhibition of caspase-1 in rat brain reduces spontaneous nonrapid eye movement sleep and nonrapid eye movement sleep enhancement induced by lipopolysaccharide. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 291, R197–R204.
Janeway, C. A., Travers, P., Walport, M., & Shlomchik, M. (2005). Immunobiology: The immune system in health and disease (6th ed.). London: Garland Science.
Johnson, J. D., O’Connor, K. A., Deak, T., Spencer, R. L., Watkins, L. R., & Maier, S. F. (2002). Prior stressor exposure primes the HPA axis. Psychoneuroendocrinology, 27(3), 353–365.
Johnson, J. D., O’Connor, K. A., Deak, T., Stark, M., Watkins, L. R., & Maier, S. F. (2002). Prior stressor exposure sensitizes LPS-induced cytokine production. Brain Behavioral Immunity, 16, 461–476.
Johnson, J. D., O’Connor, K. A., Hansen, M. K., Watkins, L. R., & Maier, S. F. (2003). Effects of prior stress on LPS-induced cytokine and sickness responses. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 284, R422–R432.
Johnson, J. D., O’Connor, K. A., Watkins, L. R., & Maier, S. F. (2004). 
The role of IL-1beta in stress-induced sensitization of proinflammatory cytokine and corticosterone responses. Neuroscience, 127, 569–577. Kelly, A., Vereker, E., Nolan, Y., Brady, M., Barry, C., Loscher, C. E., et al. (2003). Activation of p38 plays a pivotal role in the inhibitory effect of lipopolysaccharide and interleukin-1 beta on long term potentiation in rat dentate gyrus. Journal of Biological Chemistry, 278, 19453–19462. Kent, S., Bluthe, R. M., Dantzer, R., Hardwick, A. J., Kelley, K. W., Rothwell, N. J., et al. (1992). Different receptor mechanisms mediate the pyrogenic and behavioral effects of interleukin 1. Proceedings of the National Academy of Sciences, USA, 89, 9117–9120. Kluger, M. J., Kozak, W., Conn, C. A., Leon, L. R., & Soszynski, D. (1996). The adaptive value of fever. Infectious Disease Clinic of North America, 10, 1–20. Konsman, J. P., Luheshi, G. N., Bluthe, R. M., & Dantzer, R. (2000). The vagus nerve mediates behavioural depression, but not fever, in response to peripheral immune signals; a functional anatomical analysis. European Journal of Neuroscience, 12, 4434–4446. Konsman, J. P., Vigues, S., Mackerlova, L., Bristow, A., & Blomqvist, A. (2004). Rat brain vascular distribution of interleukin-1 type-1 receptor immunoreactivity: Relationship to patterns of inducible cyclooxygenase expression by peripheral inflammatory stimuli. Journal of Comparative Neurology, 472, 113–129. Krabbe, K. S., Reichenberg, A., Yirmiya, R., Smed, A., Pedersen, B. K., & Bruunsgaard, H. (2005). Low-dose endotoxemia and human neuropsychological functions. Brain Behavioral Immunity, 19, 453–460. Kreutzberg, G. W. (1996). Microglia: A sensor for pathological events in the CNS. Trends in Neuroscience, 19, 312–318. Kyrkanides, S., O’Banion, M. K., Whiteley, P. E., Daeschner, J. C., & Olschowka, J. A. (2001). Enhanced glial activation and expression of specific CNS inflammation-related molecules in aged versus young rats following cortical stab injury. 
Journal of Neuroimmunology, 119, 269–277. Lacroix, S., Feinstein, D., & Rivest, S. (1998). The bacterial endotoxin lipopolysaccharide has the ability to target the brain in upregulating its
Chapter 8
Neuroanatomy/Neuropsychology
BRYAN E. KOLB AND IAN Q. WHISHAW
A challenge for science over the past 150 years has been to identify a general conceptual framework for how the human brain is organized to produce the amazing complexity of human behaviors ranging from movement to emotion to language. The human brain is composed of more than 180 billion cells, more than 80 billion of which are directly engaged in information processing. Given that each nerve cell receives up to 15,000 connections from other nerve cells, understanding such complexity is a formidable challenge. The challenge is made simpler by understanding the brain’s underlying organization. Our account of brain organization starts with an examination of the historical development of current thinking. We then review how the brain is organized and identify rules that govern its operation. Finally, we consider how complex psychological functions emerge from the anatomical complexity.
HISTORICAL IDEAS OF BRAIN ORGANIZATION
People knew what the brain looked like long before they had any idea what it might do. Early humans must have noticed that all animals had a brain and that it was connected to other parts of the body by what we now know to be nerves. Understanding what the brain does requires the important philosophical leap, however, of first recognizing that the production of thought and behavior is based on biology rather than on some sort of “will” or energy force. Recognizing that behavior is related to biological activity is only the first step because there also needs to be a recognition that the nervous system, and not other organs such as the heart or liver, produces behavior. Although Alcmaeon of Croton (Greece) located mental processes in the brain about 2,500 years ago (and thus developed what is now called the brain hypothesis), the merit of this hypothesis has been hotly debated ever since and is still not universally accepted. The modern view was first clearly articulated by Descartes in the seventeenth century. Descartes recognized that the body is complex and operates much like a machine, but he proposed that the mind was separate and was nonmaterial and nonspatial. Descartes argued that the mind and body are separate but interact to produce thought and behavior, a position referred to as dualism. The problem with the dualistic view is that for the mind to affect the body requires that energy be created, a proposition that violates fundamental laws of physics. Nonetheless, even today many people have a dualistic view of the mind and body and appear to believe that there is more to behavior than the brain. One example is the belief in a nonmaterial soul that influences behavior during life and exits after the body dies.
The emergence of evolutionary theory in the mid-nineteenth century provided a different perspective: Rational behavior can be fully explained by the activity of the nervous system without the need for a nonmaterial mind. Darwin emphasized the important idea of “descent with modification” in which all living animals are descended from a common ancestor. He thus identified the principle that the workings of the human brain reflect a long history of adaptation of an early primitive brain. One implication of this view is that the human brain is not special but rather represents an elaboration of the brains of our nearest relatives such as the great apes and, likewise, the ape brains are elaborations of more primitive mammalian brains. The recognition that mammalian brains are fundamentally similar in general organization was an important step because it meant that the organization of the human brain could be studied using relatively simpler surrogate brains such as those of monkeys, carnivores, and rodents.
Recognizing that the brain controls behavior was an important step, but the question of how the brain did this remained unanswered. One important philosophical issue that emerged in the late nineteenth century revolved around the question of whether the brain can be divided into separate parts representing separate functions or whether the brain operated as a whole and thus was indivisible. The identification of Broca’s area, and later Wernicke’s area, as key to specific aspects of language suggested that functions were localized and led many neurologists to seek specific functions for every brain region. Parallel investigations revealed that the severity of cognitive loss was related to the extent of brain injury rather than the precise locus. Thus, other neurologists argued that functions like language could not be localized because so much of the brain was involved.
One problem is that behaviors such as emotion, memory, or language clearly emerge from the activity of the entire nervous system and thus do not respect specific anatomical structures. This is not to say that specific regions do not play larger (or smaller) roles in different psychological functions but rather that all psychological functions require the contribution of many different neural systems, which by their nature are not precisely housed in single places.
The final step in the historical development of current thoughts about brain organization and function was the recognition that the nervous system is composed of discrete autonomous units, neurons and glia, that are not physically connected but interact to generate nervous activity (Figure 8.1). The neuron hypothesis stipulates that neurons carry out the brain’s major functions whereas glia aid and modulate the neurons’ activities (for example, by forming the fatty covering, or insulation, over neurons and by producing various chemicals that influence neuronal functions). The idea that neurons represent individual units in brain function dates back to Ramón y Cajal early in the twentieth century, but it was only later discovered that there are two distinctly different types of neurons, which function to excite or to inhibit the activity of other neurons. This dichotomy is philosophically very important because we now can see that not only does the brain produce thought and behavior but it also inhibits thoughts and behaviors. The recognition of this distinction is fundamental to understanding both brain function and its dysfunction.
Figure 8.1 Neurons. Note. A spiny stellate neuron from the nucleus accumbens showing the cell body and dendrites. The enlargement of a dendrite on the right shows the dendritic spines that provide the location of excitatory synapses.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
METHODS OF STUDYING BRAIN AND BEHAVIOR RELATIONS
The first insights into brain function were based on observations of naturally occurring injuries in people. It was not until the mid-1800s that experimental techniques, such as using electrical stimulation of the brain to determine functions, began to emerge. It was well into the twentieth century before a systematic science of brain function began to emerge both experimentally and clinically. We next review the principal methods used to outline the general anatomical and functional organization of the brain.
Human Neuropsychology
Human neuropsychology is the science that relates brain function to cognitive behavior. There is a rich history of clinical neurology dating well before Broca’s 1861 description of his patient Tan, who had lost the ability to speak, but this case was particularly important because it was the first time that a brain function was placed in a particular location in the brain, in this case “Broca’s area.” It was not until World War I that systematic descriptions of large numbers of cases of head-injured soldiers began to provide a basis for modern neuropsychology (e.g., Holmes, 1918). Soldiers with gunshot or shrapnel wounds to the brain showed specific symptoms, many of which were unexpected. For example, soldiers who were unable to respond to stimuli on their left side seemed unaware that there was any problem and would even deny that they had any difficulties. Neuropsychology became a systematic field of investigation only after World War II, however, and the term neuropsychology was first formally used in 1949 (Hebb, 1949).
A study of an amnesic patient, who retained past memories but lost the ability to form new memories (Scoville & Milner, 1957), marked an important step in the development of modern neuropsychology because it led to the systematic investigation of memory. The patient, referred to as H.M., had a selective removal of the hippocampus, a structure in the temporal lobe, to reduce his epileptic seizures. Immediately after the surgery, H.M. exhibited a severe amnesia in which he could recall virtually nothing that happened to him after his surgery, even though his memory of presurgical events appeared to be intact. Although amnesic patients had been described before, H.M. was the first case in which the symptoms could be attributed to a localized cerebral injury. Indeed, up until the description of H.M., studies on laboratory animals had been unable to find any injury that produced a specific memory loss, so H.M.’s condition was a major breakthrough.
A further advance occurred with the studies of split-brain patients by Sperry (1974). Patients who had the corpus callosum, which connects the two cerebral hemispheres, cut to relieve epileptic seizures revealed that each hemisphere makes complementary but different contributions to behavior. In particular, it became clear that not only was the left hemisphere verbal but that the left and right hemispheres had different opinions about the world, and in the absence of the corpus callosum they acted pretty much independently of one another.
The description of patient D.F. by Milner and Goodale (2006) led to another important shift in thinking about the organization of visual perception and action. D.F. suffered carbon monoxide poisoning and appeared to be essentially blind.
Curiously, however, when she made arm and hand movements, she sometimes acted as though she could see. Thus, although she was unable to identify objects, she could reach and grasp the objects as if she knew what they were. Consider that our hand position in reaching for a glass is quite different from the position for picking up a pencil, so making these different movements would seem to require recognition of the object. D.F.’s actions showed that some of our actions are conscious (such as identifying objects) and others are unconscious (such as making movements to manipulate objects). D.F. had no conscious vision but still had unconscious vision.
Most neuropsychological studies in the past 50 years have not been case studies but rather studies of groups of patients with fairly circumscribed lesions, perhaps best exemplified by studies of patients with frontal or temporal lobectomies for the treatment of intractable epilepsy (e.g., Kolb & Whishaw, 2008). Sixty years of human neuropsychology have not only clearly shown that functions are relatively localized in the cerebral cortex but also laid the groundwork for our current understanding of how the brain is organized.
Electrophysiological Confirmation of Localization
In 1870, Fritsch and Hitzig (1956) described an extraordinary finding: electrical stimulation of a small portion of the cortex of a rabbit and a dog produced movements. Importantly, the movements were selective, such that stimulating different points led to different movements. Later studies showed similar effects in humans, other primates, and many laboratory animals (Woolsey, 1958). These studies led to the development of detailed somatosensory, motor, and language maps in the cerebral cortex (Penfield & Roberts, 1956). One of the difficulties of the early studies was that the skull had to be opened to allow access to the brain, but in the past decade noninvasive techniques have been developed using magnetic stimulation on normal waking subjects.
Laboratory Animal Studies of Brain Organization
Although there were early isolated studies of animals such as dogs and birds with cerebral injuries, the first systematic studies of large numbers of animals were begun in the early part of the twentieth century. These studies had three distinct foci: (1) studies of how nerves sent messages using “simple” models such as the squid (e.g., Hodgkin & Huxley, 1952); (2) electrophysiological studies of how the spinal cord worked (e.g., Sherrington, 1948); and (3) behavioral studies of animals with discrete cerebral injuries (Lashley, 1960) directed toward studying how animals thought and remembered.
By the 1960s, both the electrophysiological and behavioral techniques had evolved to a point that a new field of behavioral neuroscience began to emerge that paralleled the human studies described earlier. The recognition that the rodent is an excellent model for understanding basic principles of cerebral organization in primates allowed researchers to expand investigations of a wide range of topics ranging from motor control to memory, as well as allowing investigations of factors influencing recovery from brain injury such as stroke (e.g., Kolb & Tees, 1990; Whishaw & Kolb, 2006).
Noninvasive Imaging
One of the historical impediments to studying brain organization in humans has been the difficulty in studying brain function in normal volunteers. Although the electroencephalogram (EEG), which measures electrical activity of the brain through the skull, was developed in the 1930s, its primary use was in identifying states of consciousness or brain pathology. The major advances in computer technology in the latter part of the twentieth century allowed EEG to be used to identify discrete neural processing, which is best exemplified by event-related potentials (ERPs). ERPs are signals that are correlated with specific forms of sensory processing, such as the recognition of some specific information. The difficulty with ERPs, however, is that although the temporal resolution of the electrical signal is good, the spatial localization of the signal is difficult because the recording is through the skull. The development of metabolic and blood flow measures of brain activity using positron emission tomography (PET) and functional magnetic resonance imaging (fMRI) provides a spatially more precise view of brain function (see Kolb & Whishaw, 2008). The logic of these methods is that regions of the brain that are more active have a higher metabolic activity that can be seen in the brain’s use of glucose, oxygen, and blood. Thus, just as somatic muscles that are more active have higher metabolic demands than when they are less active, brain regions that are active use more resources than regions that are less active. It is now possible to do parallel studies using ERP and fMRI measures and thus to combine good temporal resolution with good spatial resolution. The development of computer-based data collection and sophisticated statistical procedures to average signals across subjects has allowed researchers to compare the brain activity of multiple subjects. For example, Hasson, Nir, Levy, Fuhrmann, and Malach (2004) allowed five subjects to freely view a 30-minute segment of a feature film, The Good, The Bad, and The Ugly, while cortical activity was monitored via fMRI.
The authors reasoned that such a rich and complex visual stimulation would be far more similar to natural vision than the highly constrained visual stimuli normally used in the laboratory. Comparison of brain activation across the subjects showed that the brains of different individuals tended to act in unison during the free viewing. This surprising activity coherence suggests that a large expanse of the human cortex is stereotypically responsive to naturalistic audiovisual stimuli. Furthermore, although overall there was widespread activity in the cerebrum during the viewing, there also were selective activations related to the precise moment-to-moment film content. For example, specific regions were activated by faces or places, suggesting relatively localized processing. The generalized and specific nature of the cerebral processing is clearly relevant to the debates regarding localization of functions in the brain discussed earlier.
Molecular Studies
With the description of the genome of both humans and laboratory animals in the early part of the twenty-first century, there has been a radical shift toward studying the genetic bases of brain function. Although these studies are likely to provide considerable insight into the detailed molecular mechanisms underlying neuronal functions, such studies have had little time to provide much impact on our general understanding of how the brain is organized functionally. One exception to this generalization comes from work showing that experience can modify gene expression, which in turn influences how behavior is expressed. For example, Weaver et al. (2004) showed that the amount of time a mother rat spends licking and grooming her infants can influence gene expression related to functions of the hypothalamic-pituitary-adrenal (HPA) stress axis. This gene expression later influences the reactivity of the offspring to stress in adulthood and also determines how the female offspring interact with their own pups.
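As a rough illustration of two computational ideas from the Noninvasive Imaging section (extracting an event-related signal by averaging many recordings, and comparing activity time courses across subjects), the following Python sketch works on simulated data only. The function names (average_epochs, pearson, mean_pairwise_correlation), the sine-wave "response," and the noise levels are hypothetical choices made for illustration; this is not the analysis pipeline of Hasson et al. (2004) or of any other cited study.

```python
import math
import random


def average_epochs(epochs):
    """Average equal-length recordings sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_pairwise_correlation(time_courses):
    """Mean correlation over all pairs of subjects' time courses."""
    pairs = [
        pearson(time_courses[i], time_courses[j])
        for i in range(len(time_courses))
        for j in range(i + 1, len(time_courses))
    ]
    return sum(pairs) / len(pairs)


# Hypothetical data: a fixed stimulus-locked waveform buried in random noise.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * t / 50) for t in range(100)]


def noisy(scale):
    """Return the waveform plus Gaussian noise of the given standard deviation."""
    return [s + rng.gauss(0, scale) for s in signal]


# One noisy recording versus the average of many noisy recordings.
single_trial = noisy(2.0)
erp = average_epochs([noisy(2.0) for _ in range(400)])
print("single trial vs. waveform:", pearson(single_trial, signal))
print("average vs. waveform:     ", pearson(erp, signal))

# Five simulated "subjects" who share the same stimulus-driven component.
subjects = [noisy(0.5) for _ in range(5)]
print("mean between-subject correlation:", mean_pairwise_correlation(subjects))
```

In this toy setting the averaged recording tracks the underlying waveform much more closely than any single recording does, and time courses that share a stimulus-driven component correlate across simulated subjects. Real EEG and fMRI analyses involve many further steps (artifact rejection, spatial alignment, statistical thresholds) that are deliberately omitted here.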
GENERAL BRAIN ORGANIZATION
The basic organization of the brain is most easily understood by following its evolutionary and ontogenetic development. Figure 8.2 shows a basic three-part structural plan that divides the brain into front, middle, and back components. The front and back components expand greatly in mammals and become further subdivided, giving five regions in all. Historically, embryologists gave rather cumbersome names to the various regions and these names remain, although they are seldom used in behavioral studies. The three regions of the primitive developing brain are first recognizable as enlargements at the front end of a fluid-filled tube. In the simple brain the front region (prosencephalon) is responsible for olfaction and basic body functions such as feeding and drinking; the middle region (mesencephalon) for hearing and vision; and the back region (rhombencephalon) for movement, balance, and breathing. The back end of the tube extends to form the spinal cord. As the brain enlarges in mammals and birds, cerebral hemispheres develop and existing functions are elaborated in the prosencephalon. The original tube also becomes elaborated to form pockets of fluid known as ventricles, as shown in Figure 8.3.
Figure 8.2 Steps in the ontogenetic development of the brain. Note. (A) A three-chambered brain (fish, amphibian, reptile; human embryo at 25 days): prosencephalon (forebrain), mesencephalon (midbrain), rhombencephalon (hindbrain), and spinal cord; (B) a five-chambered brain (mammals such as rat; human embryo at 50 days): telencephalon, diencephalon, mesencephalon, metencephalon, myelencephalon, and spinal cord; (C) side view through the center of a fully developed human brain, divided into forebrain, brain stem, and spinal cord. Principal structures by division: telencephalon (end brain): neocortex, basal ganglia, limbic system, olfactory bulb, lateral ventricles; diencephalon (between brain): thalamus, epithalamus, hypothalamus, pineal body, third ventricle; mesencephalon: tectum, tegmentum, cerebral aqueduct; metencephalon (across-brain): cerebellum, pons, fourth ventricle; myelencephalon (spinal brain): medulla oblongata, fourth ventricle. From Fundamentals of Human Neuropsychology, fifth edition, by B. Kolb and I. Q. Whishaw, 2003, New York: Worth. Reprinted with permission.
Figure 8.3 Medial view through the center of the brain showing structures of the brain stem. Labeled structures: cerebral cortex, lateral ventricle, epithalamus, third ventricle, hypothalamus, thalamus, optic chiasm, fourth ventricle, cerebellum, medulla, spinal cord, tegmentum, superior colliculus, inferior colliculus, pons, cerebral aqueduct, reticular formation. Note. From Fundamentals of Human Neuropsychology, fifth edition, by B. Kolb and I. Q. Whishaw, 2003, New York: Worth. Reprinted with permission.
The fluid in the ventricles (cerebrospinal fluid, or CSF) is produced by cells that line the ventricular walls. The CSF flows from the ventricles to eventually enter the circulatory system. The expanded prosencephalon is now referred to as the forebrain, and the remaining brain is referred to as the brain stem (Table 8.1). The brain stem receives nerves from all of the body’s senses and it sends nerves to control all of the body’s movements except the most complex movements of the fingers and toes of mammals. The forebrain acts to elaborate on the basic functions of the brain stem. In relatively primitive animals such as frogs, the entire brain is essentially equivalent to the mammalian brain stem. Given that such animals have complex behavioral repertoires, it is obvious that the brain stem is quite a sophisticated piece of machinery.
Table 8.1 Anatomical Divisions of the Central Nervous System
Anatomical division | Functional division | Principal structures
Forebrain | Forebrain | Cerebral cortex, basal ganglia, limbic system
Brain stem | Diencephalon | Thalamus, hypothalamus
Brain stem | Midbrain | Tectum, tegmentum
Brain stem | Hindbrain | Cerebellum, pons, medulla oblongata, reticular formation
Spinal cord | Spinal nerves | Cervical nerves, thoracic nerves, lumbar nerves, sacral nerves
Brain Stem
The brain stem can be divided into three functional regions: hindbrain, midbrain, and diencephalon. The diencephalon can be conceived as a “between brain” because it acts as a border between the lower (brain stem) and upper (forebrain) parts of the brain. Each brain stem region contains various subparts and thus performs more than a single task. Although all three regions have both sensory and motor functions, the hindbrain is more important for motor functions and the midbrain for sensory functions. The diencephalon plays a role in regulatory behaviors such as temperature regulation and the control of eating and drinking.
Hindbrain
The hindbrain has four major subregions (see Figure 8.3). The largest region of the hindbrain is the cerebellum, a structure that becomes progressively larger and more complex as behaviors become more complex and the forebrain expands. The pons and medulla contain substructures that control vital body functions such as breathing and the cardiovascular system. The final hindbrain region, the reticular formation, is a netlike mixture of neurons and nerve fibers with specialized roles in stimulating the forebrain.
Midbrain
The midbrain is composed of the tectum (roof of the ventricle) and tegmentum (floor of the ventricle). The tectum receives major inputs from the eyes and ears, with the optic nerve going to a region called the superior colliculus and the auditory nerve going to the inferior colliculus. The colliculi function both to analyze sensory input and to produce orienting movements related to sensory inputs, such as turning the eyes or head toward a sound. To allow such orientation the colliculi have a “map” of the external world so that the head and eyes can be directed correctly. The auditory and visual maps must overlap so that the two systems can work together. The tegmentum is composed of multiple structures, largely with movement-related functions. The main nuclei are the red nucleus (for control of limbs), substantia nigra (for the initiation and inhibition of movements), and the periaqueductal gray matter (for species-typical behavior such as sexual behavior and for the modulation of pain).
Diencephalon
The integrating functions of the diencephalon require more anatomical parts than the rest of the brain stem. The two major structures are the hypothalamus and thalamus, both of which are composed of about 20 subnuclei (Figure 8.4). The hypothalamus contains nuclei associated with eating, drinking, thermal regulation, sexual behavior, emotional behavior, and hormone function. The hypothalamus is connected both directly and hormonally with the pituitary gland, which in turn produces hormones that travel to other organs such as the adrenal gland. In contrast to most of the brain stem, which is the same in males and females, there are sex differences in the anatomy of the hypothalamus that are presumably related both to sex-related hormones and to sex-related differences in sexual and parental behaviors.
The thalamus acts as a gateway for information traveling to the cortex (bark or outer portion of the forebrain). Each sense sends its input to a specific thalamic nucleus, which in turn sends information to specific cortical regions. For example, the lateral geniculate nucleus receives input from the optic nerve, and is thus visual, and the medial geniculate nucleus receives input from the auditory nerve and is auditory. Some thalamic regions have motor functions or act in an integrative fashion. The dorsomedial nucleus is an integrative region that receives input from many subcortical structures, as well as the olfactory system, and passes this integrated information to the frontal lobe of the cortex.
Figure 8.4 Medial view of the diencephalon. Labeled structures: thalamus, dorsomedial nucleus (connects to frontal lobe), lateral geniculate nucleus (to visual cortex), medial geniculate nucleus (to auditory cortex), auditory input, optic tract from left eye, hypothalamus, pituitary stalk, pituitary gland. Note. (Right) Enlargement of the thalamus. (Bottom) Enlargement of the hypothalamus and pituitary. From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
Forebrain
The forebrain is the largest region of the mammalian brain and, like the brain stem, it is composed of multiple regions, the principal ones being the cerebral cortex, basal ganglia, and limbic lobe. One striking characteristic of the forebrain is that it consists of two nearly symmetrical hemispheres, the left and the right, which have both overlapping and specialized functions.
c08.indd Sec2:142
Cerebral Cortex There are two types of cerebral cortex, the new and the old. The new cortex (neocortex) has six layers of gray matter (cell bodies) atop a layer of white matter (fibers). It is the neocortex that is visible when we view the brain from the outside. The neocortex is unique to mammals and is central to the emergence of mental functions such as language, memory, attention, and so on. The old cortex (sometimes called limbic cortex) has three or four layers of gray over white matter and is considered to be more primitive than neocortex. Although the limbic cortex does play an important role in emotional states, its functions also include other mental functions and thus there normally is little reason to draw much functional distinction between neo and limbic cortex. The cerebral cortex is divided into four lobes (frontal, parietal, temporal, occipital) that are named by the cranial bones that overlie the brain rather than by any particular functional characteristics of the regions (Figure 8.5). The frontal and parietal lobes are divided by a deep fissure known as the central sulcus, and the temporal lobe is divided from the frontal lobe by another fissure known as the lateral (or Sylvian) fissure. Although the lobes each have multiple functions, we can ascribe some gross functions to each lobe. The three posterior lobes all have sensory functions, occipital for vision, temporal for audition, and parietal for somatosensation. In addition, visual functions importantly influence
Figure 8.5 A lateral view of the human brain illustrating the lobes of the brain. [Labels: central sulcus, lateral fissure, frontal lobe, parietal lobe, temporal lobe, occipital lobe.]
Note. The lateral fissure separates the temporal and frontal lobes whereas the central sulcus separates the frontal and parietal lobes. From Fundamentals of Human Neuropsychology, fifth edition, by B. Kolb and I. Q. Whishaw, 2003, New York: Worth. Adapted with permission.
Figure 8.7 Medial view of the human brain showing the principal regions of the limbic system. [Labels: cingulate cortex (limbic cortex), temporal lobe, amygdala, hippocampus.]
Note. From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
each of these lobes. The extensive distribution of visual functions speaks to the large amount of the cerebral cortex that is involved in some form of visual processing. The frontal lobe has motor, olfactory, and gustatory functions as well as an important integrative role that is sometimes referred to as “executive” function. The cerebral cortex can also be divided on the basis of its cellular architecture (called cytoarchitecture). Brodmann first described a detailed map of cortical regions early in the twentieth century (Kolb & Whishaw, 2003), assigning some 50 numbers to distinctly different regions (e.g., 1, 2, 3; Figure 8.6). Each Brodmann region can be associated with specific subfunctions of the lobes. With the development of newer anatomical methods over the past 100 years, the Brodmann map has been refined and expanded further (e.g., Petrides, 2005) and the general idea of distinct cytoarchitectonic zones that correspond to distinct functions has been enhanced.
Limbic Lobe
As the brain of amphibians and reptiles evolved, structures lining the brain stem began to emerge. In view of the evolutionary origin of these structures, some anatomists referred to these regions as the reptilian brain, but the term limbic (meaning lining) is more widely recognized. The limbic lobe is also sometimes referred to as the limbic system, but given that the limbic regions do not function as a unified structure, the term limbic system is really a misnomer. The principal structures of the limbic lobe include the amygdala, hippocampus, and cingulate cortex (Figure 8.7). The amygdala plays a central role in emotion, and especially in fear. Removal of the amygdala completely removes fear, whereas overactivation of the amygdala can render a person highly anxious. The hippocampus has a central role in certain kinds of memory as well as in spatial navigation. Limbic regions are also partly responsible for the rewarding properties of experiences, including psychoactive drugs. The cingulate cortex has been associated with pain, emotion, and memory, and it connects extensively with the amygdala and hippocampus.
Figure 8.6 Brodmann’s areas of the cortex (lateral and medial views).
Note. From Fundamentals of Human Neuropsychology, fifth edition, by B. Kolb and I. Q. Whishaw, 2003, New York: Worth. Adapted with permission.
Neuroanatomy/Neuropsychology
Figure 8.8 Frontal section of the cerebral hemispheres showing the basal ganglia relative to surrounding structures. Note. From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
Basal Ganglia
The basal ganglia are a collection of nuclei that lie just below the white matter of the anterior region of the cerebral cortex (Figure 8.8). The three principal structures are the caudate nucleus, putamen, and globus pallidus. All cortical regions send connections to the basal ganglia, allowing the basal ganglia to be well informed about the activities of the cortex. The basal ganglia in turn send connections to the motor system, thus influencing movement. The functions of the basal ganglia can be observed by analyzing the behavior of patients with the many diseases that interfere with normal functioning of these regions. Among the most common disorders are Parkinson’s disease, in which movement becomes more difficult, and Huntington’s chorea, in which unwanted tics and gestures interfere with normal movement. Thus, basal ganglia damage does not impair the production of movements, as in paralysis, but rather impairs the control of movements. This distinction is important because it shows that movements are produced at lower levels (such as in the brain stem) but modulated by the forebrain.
RULES OF BRAIN FUNCTION
Having considered the general organization of the brain, we are now in a position to look at the general principles that guide how the various parts of the brain work together.
The Brain Produces Movement within a Perceptual World It Creates
The simplest summary of brain function is that it produces behavior. To do so, however, it must have information about the world. Movements are not made in a vacuum but are related to objects, places, memories, and so on. The representation of the world is dependent on the nature of the information sent to the brain, however. A person who is color blind has a very different representation of the world than those who perceive color. Similarly, a person who has perfect pitch has a different world than those who do not. Furthermore, animals such as dogs have a rich olfactory world that humans do not share. In contrast, dogs have poor color vision. Our failure to perceive smells and the dogs’ failure to perceive colors do not mean that they are not there, only that the reality we create is different. Although we tend to think that the world that we perceive is what is actually there, it is clear that individual realities (both between and within species) are rough approximations of what is actually present. A special function of the brain of each animal species is to produce a reality that is adaptive for that species. In other words, the behavior that the brain produces is directly related to the details of the world that the brain has created. Dogs and people behave differently toward smells (or colors) because of the nature of the perceptual world that their respective brains have created.
The Brain Creates “Maps” of the World
Sensory information is represented in the brain in an orderly manner. Consider the feelings from your skin when a fly is walking along your arm. You perceive a place on your body and you can orient to it. Further, when the fly moves along the hand to the arm, you perceive it to be in different body locations. Similarly, when you want to move a finger, you can do so without making movements elsewhere. This specificity in perception and movement is enabled by sensory and motor maps of the body in your brain. Indeed, it was these body maps that Fritsch and Hitzig (1956) first found when they electrically stimulated the cortex. But maps are not just about the body. When we wander about the world we can identify places by sight and sound, so there must be visual and auditory maps as well, and these maps must somehow be coordinated because sights and sounds subjectively appear to be in the same place.
Each sensory system has more than one map of the world. This is because maps are often quite specific. Although we perceive shape and color of objects to be a single thing, they are represented by separate maps (color versus shape) in the brain. We can make a similar distinction between the sensations of touch and pain in the skin. Similarly, sounds differ in their pitch as well as their meaning (language versus musical sounds). We therefore can think of the brain’s creation of sensory experience as a series of maps of different aspects of sensory information. One major change in the brain during evolution is the creation of more and more maps as the brain grows larger. Furthermore, species differences in sensory capacities reflect not only differences in the number of maps but also in the nature of the maps. Jerison (1991) suggested that the intelligence of a given species is related to the number of maps. As the brain develops more maps, it is necessary to bind these maps together to form single percepts from equivalent maps. One way to do this is to label the equivalencies to organize them. The labels would designate objects by their place and time in the external world. Labels can thus act to organize information and therefore form the basis of thought.
Sensory and Motor Functions Are Relatively Separated
One of the oldest established laws of nervous system function is the law of Bell and Magendie (nineteenth-century Scottish and French anatomists). They noticed in four-legged animals that sensory input to the spinal cord entered the top (dorsal) part of the cord, whereas motor outputs left via the bottom (ventral) part. (In upright-walking animals like humans, the dorsal region becomes the back and the ventral region the front but the principle remains.) This distinction between sensory and motor regions is maintained in the brain as well. Recall that the midbrain has a sensory region, the colliculi, and a motor region, the tegmentum. The sensory-motor distinction is obvious in the forebrain as well. We have separate maps for skin sensation and muscle movements. It is obvious that although sensory and motor functions are separated, they must also be closely related or we could not organize our movements to specific places and things. Recall, for example, that the sensory and motor regions of the midbrain act together to allow the brain to orient the body to visual and auditory stimuli. The integration of sensory and motor functions is therefore a critical function of the brain.
Inputs and Outputs of the Brain Are Crossed
One peculiar feature of brain organization is that most of the inputs and outputs are “crossed.” For example, the sensory inputs from the right side of the body, and thus information from the right side of the world, go to the left side of the brain, and the motor outputs of the left side of the brain go back to the right side of the body. This crossed organization explains why people with brain injury to the left side of the brain may have difficulty in moving the right side of the body. Animals with eyes on the sides of their heads, such as horses, have an arrangement of visual input with nearly all input from the left eye going to the right side of the brain and the right eye to the left side. Animals with the eyes located side-by-side at the front of the face, such as humans and cats, have a slightly different arrangement in which only about half of the projection crosses. The input from the left side of each eye’s view goes to the right hemisphere, and that from the right side of the view goes to the left hemisphere. This arrangement shows that it is the views of each side of the viewer’s world that are projected to the opposite brain hemisphere. A crossed brain must somehow join the two sides of the perceptual world together. As a result, innumerable connections link the two sides of the brain. The most prominent connection in the human brain is the corpus callosum, which is a large bundle of about 200 million nerve fibers that joins the cortex of the left and right hemispheres of the brain.
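The crossed arrangement for forward-facing eyes can be made concrete with a small sketch. It is only an illustration, not an anatomical model: it assumes that "side" means half of the visual field (the viewer's world), that each eye sees both halves equally, and the function names are hypothetical.

```python
# Toy illustration of the crossed visual projection for forward-facing eyes.
# Assumptions (not from the chapter): "side" means half of the visual field,
# and each eye contributes equally to both halves.

OPPOSITE = {"left": "right", "right": "left"}


def target_hemisphere(eye: str, field_half: str) -> str:
    """Return the hemisphere that receives one half of one eye's view.

    The eye argument is deliberately unused: in this scheme the destination
    depends only on which half of the visual field the input comes from.
    """
    return OPPOSITE[field_half]


def crossing_fraction(eye: str) -> float:
    """Fraction of an eye's projection that ends in the opposite hemisphere."""
    halves = ("left", "right")
    crossed = sum(1 for half in halves if target_hemisphere(eye, half) != eye)
    return crossed / len(halves)


if __name__ == "__main__":
    for eye in ("left", "right"):
        print(eye, "eye crossing fraction:", crossing_fraction(eye))
```

Under these assumptions each eye sends half of its projection across the midline, which matches the statement that only about half of the projection crosses in animals such as humans and cats.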
Brain Anatomy and Function Display Both Symmetry and Asymmetry Although the left and right hemispheres look very similar, there are some asymmetrical features in both the gross anatomy as well as the details of cytoarchitecture. Asymmetry is critical for certain mental functions because we require a single representation of sensory or motor functions to make appropriate behaviors. Consider language. If language were represented on both sides of the brain, we would have the disconcerting ability to speak out of both sides of our mouth at the same time. A simple solution is to locate language on one side of the brain, the left. The same organization holds for bird song—it is also located on the left side of the bird’s brain. The problem in processing of spatial information is handled in the same way. If we want to make a movement in space, we need to direct both sides of the body to the same place, and so one hemisphere organizes spatial behavior. Note, however, that we still need to be able to move our arms to different places and so exert motor control on both sides of the brain for these movements. Thus, although the hemispheres appear symmetrical structurally, they are asymmetrically involved
in behavior, with language normally found in the left hemisphere and various aspects of spatial behavior located in the right hemisphere. We can now see why patients with surgery to cut the corpus callosum have difficulties and essentially have two minds, one in the left and one in the right hemisphere. Because language is in the left hemisphere, only the left side can speak, but, similarly, because many spatial functions are organized in the right hemisphere, only the right hemisphere can control some visuospatial functions “normally.”
The Brain Works by a Juxtaposition of Excitation and Inhibition
Although we have emphasized the brain’s role in making movements, we must also recognize that the brain acts to prevent movements as well. In order to make a directed movement such as picking up a glass of water, we must also not make other movements such as moving the hand back and forth. Thus, in producing movement, the brain through excitation produces some action and through inhibition prevents other action. One of the best examples of the control of excitation and inhibition can be seen in patients with Parkinson’s disease. Parkinson’s patients have an uncontrollable shaking of the hands because they have a failure in the system that inhibits such movements. Paradoxically, they often have difficulty in initiating movements and appear frozen because they are unable to generate the excitation needed to produce movements. This juxtaposition of excitation and inhibition is central to how the brain produces behavior and can be seen at the level of individual neurons. All neurons have a spontaneous rate of activity that can be either increased (excitation) or decreased (inhibition). Additionally, some neurons act to excite others, whereas other neurons are inhibitory. These excitatory and inhibitory actions are produced by specific neurochemicals via which neurons communicate.
The primary excitatory chemical in the brain is glutamate and the primary inhibitory chemical in the brain is gamma-aminobutyric acid (GABA). Just as individual neurons can act in an excitatory or inhibitory manner, so can brain regions. This distinction can be seen in the effects of brain disease or injury. A brain injury to a region that normally initiates speech may render the person unable to talk whereas those with an injury to a region that inhibits inappropriate language (such as swearing) may be unable to inhibit this form of talking. Thus, brain injury can produce either a loss or a release of behavior via changes in the balance of excitation and inhibition.
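The idea that injury can produce either a loss or a release of behavior can be sketched with a toy rate calculation. This is a deliberately simplified illustration, not a physiological model: the additive rule, the function name, and all numbers are assumptions chosen only to show the direction of the effects.

```python
# Toy sketch of the balance of excitation and inhibition.
# Assumption: output activity is the spontaneous rate plus excitatory drive
# minus inhibitory drive, floored at zero. All values are arbitrary.

def output_rate(spontaneous: float, excitation: float, inhibition: float) -> float:
    """Return a non-negative activity level for a hypothetical target region."""
    return max(0.0, spontaneous + excitation - inhibition)


# Intact system: excitatory and inhibitory inputs are both present.
intact = output_rate(spontaneous=10.0, excitation=5.0, inhibition=5.0)

# Injury to the excitatory source: activity falls (a loss of behavior).
loss = output_rate(spontaneous=10.0, excitation=0.0, inhibition=5.0)

# Injury to the inhibitory source: activity rises (a release of behavior).
release = output_rate(spontaneous=10.0, excitation=5.0, inhibition=0.0)

if __name__ == "__main__":
    print("intact:", intact, "loss:", loss, "release:", release)
```

With these arbitrary values, removing the excitatory input lowers activity below the intact level and removing the inhibitory input raises it above that level, mirroring the loss and release described in the text.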
The Nervous System Functions on Multiple Levels Sensory and motor functions are carried out at many places in the brain. For example, both sensory processing and motor control occur in the spinal cord, the brain stem, as well as the forebrain. This multiplicity of functions results from the nature of brain evolution. Simple animals such as worms have mainly a spinal cord, more complex animals such as fish have a brain stem as well, and more complex animals have added a forebrain (Figure 8.9). Each new addition to the brain added a new level of behavioral complexity but did not discard previous levels of control. For example, as animals evolved legs, they also had to add forebrain area to move them and later when they developed independent digit movements this too required more forebrain area. The addition of new brain areas can be viewed as adding new levels of nervous system control. The new levels are not autonomous, however, but must be integrated into the existing neural systems. Adding the capacity to move fingers must be related to the prior capacity to move the limbs. Each new level can be conceived as a way of refining and elaborating the control provided by the earlier levels. The idea of levels can not only be seen in the addition of forebrain areas to refine the control of the brain stem but also within the forebrain we can see the addition of new areas. As mammals evolved, they developed an increased capacity to represent the world in the cortex, an ability that is related to the addition of more maps. The new maps must be related to the older ones, however, and again simply reflect an elaboration of the sensory world that was there before.
Brain Systems Are Organized Both Hierarchically and in Parallel A complication with adding multiple levels of brain area is that the levels must be extensively interconnected. There are two different solutions to the wiring problem: serial and parallel circuits (Figure 8.10). A serial circuit hooks up a linear series of all regions concerned with a particular function. Consider vision. In a serial system the information from the eyes goes to regions that detect the simplest properties such as color or brightness. This information would then be passed to another region that determines shape and then to another region that measures movement and so on until at the most complex level the information is understood to be your grandmother. Information therefore flows in a hierarchical manner sequentially from simpler to more complex regions. In addition to this hierarchy, there are parallel systems. Recall in our earlier example that patient D.F. could not
consciously perceive objects but could reach for them, thus reflecting a parallel conscious and unconscious visual system. But even within these two systems, there is parallel processing. Our color vision is well suited to distinguishing form and texture, and so is useful for perceiving grandmother. Our black-and-white vision is better suited to detecting the movement of objects. Color vision and black-and-white vision are dependent on different receptors in the eye (cones versus rods), different pathways to the cortex, and different functional areas in the cortex, and eventually different cognitive and behavioral functions. Similar arrangements occur for other sensory systems; for example, the body senses of fine touch and pressure, and of pain and temperature, are mediated by different parallel systems. The parallel/hierarchical organization of the brain is further reflected in the organization of complex mental
Figure 8.9 The anatomical and behavioral levels of the nervous system.
Highest remaining functional area, followed by behaviors:
Spinal cord (spinal). Reflexes: Responds by stretching, withdrawal, support, scratching, paw shaking, etc. to appropriate sensory stimulation.
Hindbrain (low decerebrate). Postural support: Performs units of movement (hissing, biting, growling, chewing, lapping, licking, etc.) when stimulated; shows exaggerated standing, postural reflexes, and elements of sleepwalking behavior.
Midbrain (high decerebrate). Spontaneous movement: Responds to simple features of visual and auditory stimulation; performs automatic behaviors such as grooming; performs subsets of voluntary movements (standing, walking, turning, jumping, climbing, etc.) when stimulated.
Hypothalamus, thalamus (diencephalic). Affect and motivation: Voluntary movements occur spontaneously and excessively but are aimless; shows well-integrated but poorly directed affective behavior; thermoregulates effectively.
Basal ganglia (decorticate). Self-maintenance: Links voluntary movements and automatic movements sufficiently well for self-maintenance (eating, drinking) in a simple environment.
Cortex (normal). Control and intention: Performs sequences of voluntary movements in organized patterns; responds to patterns of sensory stimulation. Contains circuits for forming cognitive maps and for responding to the relationships between objects, events, and things. Adds emotional value.
Note. Shading indicates the highest remaining functional area, in a hierarchy from spinal cord to cortex. From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
processes, such as language and memory. Consider, for example, that the meaning of words can be influenced by tone of voice. The actual word is stored in the left hemisphere language zones but the tone of voice is a function of the right hemisphere, again reflecting parallel processing of auditory information. The various levels of neural organization for mental processes thus may be fairly widely distributed in the brain, leading to the concept that such functions are really a result of the activity of a complex network of connected regions rather than a simple serial network.
Functions in the Brain Are Both Localized and Distributed
The identification of specific language regions (i.e., Broca’s and Wernicke’s areas) led to the idea that functions
were localized, at least in the forebrain. The fundamental problem, however, is in defining a function. Consider language as an example. Language includes the processes of producing words orally, in writing, and by sign language, as well as constructing complex compositions such as poems, stories, songs, and so on. Language also includes the comprehension of written, oral, and sign language, and even touched letters (Braille). Language also may include the capacity to use multiple languages. It also includes the ability to sing and play musical instruments. Language is clearly not a single function and must require many different types of neural processing that are widely distributed in the brain. People with selective brain injuries may lose specific language abilities to produce words, read words, understand words, and so forth. They may lose the ability to name living things but not inanimate things and vice versa. Only if damage is extensive is language extensively compromised. Thus, we can see that language is distributed in the brain, with specific language-related skills found in relatively discrete locations. Other psychological functions such as memory, social/emotional behavior, spatial behavior, and so on also show the same pattern of localization and distribution of function. It therefore would take massive disease or injuries to completely eliminate any complex function. Indeed, one of the characteristics of dementia diseases such as Alzheimer’s is that people can withstand widespread deterioration of the cortex and yet maintain remarkable functions until the disease is well progressed.
Figure 8.10 Models of cortical processing.
Note. A: Simple serial hierarchical model of cortical processing (primary, secondary, tertiary). B: Distributed hierarchical model (primary, then levels 2 to 4). From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
Sensory Input to the Brain Is Divided for Object Recognition and Motor Control
Sensory systems evolved first for controlling motion, not for the recognition of specific information. Simple organisms can detect information and move to or from it. It is not necessary to “perceive” an object to direct movements toward or away from it. As animals, and their behaviors, became more complex they began to evolve ways of representing their environment. In animals with complex brains, such as ourselves, there are distinct systems for producing movement toward objects and for recognizing objects. The visual system serves as an example as we discovered in the case of D.F. (Figure 8.11). Visual information goes from the eyes to the brain stem to visual regions of the occipital lobe where it becomes divided: one route, known as the ventral stream, is to the temporal lobe for object recognition whereas a second route, known as the dorsal stream, is to the parietal lobe for the guidance of movement relative to objects (Milner & Goodale, 2006). Evidence that these systems are independent can be seen in people with injuries to the ventral or dorsal stream, respectively (for a review see Milner & Goodale, 2006). People such as D.F. with ventral stream injuries are “blind” for the recognition of objects, yet they nevertheless shape their hand appropriately when asked to reach for the objects that they cannot identify. Consider reaching for a cup, for example. When normal subjects reach for a cup, their hand forms a shape that is different than when they reach for a spoon. People with ventral stream injuries can make appropriate hand shapes yet they do not consciously recognize the object. In contrast, people with dorsal stream injuries can recognize objects
Figure 8.11 Two streams of visual processing. [Labels: parietal lobe, occipital lobe, temporal lobe.]
Note. The dorsal stream is an unconscious online control of movement. The ventral stream is a conscious system for object recognition. From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
but make clumsy reaching movements because they do not form appropriate hand postures until they contact objects and then shape their hand based on tactile information. The recognition that the perception for movement and perception for thought are independent processes has important implications for understanding brain organization. First, these two systems provide an excellent example of parallel processing. Second, although our impression may be that we are aware of our sensory world, it is clear that the sensory analysis required for some movements is not conscious. Third, the presence of nonconscious and conscious brain processing underlies an important difference in our cognitive functions. The unconscious movement system is always acting in the present and in response to online sensory input. In contrast, the recognition system allows us to escape the present and to bring to bear information from the past. Thus, the recognition systems form the neural basis of enduring memory.
Prefrontal Cortex Combines Object and Motor Control Systems
The division of sensory systems into an object-related and a motor-control system does not shed light on how animals can decide to do something before goal objects are present. This form of long-term planning and behavioral organization evolved in parallel to the sensory systems and resides in a region known as the prefrontal cortex (Figure 8.12). The prefrontal cortex is the cerebral region lying at the frontmost region of the frontal lobe and is found in all mammals (Kolb, 2006). In primitive mammals, the prefrontal cortex is modest in size, but as more sensory maps are added in evolution, there is a corresponding increase in volume of the prefrontal cortex such that in humans this cortex represents about 15% of the cerebral cortex. The emergence of a large prefrontal region is correlated with the emergence of sophisticated behavioral abilities required to organize behavior in place and time. The temporal aspect includes the development of a concept of the role of the self over time: in the past, the present, and the future. Such abilities require a system that can constrain the search for sensory information, a process often referred to as “attention.” Attentional systems require a mechanism for continuous monitoring of both external and internal events, which essentially is a system designed for short-term, or working, memory. A loss of prefrontal function, such as in diseases like schizophrenia or drug addiction, results in a loss of executive control of behavior, leading to disorganized and maladaptive responses to sensory information. The complexity of prefrontal functioning requires that this cerebral region develop slowly, and it likely does not fully mature until about 20 years of age in humans.
Figure 8.12 The organization of the frontal lobe. [Labels: motor cortex, premotor cortex, central sulcus, prefrontal cortex (dorsolateral prefrontal cortex, inferior prefrontal cortex, orbital cortex).]
Note. From Introduction to Brain and Behavior, second edition, by B. Kolb and I. Q. Whishaw, 2005, New York: Worth. Reprinted with permission.
Individual Differences in Brain Organization
It is remarkable how different we can be from each other. In part that is because no two brains are identical. There are, however, several factors that increase the interindividual variation in the brain, two prominent ones being sex and handedness (for a review, see Kolb & Whishaw, 2008). Just as gonadal hormones produce differences in genitalia, these hormones also produce differences in brain structure and thus brain function. Sex-related differences can be seen both in the gross anatomy of brain regions such as in the hypothalamus, as well as in the details of cell structure in the forebrain. These anatomical differences lead to a wide range of behavioral differences, including the superior verbal ability of women and the superior spatial ability of men. The differences are not large, on the order of less
than a standard deviation, but they are consistent and are found across a wide range of populations and cultures. Similarly, there are differences in gross anatomy, cell structure, and connectivity in the right- and left-handed brain. Language provides a good example. At least 99% of right-handers have language in the left hemisphere but only about 67% of left-handers do. Although it is not known what anatomical differences predict which left-handers have left versus right hemisphere language, there is little doubt that there is some difference in neuronal organization that leads to the lateralization of language in the left versus the right.
Details of Brain Functioning Constantly Change
Variability in neuronal organization is not only related to factors such as sex and handedness but is also related to experience. Experience-dependent variability reflects the brain’s capacity to alter its structure and function in reaction to environmental diversity, thus reflecting a capacity that is often referred to as brain plasticity. Although this term is now commonly used in psychology and neuroscience, it is not easily defined and is used to refer to changes at many levels in the nervous system ranging from molecular events, such as changes in gene expression, to behavior (e.g., Kolb & Gibb, 2008; Shaw & McEachern, 2001). Brain plasticity is required for learning and memory functions. In fact, information is stored in the nervous system only if there are changes in neuronal connectivity. Forgetting presumably reflects a loss of the connections that represented the memory. Brain plasticity is not just a characteristic of the mammalian brain but is found in the nervous system of all animals including even the simplest animals, such as the nematode C. elegans that is only a millimeter or so long (e.g., Rankin, 2005). Nonetheless, larger brains have more capacity for change and thus are likely to show more variability in neuronal organization. Brain plasticity is not always a good thing. Analysis of the brains of animals given addicting doses of drugs such as cocaine or morphine has shown large changes in neuronal connectivity that are suspected of underlying some of the maladaptive behaviors related to addiction (for a review, see Robinson & Kolb, 2004). There are many other examples of pathological plasticity including pathological pain (Baranauskas, 2001), pathological response to sickness (Raison, Capuron, & Miller, 2006), epilepsy (Teskey, 2001), and dementia (Mattson, Duan, Chan, & Guo, 2001).
Psychological Functions Emerge from Extended Cerebral Networks
Psychological functions such as memory, attention, emotion, and language can be described by words but they remain hypothetical constructs. A construct like memory is not a single thing but rather a reflection of many subprocesses, which we collectively refer to as memory. For example, we have memory for places, objects, faces, music, words, motor skills, and so on, and each of these requires a distinctive type of sensory processing. Furthermore, we have short-term memories of ongoing events and long-term memories of long past events. We also have memories of specific events as well as memories for which we can ascribe no single experience (e.g., knowing your own name). Nonetheless, although it should be no surprise that the memory of an old song and the rules of tennis are housed independently in the brain, there is a natural temptation to think that memory is found in a place in the brain. It is not. Thus, psychological constructs such as memory are widely distributed in both cortical and subcortical regions. The same is true of other psychological functions. The brain is not built on the concept of psychological functions but rather is built to support the processes that underlie different aspects of the functions. One example is language. We noted that for most people language is processed in the left hemisphere; however, this is not because the brain evolved a place for language functions but rather because language requires certain types of auditory and motor processing that are housed in the left hemisphere.
SUMMARY
The brain has a long evolutionary history and from a beginning of a few scattered neurons it has evolved in some species into a large, centrally located organ that represents the past, the present, and the future. Nevertheless, despite differences in size and complexity, the brains of different animal species are built on the same plan such that different regions are readily recognizable in different brains. This no doubt accounts for the many similarities displayed by diverse animal species and also allows us to generalize from the behavior of animals with simpler brains to ourselves. The brain has grown larger by the growth of existing structures and the addition of new structures, and, accordingly, behavior has become more complex by the expansion of some abilities and the addition of new behavioral strategies. The human brain has retained more primitive neural systems via which it makes online unconscious responses to the world but added systems via which it can represent that world consciously as a past, present,
or future. In addition, despite having a basic anatomical plan, the brains of individuals vary enormously depending on their sex, handedness, and personal experience. Despite this plasticity, injury or disease that damages the brain reduces behavioral complexity by producing specific deficits if damage is limited, or generalized deficits if damage is extensive. Despite what neuroscience has learned about the brain, neuroscientists continue to study the factors that produce our conscious awareness of ourselves as individuals.
REFERENCES
Baranauskas, G. (2001). Pain-induced plasticity in the spinal cord. In C. A. Shaw & J. C. McEachern (Eds.), Toward a theory of neuroplasticity (pp. 373–386). Philadelphia: Psychology Press.
Fritsch, G., & Hitzig, E. (1956). On the electrical excitability of the cerebrum. In G. von Bonin (Ed.), The cerebral cortex (pp. 73–96). Springfield, IL: Charles C Thomas.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., & Malach, R. (2004). Intersubject synchronization of cortical activity during natural vision. Science, 303, 1634–1640.
Hebb, D. O. (1949). Organization of behavior. New York: McGraw-Hill.
Hodgkin, A. L., & Huxley, A. F. (1952). A quantitative description of membrane current and its application to conduction and excitation in nerve. Journal of Physiology, 116, 497–506.
Holmes, G. (1918). Disturbances of vision by cerebral lesions. British Journal of Ophthalmology, 2, 353–384.
Jerison, H. (1991). Brain size and the evolution of mind. New York: American Museum of Natural History.
Kolb, B. (2006). Do all mammals have a prefrontal cortex? In J. Kaas (Ed.), Evolution of nervous systems: A comprehensive review (Vol. 3, pp. 443–450). New York: Elsevier.
Kolb, B., & Gibb, R. (2008). Principles of neuroplasticity and behavior. In D. Stuss, I. Robertson, & G. Winocur (Eds.), Brain plasticity and rehabilitation (pp. 6–21). New York: Oxford University Press.
Kolb, B., & Tees, R. C. (1990). The cerebral cortex of the rat. Cambridge, MA: MIT Press.
Kolb, B., & Whishaw, I. Q. (2003). Fundamentals of human neuropsychology (5th ed.). New York: Worth.
Kolb, B., & Whishaw, I. Q. (2005). Introduction to brain and behavior (2nd ed.). New York: Worth.
Kolb, B., & Whishaw, I. Q. (2008). Fundamentals of human neuropsychology (6th ed.). New York: Worth.
Lashley, K. S. (1960). Functional determinants of cerebral localization. In F. A. Beach, D. O. Hebb, C. T. Morgan, & H. W. Nissen (Eds.), The neuropsychology of Lashley (pp. 328–344). New York: McGraw-Hill.
Mattson, M. P., Duan, W., Chan, S. L., & Guo, Z. (2001). Apoptotic and antiapoptotic signaling at the synapse: From adaptive plasticity to neurodegenerative disorders. In C. A. Shaw & J. C. McEachern (Eds.), Toward a theory of neuroplasticity (pp. 402–426). Philadelphia: Psychology Press.
Milner, D., & Goodale, M. A. (2006). The visual brain in action (2nd ed.). New York: Oxford.
Penfield, W., & Roberts, L. (1956). Speech and brain mechanisms. Princeton, NJ: Princeton University Press.
Petrides, M. (2005). Lateral prefrontal cortex: Architectonic and functional organization. Philosophical Transactions of the Royal Society of London. Series B, 360, 781–795.
Raison, C. L., Capuron, L., & Miller, A. H. (2006). Cytokines sing the blues: Inflammation and the pathogenesis of depression. Trends in Immunology, 27, 24–31.
Rankin, C. H. (2005). Nematode memory: Now, where was I? Current Biology, 15, R374–R375.
Robinson, T. E., & Kolb, B. (2004). Structural plasticity associated with drugs of abuse. Neuropharmacology, 47(Suppl 1), 33–46.
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery, and Psychiatry, 20, 11–21.
Shaw, C. A., & McEachern, J. C. (Eds.). (2001). Toward a theory of neuroplasticity (pp. 176–192). Philadelphia: Psychology Press.
Sherrington, C. S. (1948). The integrative action of the nervous system. New Haven, CT: Yale University Press.
Sperry, R. W. (1974). Lateral specialization in the surgically separated hemispheres. In F. O. Schmitt & F. G. Worden (Eds.), Neurosciences: Third study program (pp. 5–20). Cambridge, MA: MIT Press.
Teskey, G. C. (2001). Using kindling to model the neuroplastic changes associated with learning and memory, neuropsychiatric disorders, and epilepsy. In C. A. Shaw & J. C. McEachern (Eds.), Toward a theory of neuroplasticity (pp. 347–358). Philadelphia: Psychology Press.
Weaver, I. C., Cervoni, N., Champagne, F. A., D’Alessio, A. C., Sharma, S., Seckl, J. R., et al. (2004). Epigenetic programming by maternal behavior. Nature Neuroscience, 7, 847–854.
Whishaw, I. Q., & Kolb, B. (2006). The behavior of the laboratory rat. New York: Oxford.
Woolsey, C. N. (1958). Organization of somatic sensory and motor areas of cerebral cortex. In H. F. Harlow & C. N. Woolsey (Eds.), Biological and biochemical basis of behavior (pp. 63–81). Madison: University of Wisconsin Press.
Chapter 9
Essentials of Functional Neuroimaging TOR D. WAGER, LUIS HERNANDEZ, AND MARTIN A. LINDQUIST
There has been explosive interest in the use of brain imaging to study cognitive and affective processes in recent years (Wager, Hernandez, Jonides, & Lindquist, 2007). The neuroimaging data from functional magnetic resonance imaging (fMRI) and positron emission tomography (PET) studies are central to the emerging fields of cognitive neuroscience, affective neuroscience, social cognitive neuroscience, neuroeconomics, and related neurobehavioral disciplines. fMRI and PET data are being combined with data on human performance and psychophysiology in increasingly sophisticated ways to yield models of human thought, emotion, and behavior. The best such models are informed by the rich histories of cognitive psychology and psychophysiology and, due largely to the integration of neuroimaging data, are grounded in brain physiology. This grounding permits stronger and more specific connections with the neurosciences and biomedical sciences, allowing behavioral scientists to leverage a vast and growing literature on brain systems developed in these fields.
All methods used in the human neurobehavioral sciences have limitations, and neuroimaging is no exception. The current trend is toward increasingly interdisciplinary approaches that use multiple methodologies to overcome some of the limitations of each method used in isolation. Recent advances in engineering and signal processing allow electroencephalography (EEG) and fMRI data to be collected simultaneously (Goldman, Stern, Engel, & Cohen, 2000), which provides improved temporal precision, among other benefits. Combined fMRI and EEG/magnetoencephalography (MEG) analyses are being developed that can provide better spatiotemporal resolution than either method alone (Dale et al., 2000; V. Menon, Ford, Lim, Glover, & Pfefferbaum, 1997). Neuroimaging data are also being combined with transcranial magnetic stimulation (TMS) to integrate the ability of neuroimaging to observe brain activity with the ability of TMS to manipulate brain function and examine causal effects (Bohning et al., 1997).
The rapid pace of development and interdisciplinary nature of the neurobehavioral sciences present an enormous challenge to researchers. Moving this kind of science forward requires a collaborative team with expertise in psychology, neuroanatomy, neurophysiology, physics, biomedical engineering, statistics, signal processing, and other disciplines depending on the research questions. True interdisciplinary collaboration is exceedingly challenging, because team members must know enough about the other disciplines to talk intelligently with experts in each field. Lead researchers on neuroimaging projects must know when to ask for help with various aspects of the project and what kind of expertise is needed. Supporting researchers must understand enough about the research questions and possibilities to bring their knowledge to bear in an optimal way.
The goal in this chapter is to review the basic techniques involved in the acquisition and analysis of neuroimaging data (and some recent developments) in enough detail to highlight the most important issues and concerns. We also provide an overall road map of what kinds of study design and analysis options are available and some of their important limitations. The aspects of PET and fMRI methodology are organized here into four sections. The first section deals with what neuroimaging techniques measure, including the essentials of PET and fMRI data acquisition and the relationship between brain activity and observed signals in each modality. The second section describes the hierarchical structure of neuroimaging data and how these data are used to make psychological inferences. We emphasize two kinds of inferences: forward inferences about brain activity given a psychological experimental manipulation, and reverse inferences about psychology given patterns of brain activation. This section also deals with statistical inferences about populations and the localization of results from functional neuroimaging studies. The third section discusses experimental designs for neuroimaging experiments, including some considerations that are particular to neuroimaging data. The fourth section deals with neuroimaging data analysis, including sections on artifacts and signal processing before analysis (preprocessing), the general linear model (GLM), brain-behavior and brain-physiology relationships, and methods for investigating brain connectivity.
Parts of this chapter are adapted from Wager, Hernandez, Jonides, and Lindquist (2007). Elements of functional neuroimaging. In Cacioppo, Tassinary, and Berntson (Eds.), Handbook of psychophysiology, fourth edition (pp. 19–55). Cambridge: Cambridge University Press. We would like to thank Dr. Doug Noll for providing Figure 9.3, and Matthew Davidson, Damon Abraham, Katherine Dahl, and Bryan Denny for helpful comments on the manuscript.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
WHAT NEUROIMAGING TECHNIQUES MEASURE
There are many ways to measure brain function, including fMRI, PET, single photon emission computed tomography (SPECT), electroencephalography (EEG) with analysis of event-related potentials (ERP; Fabiani, Gratton, & Federmeier, 2007; Pizzagalli, 2007), magnetoencephalography (MEG; Hämäläinen, Hari, Ilmoniemi, Knuutila, & Lounasmaa, 1993), and near-infrared spectroscopy (Villringer & Chance, 1997). Each of these techniques provides a unique window into the functions of mind and brain. In this chapter, we mainly focus on PET and fMRI, because they are the most widely used and provide the most anatomically specific information across the entire brain. The relatively good spatial resolution of PET and fMRI complements the precise timing information provided by EEG and MEG. In addition, the ability of fMRI to measure activity over the entire brain every 2 s or so offers great potential for synergy with animal research. Whereas animal electrophysiology and lesion experiments are often focused on a single region, neuroimaging can assess global function and interactions across large-scale brain systems.
PET and fMRI can be used in different ways, depending on the software and type of imaging chosen, to measure biological processes related to brain activity. Measures are generally obtained for each of a large number of local regions of brain tissue called “voxels” (three-dimensional pixels; imagine little cubes stacked together), providing 3-D brain maps. The partial list of popular measures and techniques shown in Table 9.1 includes measures of both brain structure and function. Structural measures may be divided into measures related to gray- and white-matter volume and density, and measures related to neurochemical receptors and other biomarkers. The most frequently used functional measures are those that measure processes related to overall neuronal/glial activity, referred to here as “activation.” These measures include measures of glucose metabolism, blood flow or perfusion in PET and arterial spin labeling (ASL), and the Blood Oxygen Level Dependent (BOLD) signal in fMRI. Activation and deactivation in both PET and fMRI reflect changes in neural activity only indirectly, and they measure
Table 9.1
Summary of PET and fMRI methods.

| What Is Imaged? | Technique | Analysis |
| --- | --- | --- |
| Techniques for Studying Brain Structure | | |
| Gray/white matter/CSF distinctions | T1-weighted imaging (MRI) | Voxel-based morphometry (VBM), volume-based measures, surface-based measures (e.g., cortical thickness) |
| Gray/white matter/CSF distinctions | T2-weighted imaging (MRI) | Same as preceding |
| White-matter structure | Diffusion tensor imaging (DTI) | Diffusion tractography |
| Neurochemical receptor occupancy | Radioligand binding (PET); GABA-A: [C-11] flumazenil; dopamine D2: [C-11] raclopride; mu-opioids: [C-11] carfentanil; acetylcholine: [F-18] epibatidine, [C-11] scopolamine; serotonin: [C-11] benzylamine; others | Kinetic modeling, Logan-plot analysis |
| Gene expression | PET radiolabeling; MR spectroscopy with kinetic modeling | |
| Metabolites and various biomarkers | MR spectroscopy | |
| Techniques for Studying Brain Function | | |
| Regional blood flow (perfusion) | [O-15] PET | Voxel-wise linear modeling; multivariate connectivity techniques |
| Relative Hb deoxygenation | Blood Oxygen Level Dependent (BOLD) signal, T2*-weighted image | Same as preceding |
| Glucose metabolism | [F-18]-fluorodeoxyglucose (FDG) PET | Same as preceding |
| Regional blood flow (perfusion) | Arterial spin labeling (ASL) fMRI | Same as preceding |
| Task-related neurochemistry | Radioligand binding (PET); see above | Kinetic modeling, Logan-plot analysis followed by linear modeling |
Table 9.2 Relative advantages of fMRI and PET.

| Dimension | Comparison |
| --- | --- |
| Advantages of fMRI | |
| Cost and availability | fMRI has lower cost, more facilities available |
| Spatial resolution | fMRI has higher resolution, but new PET scanners can have same functional resolution for group studies |
| Temporal resolution | fMRI is superior, permitting event-related designs |
| Brain connectivity analyses | fMRI permits time-series connectivity analysis; PET and fMRI both permit individual differences analysis |
| Combination with other measures | Simultaneous time-series acquisition of fMRI and EEG provides most detailed mapping of relationships |
| Single-subject studies | fMRI permits detailed high-resolution studies of individuals |
| Repeatability | fMRI does not use radioactive substances, so frequent scans are considered safe |
| Advantages of PET | |
| Measuring neurochemistry | PET is superior; can be used to directly investigate neurochemistry |
| Transparency of activation measures | PET provides more direct measures of blood flow or metabolism |
| Artifacts | PET does not suffer from magnetic susceptibility artifacts and gradient- or RF-related artifacts |
| Combination with other measures | PET is not magnetic and can be combined with simultaneous EEG, MEG, and TMS |
| Studying baseline activity | PET provides quantitative measure of baseline state; ASL fMRI also can, but is less commonly available |
| Naturalness of environment | PET is quieter and has more open physical environment; advantage for auditory and emotion tasks |
different biological processes related to brain activity, which may be broadly defined as the energy-consuming activity of neurons and glia, and the electrical and chemical signals they produce. Thus, both PET and fMRI can be used to measure brain activity, though each has unique advantages. These are summarized in Table 9.2.
Measures of Brain Structure
Structural Scans
MRI can provide detailed anatomical scans of gray and white matter with a spatial resolution well below 1 mm³. These images are used to localize functional results in individual or group-averaged brains. A growing set of measures related to brain structure allows for the analysis of changes with practice or development, effects of aging, and differences between healthy individuals and those with psychological disorders. A popular way of analyzing gray-matter density is the voxel-based morphometry (VBM) method (Ashburner & Friston, 2000; Good et al., 2001), which uses structural image intensity to measure gray- and white-matter density. Other methods use measures of cortical thickness derived from surface reconstruction and unfolding (Fischl, Sereno, & Dale, 1999; Van Essen & Dierker, 2007), or the volume of anatomically defined structures. A recent study reported that London taxi drivers, who had developed extensive expertise in spatial navigation, had larger posterior hippocampi (Maguire et al., 2000).
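The voxel-wise logic behind group comparisons such as VBM can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published VBM pipeline: it skips segmentation, spatial normalization, and smoothing, and it runs on simulated gray-matter density maps. The function name `voxelwise_t`, the group labels, the grid size, and all numeric values are hypothetical.

```python
import numpy as np

def voxelwise_t(group_a, group_b):
    """Pooled-variance two-sample t statistic computed at every voxel.

    group_a, group_b: arrays of shape (n_subjects, x, y, z).
    """
    n_a, n_b = group_a.shape[0], group_b.shape[0]
    mean_diff = group_a.mean(axis=0) - group_b.mean(axis=0)
    pooled_var = ((n_a - 1) * group_a.var(axis=0, ddof=1)
                  + (n_b - 1) * group_b.var(axis=0, ddof=1)) / (n_a + n_b - 2)
    return mean_diff / np.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))

# Simulated density maps on a tiny 4 x 4 x 4 voxel grid (hypothetical values).
rng = np.random.default_rng(0)
shape = (4, 4, 4)
controls = rng.normal(0.50, 0.05, size=(20, *shape))
experts = rng.normal(0.50, 0.05, size=(20, *shape))
experts[:, 1, 2, 3] += 0.20  # simulated local density increase in one voxel

t_map = voxelwise_t(experts, controls)                    # one t value per voxel
peak = np.unravel_index(np.argmax(t_map), t_map.shape)    # voxel with largest t
print("peak voxel:", peak)
```

The result is a statistical map with the same shape as the brain volume. In a real study the map would also need correction for the large number of voxel-wise tests.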
Both structural and functional MRI images are obtained using the same scanner; the only difference is in how the scanner is programmed. A brief overview of the image acquisition process is as follows. A sample (e.g., a brain) is placed in a strong magnetic field and exposed to a radiofrequency (RF) electromagnetic field pulse. The nuclei absorb the energy only at a particular frequency band, which is strongly dependent on their electromagnetic environment, and become “excited” (they change their quantum energy state). The nuclei then emit the energy at the same frequency as they “relax.” The same antenna that produced the RF field detects the returned energy. Pulse sequences, or software programs that implement particular patterns of RF and gradient magnetic field manipulations (manipulations of the magnetic field’s shape), are used to acquire data that can be reconstructed into a map of the MR signal sources, that is, an image of the brain. Pulse sequence programming is the province of physicists and bioengineers; such divisions of labor among physicists, psychologists, neuroscientists, and statisticians are a hallmark of neuroimaging, which is highly interdisciplinary in nature. For more in-depth information, we recommend two approachable texts (Elster, 1994; Huettel, Song, & McCarthy, 2004), and more detailed texts for the advanced reader (Bernstein, King, & Zhou, 2004; Haacke, 1999). The relaxation process can be described by three values: T1, T2, and T2*. T1 and T2 are constants determined by the spin frequency, field strength, and tissue type (largely based on the hydrogen content, which depends in turn on
how much water is in the tissue). T1 characterizes the rate at which spins relax back to alignment with the main magnetic field, and T2 characterizes the rate at which the transverse magnetization created by the RF pulse decays. T2* is like T2, but depends additionally on local inhomogeneities in magnetic susceptibility that are caused by changes in blood flow and oxygenation, among other factors. Different pulse sequences (patterns of RF excitations and data collection periods) produce images that are sensitive primarily to T1, T2, or T2*. Because T1 and T2 vary with tissue type but are otherwise constant, T1- and T2-weighted images can produce detailed representations of the boundaries between gray matter (mostly cell bodies), white matter (mostly axons), and cerebrospinal fluid (CSF). Because T2* is sensitive to flow and oxygenation, T2*-weighting is used to create images of brain function. An example of the same slice of tissue imaged with T1 and T2 weighting can be seen in Figure 9.1 A and B. The images look strikingly different. Changing the contrast mechanism can be very useful in differentiating brain structures or lesions, since some structures will be apparent in some kinds of images but not in others. Multiple sclerosis lesions are virtually invisible in T1-weighted images, but appear bright in T2-weighted images.
Figure 9.1 The same slice of brain tissue can appear very different, depending on which relaxation mechanism is emphasized as the source of contrast in the pulse sequence. Note: Using long echo times emphasizes T2 differences among tissue types, and shortening the repetition time emphasizes T1 differences among tissue types. The same slice of the brain acquired as A: a T1-weighted image and B: a T2-weighted image. C: Diffusion tensor imaging allows researchers to measure directional diffusion and reconstruct the fiber tracts of the brain. This provides a way to study how different brain areas are connected. Diffusion image is adapted from “Probabilistic Diffusion Tractography with Multiple Fibre Orientations: What Can We Gain?” by T. E. J. Behrens, H. Johansen-Berg, S. Jbabdi, M. F. S. Rushworth, & M. W. Woolrich, 2007, NeuroImage, 34, pp. 144–155. Adapted with permission.
Anatomical Connectivity
MRI pulse sequences may also be tuned to be sensitive to directional (anisotropic) patterns of water diffusion, which may be used to track the course of axon (fiber) tracts. Water diffuses more readily along the axons that make up the brain’s white matter than across them. Diffusion tensor imaging (DTI) is an increasingly popular technique for measuring directional diffusion and reconstructing the fiber tracts of the brain (Figure 9.1 C; Le Bihan et al., 2001). New tractography analyses for quantifying the thickness and connectivity of these tracts are being rapidly developed (Behrens et al., 2007). Such tools will increasingly allow researchers to analyze the relationships between structural connectivity and neuropsychological processes such as development, training, aging, cognitive and emotional function, and psychopathology (Johansen-Berg & Behrens, 2006). DTI can be combined with other techniques, such as fMRI or other anatomical and neurochemical measures. One study used DTI to define adjacent subregions of the medial prefrontal cortex, and then used
fMRI to show that the subregions responded differentially to different tasks (Johansen-Berg et al., 2004). Other Anatomical Measures PET imaging is complementary to MRI in an important way: It permits estimation of the density of a variety of neurochemical receptors across the brain. A radioactive label is chemically attached to a pharmacological agent and injected into the bloodstream. The agent is transported into the brain, where it binds to a specific class of receptors, depending on its biochemical nature. The PET camera detects the radiation emitted when the radioactive label decays, and so provides a 3-D map of the distribution of labeled substance across the brain. Kinetic models, which use systems of differential equations in conjunction with known kinetic properties of the pharmacological agent, can be used to quantify the label in extravascular space (tissue) and that bound to receptors. Related neurochemical measures, such as the rate of dopamine synthesis, can be obtained as well. This method is often used to study changes in endogenous neurochemical release, and we describe it more fully later in this chapter. In addition, MR spectroscopy provides a way of testing for the presence of biochemicals and some kinds of gene expression in a brain volume of interest, though this has not been widely applied yet in the cognitive neurosciences. Certain compounds produce well-defined peaks in the measured frequency spectrum, and can be readily detected, but many compounds of interest in neuroscience cannot.
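Returning to the T1- and T2-weighting of structural MRI described earlier in this section, the way that repetition time (TR) and echo time (TE) select the contrast can be illustrated with the standard spin-echo signal approximation, S = PD · (1 − e^(−TR/T1)) · e^(−TE/T2). The sketch below is illustrative only: the equation is a textbook approximation that the chapter does not state, and the proton densities, relaxation times, and sequence timings are rough round numbers assumed for the example, not values from the chapter.

```python
import math

# Assumed, rounded tissue parameters (times in ms); for illustration only.
TISSUES = {
    "white matter": {"pd": 0.7, "t1": 600.0, "t2": 80.0},
    "gray matter": {"pd": 0.8, "t1": 900.0, "t2": 100.0},
    "CSF": {"pd": 1.0, "t1": 4000.0, "t2": 2000.0},
}

def spin_echo_signal(pd, t1, t2, tr, te):
    """Approximate spin-echo signal for one tissue at a given TR and TE."""
    return pd * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)

def rank_tissues(tr, te):
    """Tissue names ordered from brightest to darkest for a given TR and TE."""
    signals = {name: spin_echo_signal(tr=tr, te=te, **params)
               for name, params in TISSUES.items()}
    return sorted(signals, key=signals.get, reverse=True)

# Short TR and short TE emphasize T1 differences.
t1_weighted = rank_tissues(tr=500.0, te=15.0)
# Long TR and long TE emphasize T2 differences.
t2_weighted = rank_tissues(tr=4000.0, te=100.0)

print("T1-weighted, brightest first:", t1_weighted)
print("T2-weighted, brightest first:", t2_weighted)
```

With these assumed values the brightness ordering of the three tissues reverses between the two settings, which is why the same slice can look so different under the two weightings.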
Measures of Brain Activity Using Positron Emission Tomography
Perhaps the most frequent use of both PET and fMRI is the study of metabolic and vascular changes that accompany changes in neural activity. With PET, we can separately measure glucose metabolism, oxygen consumption, and regional cerebral blood flow (rCBF). Each of these techniques allows us to make inferences about the localization of neural activity based on the assumption that neural activity is accompanied by a change in metabolism, in oxygen consumption, or in blood flow. The PET camera provides images by detecting the radiation that results from positrons emitted by a radioactive tracer; the frequencies of these events are reconstructed into three-dimensional volumes. Positrons are subatomic particles having the same mass as an electron but opposite charge; they are “antimatter electrons.” The most common radioactive tracers are ¹⁵O (oxygen-15), commonly used in blood-flow studies, ¹⁸F (fluorine-18), used in deoxyglucose mapping, and ¹¹C (carbon-11) or ¹²³I (iodine-123), used to label raclopride and other receptor agonists and antagonists. The decay rate of such isotopes is quite fast, and their half-lives vary from a couple of minutes to a few hours, which means that a cyclotron must be available nearby to synthesize the radioactive tracer minutes before each PET scan. The tracer is injected into the subject’s bloodstream in either a bolus or a constant infusion that produces a steady-state concentration of tracer in the brain. As the tracer decays within the blood vessels and tissue of the brain, positrons are emitted. The positrons collide with nearby electrons (being oppositely charged, they attract), annihilating both particles and emitting two photons that shoot off in opposite directions. Scintillation detectors positioned in an array around the participant’s head detect the photons. The fact that matched pairs of photons travel in exactly opposite directions and reach the detectors simultaneously is important for the tomographic reconstruction of the three-dimensional locations where the particles were annihilated. Note that the scanner does not directly detect the positrons themselves; it detects the energy that results from their annihilation. Depending on the design, most PET scanners are made up of an array of detectors that are arranged in a circle around the patient’s head, or in two separate flat arrays that are rotated around the patient’s head by a gantry. To detect simultaneously occurring pairs of photons, each pair of detectors on opposite sides of the participant’s head must be wired to a “coincidence detector” circuit, as illustrated in Figure 9.2. Small tubes (called septa or collimators) are placed around the detectors to shield them from radiation from the sides and help prevent coincidences due to background radiation.
Figure 9.2 A schematic diagram of the main components of a PET scanner. (Labeled components: positron emitted by isotope decay; neighboring electron; annihilation emitting two photons in opposite directions, 180 degrees apart; two scintillation counters; coincidence detector; computer; display.)
The injected tracer will be distributed throughout the blood vessels and tissue of the brain (indeed, throughout the rest of the body as well). Each pair of detectors counts photons emitted within the column of tissue between them. The density of photons that were emitted at each location can be calculated mathematically from the number of counts at each position or “projection.” PET images are simply maps of how many positron annihilation events occurred in the slice of interest. A more complete explanation of PET image formation, including a discussion of filtered back projection and other methods, can be found in several good texts (Bendriem & Townsend, 1998; Sandler, 2003).
What do PET counts reflect? The answer depends on what molecule the label is attached to and where that molecule goes in the brain. Ideally, for ¹⁵O PET, counts reflect the rate of water uptake into tissue. ¹⁸F-fluorodeoxyglucose (FDG) PET measures glucose uptake, whereas ¹¹C-raclopride PET measures dopamine D2 receptor binding. In practice, the observed level of signal depends on several factors, including the concentration of the radiolabeled substance in the blood, the blood flow and volume, the presence of other endogenous chemicals that compete with the labeled substance, and kinetic properties such as the binding affinity of the substance to receptors, the rate of dissociation of the substance from receptors, and the rate at which the substance is broken down by endogenous chemicals. Accurate quantification of binding requires study of the kinetic properties of the substance in animals and the use of this information in kinetic models, which use differential equations to estimate the biological parameters of interest (e.g., ligand bound specifically to the receptor type of interest). Kinetic models have been developed to estimate how much tracer is contained in different categories, or compartments, of blood and tissue.
Different forms of kinetic modeling have different numbers of compartments; a two-compartment model estimates how much of the radiolabeled compound is in the vasculature as opposed to in the brain. A three-compartment model used in receptor binding studies estimates tracer quantities in blood, free tracer in tissue, and label bound to receptors. Often a reference region with few or no receptors (the cerebellum for dopamine) is used to model the separation of free from bound tracer; this requires the assumption that none of the signal in the reference region comes from bound tracer. A four-compartment model additionally separates tracer bound to receptors of a specific type (called specific binding) from those bound to other receptors (called nonspecific binding). For more details, we refer you to Frey (1999).
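The three-compartment idea (tracer in blood, free tracer in tissue, tracer bound to receptors) can be sketched as a pair of differential equations integrated with a simple Euler scheme. This is a toy illustration, not the models used in practice: the rate-constant names K1, k2, k3, and k4 follow common convention but do not appear in the chapter, their values are hypothetical, and the plasma input is assumed to be a simple exponentially clearing bolus. Setting k3 to zero mimics a reference region with no receptors.

```python
import math

def simulate_tracer(k1, k2, k3, k4, minutes=60.0, dt=0.01):
    """Euler integration of free and bound tracer in tissue.

    d(free)/dt  = k1 * plasma - (k2 + k3) * free + k4 * bound
    d(bound)/dt = k3 * free - k4 * bound
    Returns (free, bound) at the end of the simulated scan.
    """
    free, bound = 0.0, 0.0
    steps = int(round(minutes / dt))
    for i in range(steps):
        t = i * dt
        plasma = math.exp(-t / 5.0)  # assumed bolus clearing from the blood
        d_free = k1 * plasma - (k2 + k3) * free + k4 * bound
        d_bound = k3 * free - k4 * bound
        free += dt * d_free
        bound += dt * d_bound
    return free, bound

# Hypothetical rate constants (per minute).
target = simulate_tracer(k1=0.1, k2=0.2, k3=0.1, k4=0.05)     # region with receptors
reference = simulate_tracer(k1=0.1, k2=0.2, k3=0.0, k4=0.05)  # region without receptors

print("target    (free, bound):", target)
print("reference (free, bound):", reference)
```

In this sketch the receptor-rich region retains tracer after the blood has cleared, while the reference region holds only free tracer, which is the contrast that reference-region methods exploit. Real analyses fit such models to measured time-activity curves rather than simulating with known constants.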
Measures of Brain Activity Using Functional Magnetic Resonance Imaging
Unlike PET, which can provide measures of overall activity or specific neurochemical systems, fMRI is principally used to obtain measures of regional brain activity (see Table 9.1). The most popular method is currently the Blood Oxygenation Level Dependent (BOLD) signal (Kwong et al., 1992; Ogawa et al., 1992), which is obtained using T2*-weighted images. Other methods are available but less widely used, including several varieties of Arterial Spin Labeling (ASL; Williams, Detre, Leigh, & Koretsky, 1992), which use pulse sequences sensitive to blood volume or cerebral perfusion. We focus here on BOLD physiology because it is overwhelmingly the most common method in current use.
BOLD imaging takes advantage of the difference in T2* between oxygenated and deoxygenated hemoglobin. As neural activity increases, so does metabolic demand for oxygen and nutrients. Capillaries in the brain containing oxygen and nutrient-rich blood are separated from brain tissue by a lining of endothelial cells, which are connected to astroglia, a major type of glial cell that provides metabolic and neurochemical-recycling support for neurons. Neural firing signals the extraction of oxygen from hemoglobin in the blood, likely through glial processing pathways (Shulman, Rothman, Behar, & Hyder, 2004; Sibson et al., 1997). As oxygen is extracted from the blood, the hemoglobin becomes paramagnetic (iron atoms are more exposed to the surrounding water), which creates small distortions in the B0 field that cause a T2* decrease (a faster decay of the signal). Increases in deoxyhemoglobin can lead to a decrease in the BOLD signal, often referred to as the “initial dip.” The initial decrease in signal (whose existence is controversial) is followed by an increase, due to an over-compensation in blood flow that tips the balance toward oxygenated hemoglobin (and less signal loss due to dephasing), which leads to a higher BOLD signal.
Initially, fMRI was performed by injection of contrast agents (such as iron) with paramagnetic properties, but the discovery that the T2* relaxation time of oxygenated hemoglobin was longer than that of deoxygenated hemoglobin led to BOLD imaging as it is currently used with humans, without contrast agents (Kwong et al., 1992; Ogawa, Lee, Kay, & Tank, 1990). How well does the BOLD signal reflect increases in neural firing? The answer to this important question is complex, and understanding the physiological basis of the BOLD response is currently a topic of intense research (Buxton & Frank, 1997; Buxton, Uludag, Dubowitz, & Liu, 2004; Heeger & Ress, 2002; Vazquez & Noll, 1998). Some relationships among factors that contribute to BOLD signal are summarized in Figure 9.3.
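To see why a small change in T2* matters, it can help to work through the arithmetic under a common simplification: the T2*-weighted signal is treated as a single exponential decay, S = S0 · exp(−TE / T2*), where TE is the echo time. The echo time and the two T2* values below are illustrative assumptions, not measurements reported in this chapter.

```python
import math


def t2star_signal(s0, te_ms, t2star_ms):
    """Mono-exponential T2*-weighted signal at echo time te_ms (simplified model)."""
    return s0 * math.exp(-te_ms / t2star_ms)


# Hypothetical values: a 1 ms lengthening of T2* when blood is more oxygenated.
baseline = t2star_signal(s0=100.0, te_ms=30.0, t2star_ms=45.0)
active = t2star_signal(s0=100.0, te_ms=30.0, t2star_ms=46.0)
percent_change = 100.0 * (active - baseline) / baseline

print(f"baseline={baseline:.2f}  active={active:.2f}  change={percent_change:.2f}%")
```

Under these assumed numbers the signal change is on the order of a percent or two, which gives a sense of how small the effect being measured is relative to the total image intensity.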
Essentials of Functional Neuroimaging
Figure 9.3 Influences on T2*-weighted signal in BOLD fMRI imaging. Diagram labels, from cause to measured signal: Brain Function (Neuronal Activity); Metabolic Rates (Glucose and Oxygen Metabolism); Vascular Physiological Effects (Cerebral Blood Flow [CBF], Cerebral Blood Volume [CBV]); Physical Effects (Blood Oxygenation); MR Properties (Decay Time [T2*]); T2*-weighted Image Intensity. Other Factors: Vessel Diameter, Vessel Orientation, Hematocrit, Blood Volume Fraction, Magnetic Field Uniformity (microscopic). Note: Courtesy of Dr. Doug Noll.
The BOLD signal corresponds relatively closely to the local electrical field potential surrounding a group of cells (which is itself likely to reflect changes in postsynaptic activity) under many conditions. Demonstrations by Logothetis and colleagues have shown that high-field BOLD activity closely tracks the position of neural firing and local field potentials in cat visual cortex, even to the locations of specific columns of cells responding to particular line orientations (Logothetis, Pauls, Augath, Trinath, & Oeltermann, 2001). Under other conditions, however, neural activity and the BOLD signal may become decoupled (Disbrow, Slutsky, Roberts, & Krubitzer, 2000). For these reasons and others, the BOLD signal is only likely to reflect a portion of the changes in neural activity in response to a task or psychological state. Many regions may show changes in neural activity that are missed because they do not change the net metabolic demand of the region. Another important question is whether BOLD signal increases reflect neural excitation or inhibition. Some
research supports the idea that much of the glucose and oxygen extraction from the blood is driven by the metabolism of glutamate, a major (usually) excitatory transmitter in the brain. Shulman and Rothman (1998) suggest that increased glucose uptake is controlled by astrocytes, whose end-feet contact the endothelial cells lining the walls of blood vessels. Glutamate is released by 60% to 90% of the brain’s neurons. When glutamate is released into synapses, it is taken up by astrocytes and transformed into glutamine. When glutamate activates the uptake transporters in an astrocyte, it may signal the astrocyte to increase glucose uptake from the blood vessels. Although it remains plausible that some metabolic (and BOLD) increases could be caused by increased inhibition of a region, in many tasks where both BOLD studies and neuronal recordings have been made, BOLD increases are found in regions in which many cells increase their activity. This is true in studies of visual processing, eye movements, task switching, working memory, food reward, pain, and other domains.
What Neuroimaging Techniques Measure
Measures of Functional Neurochemistry Using Positron Emission Tomography
The affinity of particular pharmacological agents for certain types of neurotransmitter receptors, such as raclopride for dopamine D2 receptors, provides a way to investigate the functional neurochemistry of the human brain. Radioactive labels such as C-11, a radioactive isotope of carbon, are synthesized in a cyclotron and attached to the pharmacological agent. Labeled compounds are injected into the arteries by either a bolus (a single injection) or continuous infusion, typically until the brain concentrations reach steady state. This method can be used to image task-dependent neurotransmitter release. As radioactively labeled compounds bind to receptors, the label degrades and gamma rays are emitted that are detected by the PET camera. When endogenous neurotransmitters are released in the brain, there is greater competition at receptors, and less binding of the labeled substance (referred to as specific binding). Thus, neurotransmitter release generally results in a reduction in radioactivity detected by the PET camera.
The most common radioligands and transmitter systems studied are dopamine (particularly D2 receptors) using [11C]raclopride or [123I]iodobenzamide, muscarinic cholinergic receptors using [11C]scopolamine, opioids using [11C]carfentanil, and benzodiazepines using [11C]flumazenil. In addition, radioactive compounds that bind to serotonin, opioid, and several other receptors have been developed. As described earlier, because the dynamics of radioligands are complex, pharmacological agents must be carefully selected and tested in animals. Parameters from these studies are used in kinetic models to aid in quantifying how much labeled substance is bound to the receptor type of interest (Frey, 1999).
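As a rough illustration of why transmitter release lowers the measured signal, the sketch below uses the standard equilibrium expression for competitive binding at a single receptor site. The concentrations and affinity constants are hypothetical values in arbitrary units, and the function name is ours; actual studies rely on the kinetic models cited above rather than on this simplification.

```python
def tracer_occupancy(tracer_conc, tracer_kd, competitor_conc, competitor_ki):
    """Fraction of receptors occupied by tracer when a competitor is present.

    Equilibrium, single-site competitive binding:
        occupancy = L / (L + Kd * (1 + C / Ki))
    """
    return tracer_conc / (
        tracer_conc + tracer_kd * (1.0 + competitor_conc / competitor_ki)
    )


# Hypothetical endogenous transmitter levels at rest and during a task.
rest = tracer_occupancy(tracer_conc=0.1, tracer_kd=1.0,
                        competitor_conc=50.0, competitor_ki=100.0)
task = tracer_occupancy(tracer_conc=0.1, tracer_kd=1.0,
                        competitor_conc=100.0, competitor_ki=100.0)

print(f"tracer occupancy at rest: {rest:.4f}")
print(f"tracer occupancy on task: {task:.4f}")
```

With more endogenous transmitter competing for the same receptors, the tracer occupies a smaller fraction of them, so fewer decay events originate from bound tracer.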
Limitations of Positron Emission Tomography and Functional Magnetic Resonance Imaging
As you might expect, both PET and MRI have their share of pitfalls. You should consider the limitations of each technique not only when designing experiments, but also when examining the neuroimaging literature. Always ask the following question: “Are the activations caused by the experimental paradigm or by other unwanted sources?” Conversely, you should also ask: “Were there other active regions that were missed by the experimental paradigm?” Some of these errors may have occurred because of the spatial or temporal limitations of the technique, or they may be due to image artifacts or mischaracterized noise.
Spatial Limitations
Neither PET nor fMRI is well-suited for imaging small subcortical nuclei or cortical microcircuitry, though advances
in high-field imaging and parallel acquisition methods are helping. The spatial resolution of PET is on the order of 1 to 1.5 cm³. fMRI resolution can be less than 1 mm³ in high-field imaging in animals, but is typically on the order of 27 to 36 mm³ or more for human studies. Thus, features such as cortical columns and even major subnuclei (e.g., there are 30 or so in each of the amygdala and thalamus) cannot typically be identified. The limiting factors in fMRI include signal strength and the point-spread function of BOLD imaging, which tends to extend beyond neural activation sites into draining veins (Duong et al., 2002). Careful work in individual participants has demonstrated the imaging of ocular dominance columns in humans (Cheng, Waggoner, & Tanaka, 2001). While this resolution does not sound all that bad, another factor seriously limits the spatial resolution in most studies. That is, making inferences about populations of subjects requires analyzing groups of individuals, each with a different brain. Usually, individual brains are aligned to one another through a registration or warping process (see Data Analysis: Implementation, later in this chapter), which introduces substantial blurring and noise in the group average. Thus, the effective resolution for group fMRI and PET studies is about the same. One estimate based on meta-analysis is that the spatial variation in the location of an activation peak among comparable group studies is 2 to 3 cm (Wager, Reading, & Jonides, 2004). Overcoming these limitations with high-resolution fMRI is a challenging and developing research area. By focusing on particular regions and omitting data collection in much of the brain, voxels on the order of 1.5 mm per side can be acquired, yielding fMRI maps with resolution closer to the physical size of functional subregions (e.g., cortical fields within the hippocampus, or nuclei in the brain stem).
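The voxel sizes quoted above are easy to check by hand. The short calculation below assumes example voxel dimensions of 3 × 3 × 3 mm and 3 × 3 × 4 mm, which are our assumptions chosen to match the 27 to 36 mm³ range, and compares them with an isotropic 1.5 mm voxel.

```python
def voxel_volume_mm3(x_mm, y_mm, z_mm):
    """Volume of a rectangular voxel in cubic millimeters."""
    return x_mm * y_mm * z_mm


standard = voxel_volume_mm3(3.0, 3.0, 3.0)      # assumed typical human voxel
thick_slice = voxel_volume_mm3(3.0, 3.0, 4.0)   # same in-plane size, thicker slice
high_res = voxel_volume_mm3(1.5, 1.5, 1.5)      # high-resolution acquisition

# Fraction of tissue volume per voxel retained at high resolution.
volume_ratio = high_res / standard

print(f"standard={standard} mm^3, thick={thick_slice} mm^3, high-res={high_res} mm^3")
print(f"high-res voxel holds {volume_ratio:.3f} of the standard voxel volume")
```

Halving each side of the voxel cuts its volume to one eighth, which is why the gain in resolution comes with a substantial cost in the amount of tissue, and hence signal, contributing to each measurement.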
This technique provides several advantages over standard mapping techniques. First, resolution can potentially be considerably enhanced, particularly when using high-field imaging and analysis techniques that remove some spread in fMRI signal due to draining veins (R. S. Menon, 2002). Second, collecting thinner slices can reduce susceptibility artifacts and improve imaging around the base of the brain (Morawetz et al., 2008). Finally, limitations in group studies related to interindividual variability can be partially overcome using identification of regions of interest on individual participants’ anatomical images or by advanced cortical unfolding and intersubject warping techniques (Zeineh, Engel, Thompson, & Bookheimer, 2003). However, there are costs as well. There is a substantial loss in signal due to the smaller volume of each voxel. In addition, coregistration techniques that ensure structure-to-function correspondence and normalization techniques typically used to provide intersubject registration in group
studies do not work very well when only a portion of the brain is imaged, because there are fewer functional landmarks for registration. High-resolution studies are promising when a small set of subcortical nuclei or nearby cortical regions are of primary interest.
Acquisition Artifacts
Artifactual activations (patterns of apparent activation arising from nonneural sources) and image distortions may arise from several sources, some unexpected. An early study found a prominent PET activation related to anticipation of a painful electric shock in the temporal pole (Reiman, Fusselman, Fox, & Raichle, 1989). However, it was discovered some time later that this temporal activation was actually located in the jaw: the subjects were clenching their teeth in anticipation of the shock. Important types of artifact include those related to magnetic susceptibility, reconstruction, head movement, heartbeat and breathing, instability in magnetic gradients used to acquire images, and radio-frequency interference from outside sources. Many of these artifacts apply only to or are more pronounced with fMRI; we provide more details on dealing with artifacts in analysis in the section Data Analysis: Implementation.
Susceptibility artifacts in fMRI occur because magnetic gradients near air and fluid sinuses and at the edges of the brain cause local inhomogeneities in the magnetic field that affect the signal, causing distortion in echo-planar imaging (EPI) sequences and blurring and dropout in spiral sequences. These problems increase at higher field strengths and provide a significant barrier in performing effective high-field fMRI studies. Not all scanner/sequence combinations can reliably detect BOLD activity near these sinuses, which affects regions including the orbitofrontal cortex, inferior temporal cortex, hypothalamus, and amygdala.
Signal may be recovered by using optimized sequences such as “z-shimming” (Constable & Spencer, 1999) or spiral in/out sequences (Glover & Law, 2001) or using a physical magnetic shim held in the mouth of the participant (Wilson & Jezzard, 2003). Signal loss and distortion may be further minimized by using improved reconstruction algorithms (Noll, Fessler, & Sutton, 2005) and “unwarping” algorithms that measure and attempt to correct EPI distortion (Andersson, Hutton, Ashburner, Turner, & Friston, 2001). Functional MRI also contains more sources of signal variation due to noise than does PET, including a substantial slow drift of the signal across time and higher frequency changes in the signal due to physiological processes accompanying heart rate and respiration. The low-frequency noise component in fMRI can obscure results related to a psychological process of interest and it can
produce false positive results, so it is usually removed statistically prior to analysis. A consequence of slow drift is that it is often impractical to use fMRI for designs in which a process of interest only happens once or unfolds slowly over time, such as drug highs or the experience of strong emotions, though some experimental/analysis approaches have been developed to facilitate such studies (Lindquist & Wager, 2007; Lindquist, Waugh, & Wager, 2007). The vast majority of fMRI designs use discrete events that can be repeated many times over the course of the experiment (e.g., the most common method for studying emotion in fMRI is to repeatedly present pictures with emotional content).
Temporal Resolution and Trial Structure
Another important limitation of scanning with PET and fMRI is the temporal resolution of data acquisition. The details of this limitation are discussed in subsequent sections, but it is important to note here that PET and fMRI measure very different things, over different time scales. Because PET computes the amount of radioactivity emitted from a brain region, at least 30 seconds of scanning must pass before a sufficient sample of radioactive counts is collected. This limits the temporal resolution to blocks of time of at least 30 seconds, well longer than the temporal resolution of most cognitive processes. For glucose imaging (FDG) and receptor mapping using radiolabeled ligands, the period of data collection for a single condition is much longer, on the order of 30 to 40 minutes. Functional MRI has its own temporal limitation due largely to the latency and duration of the hemodynamic response to a neural event. Typically, changes in blood flow do not reach their peak until several seconds after local neuronal and metabolic activity has occurred. Thus, the locking of neural events to the vascular response is not very tight.
Because of this limitation, a promising current direction is the estimation of the onset and peak latency of fMRI responses, and other parameters, averaged over many trials (Lindquist & Wager, 2007; R. S. Menon, Luknowsky, & Gati, 1998; Miezin, Maccotta, Ollinger, Petersen, & Buckner, 2000). We provide a more thorough discussion of this and related issues later in this chapter.
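The delay between a neural event and the peak of the vascular response can be illustrated with a model hemodynamic response function. The sketch below uses a difference of two gamma densities, a shape widely used in fMRI analysis; the specific shape parameters (6 and 16) and the undershoot ratio (1/6) are assumed values chosen for illustration and are not taken from this chapter.

```python
import math


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive times."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (
        math.gamma(shape) * scale ** shape
    )


def hrf(t, peak_shape=6.0, undershoot_shape=16.0, undershoot_ratio=1.0 / 6.0):
    """Illustrative double-gamma response to a brief neural event at t = 0."""
    return gamma_pdf(t, peak_shape) - undershoot_ratio * gamma_pdf(t, undershoot_shape)


dt = 0.1                                   # time step in seconds
times = [i * dt for i in range(301)]       # 0 to 30 seconds
response = [hrf(t) for t in times]

peak_time = times[response.index(max(response))]
trough_time = times[response.index(min(response))]

print(f"modeled response peaks about {peak_time:.1f} s after the event")
print(f"post-stimulus undershoot is deepest near {trough_time:.1f} s")
```

Estimating onset and peak latency from real data amounts to fitting parameters like these to responses averaged over many trials, rather than fixing them in advance as is done here.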
FROM DATA TO PSYCHOLOGICAL INFERENCE
Goals of Data Analysis: Prediction and Inference
A fundamental question in neuroimaging research is what the researcher hopes to achieve with the chosen method. Successful research requires a solid grasp of
what kinds of imaging results constitute evidence for a psychological or physiological theory, and a grounded understanding of what kinds of results are likely to be obtainable. There are several potential inferential goals in neuroimaging studies. One goal is prediction of a psychological or disease state using neuroimaging data, which can be accomplished using regression or classification techniques (Norman, Polyn, Detre, & Haxby, 2006). More often, the psychologist would like to infer something about the structure of mental processes from imaging data. Making inferences about psychological states has been termed reverse inference, because it involves estimating the relative probabilities of different psychological hypotheses given the data, whereas what is observed in imaging studies is the probability of the data given a psychological state. Chapter 1 of this Handbook (Cacioppo & Berntson, in press) deals extensively with psychological inference from physiological data. In addition, several excellent papers review this issue in brain imaging (Poldrack, 2006; Sarter, Berntson, & Cacioppo, 1996) and physiological data generally (Cacioppo & Tassinary, 1990). Though we do not recapitulate this discussion here, we note that making psychological inferences based on activation in single brain regions is particularly problematic. Researchers have inferred that romantic love and retribution involve “reward system” activation because these conditions activate the caudate nucleus (Aron et al., 2005; de Quervain et al., 2004), that social rejection is like physical pain because it activates the anterior cingulate (Eisenberger, Lieberman, & Williams, 2003), among countless similar conclusions in the literature. 
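The gap between the probability of the data given a psychological state and the probability of the state given the data can be made concrete with Bayes' rule. The probabilities below are invented for illustration; the function name and the two-hypothesis setup are our simplifying assumptions.

```python
def reverse_inference(p_act_given_state, p_act_given_other, prior_state):
    """Posterior probability of a psychological state given observed activation.

    Bayes' rule with two alternatives: the state of interest versus everything else.
    """
    evidence = (p_act_given_state * prior_state
                + p_act_given_other * (1.0 - prior_state))
    return p_act_given_state * prior_state / evidence


# A region that activates in 80% of studies of the state of interest.
# Case 1: the region rarely activates otherwise (selective).
selective = reverse_inference(0.8, 0.1, prior_state=0.5)
# Case 2: the region activates in most other tasks too (nonselective).
nonselective = reverse_inference(0.8, 0.7, prior_state=0.5)

print(f"posterior if region is selective:    {selective:.3f}")
print(f"posterior if region is nonselective: {nonselective:.3f}")
```

When a region responds in many different tasks, observing its activation moves the posterior only slightly above the prior, even though the probability of activation given the state is high in both cases.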
These inferences are problematic because both these regions are involved in a wide range of tasks, including shifting of attention, working memory, and inhibition of simple motor responses, so their activation is not indicative of any particular psychological state (Bush, Luu, & Posner, 2000; Kastner & Ungerleider, 2000; Paus, 2001; Van Snellenberg & Wager, in press; T. D. Wager, Jonides, & Reading, 2004; T. D. Wager, Jonides, Smith, & Nichols, 2005). There are other types of reverse inference that are less specific about the localization of psychological functions in the brain but are more defensible. These inferences fall into two major categories: those based on dissociations in activation among tasks, and those based on activation overlap across tasks. Both types involve studies that test two or more tasks in the same experiment. Dissociation occurs when a brain region is more active in Task A than in Task B. A double dissociation occurs when each task activates one region more than the other task. Double dissociations are a powerful tool because they imply that the two tasks utilize different processes, and that one task is not a subset of the other. A recent study in our laboratory illustrates this approach. We found that different types of task switching, or switching attention from one feature or object to another, differentially activate a set of regions thought to be involved in the control of attention (T. D. Wager, Jonides, Smith, & Nichols, 2005). Four types of switches were dissociable (each produced higher brain activity in some regions than the others), paralleling behavioral findings that performance switch costs are more highly correlated for similar types of switches (see Figure 9.4). The implication from this converging evidence is that different types of attention switching involve unique processes. Though double dissociations are potentially powerful, they have been criticized on several counts. For one thing, nonlinear relationships between task demands and activation can produce a double dissociation even if there are no processes unique to each task. Sternberg (2001) has proposed a stronger criterion for task separability called “separate modifiability,” which entails finding outcomes that are affected by each task but not the other task.
Figure 9.4 (Figure C.1 in color section) Axial slices showing brain regions responsive to different types of switching and their overlap. Slices shown at z = 10, 16, 20, 26, 30, 36, 40, 46, 50, and 56. Legend: Common activation; Internal > External; External > Internal; Object > Attribute; Attribute > Object; I > E and O > A.
Note: All voxels identified show significant switch costs in at least two switch-no switch contrasts (p < .05 Family-wise error rate corrected in each). Thus, many regions not shown here may also show brain switch costs at less stringent thresholds. Regions colored in red are common activations that show no significant differences among costs for different types of switch (at p < .05 uncorrected). Other regions show evidence for greater activation in some switch types than others, as indicated in the legend. A = Attribute switch types; E = External; I = Internal; O = Object. From “Accounting for Nonlinear BOLD Effects in fMRI: Parameter Estimates and a Model for Prediction in Rapid Event-Related Studies,” by T. D. Wager, A. Vazquez, L. Hernandez, and D. C. Noll, 2005, NeuroImage, 25, pp. 206–218.
A second type of psychological inference is based on the overlap in activation among tasks, which is often taken as evidence that the tasks share common processes (Sylvester et al., 2003). In the task switching study shown in Figure 9.4, even though there were quantitative dissociations in activation magnitude, all regions responded to at least two types of task switch, and some responded to all four. This implies at least some common processes across switch types, paralleled by significant performance correlations across types (T. D. Wager, Jonides, & Smith, 2006). Though the logic that activation overlap equals process overlap is commonly used, it provides weak support for shared neuronal processes: A single voxel in a neuroimaging study typically contains on the order of one million neurons, and it is entirely possible that different subsets of neurons in the same voxel are activated by different tasks. Paton, Belova, Morrison, and Salzman (2006) found different cells in the monkey amygdala that respond to either positive or negative predictions about upcoming rewards within the volume of a single neuroimaging voxel. Wang, Tanaka, and Tanifuji (1996), using optical imaging, found topographical maps of perceived head orientation in areas of temporal cortex that spanned only about 1 mm of cortex.
fMRI-Adaptation Designs
These issues have led to another method for assessing the use of common neural substrates across tasks. This method relies on repetition-suppression effects, or adaptation of fMRI responses to repeated events. It is possible to take advantage of this effect to tell whether two stimulus types (A and B) activate the same or different populations of neurons within a voxel (Grill-Spector & Malach, 2001). If a stimulus of type A is presented, then subsequent presentations of A will result in reduced signal (adaptation).
The logic is that other stimuli, say of type B, that engage the same set of neurons will also evoke a reduced signal ([B_alone − B_afterA], cross-adaptation), whereas those that engage different neurons (even within the same voxel) will evoke a larger signal. Thus, small cross-adaptation effects may provide evidence that B engages different populations of neurons, whereas large cross-adaptation effects may be evidence that the circuitry for B and A overlap. However, caution in interpretation is in order, because habituation ([B_alone − B_afterA]) can be caused by the mechanical properties of the vascular bed (Vazquez et al., 2006; T. D. Wager, Vazquez, et al., 2005) rather than by a neuronal habituation process. In fact, A presented immediately after B is always likely to produce a reduced response compared with A alone because of the time it takes the vessels to regain their original shape after a BOLD response. This complicates the inference that similar adaptation and cross-adaptation imply overlapping neuronal populations. Another issue is that a recent
electrophysiological study designed to test the validity of this paradigm reported differential habituation in single-cell recording to two stimuli, even though they both activated the same neuron (Sawamura, Orban, & Vogels, 2006). This finding challenges the inference that different adaptation and cross-adaptation effects imply different populations of neurons. Finally, though interpretations of fMRI-adaptation effects are often cast in terms of neuronal firing, more global processes related to memory may play an important role as well (Henson, 2003).
The Hierarchical Structure of Neuroimaging Data
Whichever type of inference is desired, inference is based on data, usually from multiple individuals. This section describes the structure of neuroimaging data, and the following sections describe some conceptual essentials of the steps that lead to psychological inference: valid group analysis, thresholding techniques, and localization of activated regions. Proper analysis of multisubject data in each voxel yields a statistical parametric map (SPM) of the reliability of contrast values: images that contain test statistic values (e.g., t-values) and p-values for the group analysis at each voxel. These statistic images are thresholded, with some provision for correcting for multiple comparisons across the many brain voxels tested, to obtain maps of suprathreshold or activated regions. Activated regions are localized relative to standard brain landmarks, often with the aid of brain atlases and norms, and interpreted in the context of other human and animal literature.
Imaging data typically involve repeated observations over time; in fMRI, as many as two thousand brain images can be collected in the course of a single imaging session for each participant. These images are nested within task conditions (e.g., tasks A and B, or “switch attention” [for a particular switch type] and “do not switch,” in our example study).
Task conditions, in turn, are crossed with participant, meaning that they are assessed for each participant. Participants may be additionally nested within groups (e.g., patients versus controls, young versus elderly). Most often, a statistical model is specified for each participant that estimates the average response to each task condition of interest. Responses to different task conditions are compared by calculating contrasts across two or more conditions. Those measures are called contrast values, and they usually reflect a comparison of the activity levels between task conditions of interest (e.g., A minus B, “switch attention” minus “no switch”) that yields a single number for each participant. Contrast values for each voxel yield contrast images, three-dimensional maps of activation difference values for each participant. T-tests or comparable analyses can be performed for each voxel to discover where in the
brain the difference is reliable. More detail on contrasts is provided in the section Data Analysis: Implementation. Analyzing contrast values has been referred to as the “subtraction method,” the logic of which is this: If you test two experimental conditions that differ by only one process, then a subtraction of the activations of one condition from those of the other should reveal the brain regions associated with the target process. Subtraction logic rests on a critical assumption that has been called the assumption of “pure insertion” (Sternberg, 1969). According to this assumption, changing one process does not change the way other processes are performed. Thus, by this assumption, the process of interest may be purely inserted into the sequence of operations without altering any other processes. Violations of subtraction logic have been demonstrated (Zarahn, Aguirre, & D’Esposito, 1997), and evoked activation depends on baseline cerebral blood flow in an area and other factors (Vazquez et al., 2006). However, subtractions remain widely used because comparisons among relative activity levels are central to the inference-making process. The assumption of pure insertion underlies the inference that more observed activity implies more intense neural and metabolic processes. In defense of the subtraction method, pure insertion need not be quantitatively or strictly true in all cases to yield useful comparisons across conditions. The contrast method applies to many comparisons other than the simple Task A – Task B subtraction, including incremental variations in task difficulty and factorial designs. It also applies to brain-performance correlation designs, in which activation contrast values are correlated with performance contrast values. These designs may employ multiple control or comparison conditions to strengthen the case for a relationship between activity in a particular brain region and a psychological process.
They also extend beyond imaging of “activation” to studies that image neurochemical activity and other signals.
Principles of Population Inference
It is usually advantageous to design studies and statistical analyses in a way that permits inferences about a population of participants. Population inference is typical in all kinds of studies (e.g., when testing a new drug, researchers perform statistical tests that allow them to infer that the drug is likely to produce a benefit on average for individuals in a certain population). Even most studies of psychophysics and electrophysiology in monkeys, which often rely on only one or two participants for the entire study, need to be able to claim that their results apply beyond the particular individuals studied. They do so by invoking the additional assumption that all participants will behave the same way
as the few observed in the study. In almost all domains of human neuropsychology, this is not a safe assumption, and statistics should be performed that permit population inference in a standard way. This can be achieved by considering the multilevel nature of neuroimaging data. A key to population inference is to treat the variation across participants as an error term in a group statistical analysis, which leads to generalizability of the results to new participants drawn from the same population. The most popular group analysis is the one-sample t-test on contrast estimates (e.g., Task A – Task B) at each voxel. This analysis tests whether the contrast of interest is nonzero on average for the population from which the sample was drawn, and it provides a starting point for our discussion on population inference. The principle, however, applies to any kind of statistical model, including more complex ANOVA and regression models and multivariate analyses such as group independent components analysis (ICA).
Mixed versus Fixed Effects
The one-sample t-test across contrast values treats the value of that contrast as a random variable with a normal distribution over subjects, and hence the error term in the statistical test is based on the variance across participants. Such an analysis has come to be known as a “random effects” analysis in the neuroimaging literature. Many early studies performed incorrect statistical analyses by lumping data from different participants together into one “supersubject” and analyzing the data using a single statistical model. This is called a “fixed effects” analysis because it treats the participant as a fixed effect, and assumes the only noise is due to measurement error within subjects. It is not appropriate for population inference because it does not account for individual differences.
Collecting 500 images each (250 of Task A and 250 of Task B) on two participants would be treated as the equivalent of collecting two images each (Task A and B) on 500 participants. Some researchers have argued that the fixed effects analysis allows researchers to make inferences about the brains of the participants in the study, but not about a broader population. While this is technically true, inferences about particular individuals are seldom useful; such a lack of generalizability would be unacceptable in virtually any field, and we do not consider it appropriate for neuroimaging studies either.

A more complete analysis is the mixed effects analysis, so termed because it estimates multiple sources of error, including measurement error within subjects and inter-individual differences between subjects. The one-sample t-test on contrast estimates described above is actually a simplified mixed-effects analysis that is valid if the standard errors of contrast estimates are the same for all participants.
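To make the difference between the two analyses concrete, the following minimal Python sketch computes both test statistics for a two-participant case like the one described above. The data are made up for illustration, the pairing of Task A and Task B images into differences is an assumption, and the function names are hypothetical rather than part of any imaging package.

```python
import math
import statistics


def one_sample_t(values):
    """Return (t, degrees of freedom) for a one-sample t-test against zero."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return statistics.fmean(values) / standard_error, n - 1


def random_effects_t(differences_by_participant):
    """Summarize each participant by one contrast estimate (mean A - B), then
    test those estimates; the error term is the variance across participants."""
    contrasts = [statistics.fmean(d) for d in differences_by_participant]
    return one_sample_t(contrasts)


def fixed_effects_t(differences_by_participant):
    """'Supersubject' analysis: pool every image-level difference, ignoring
    which participant each value came from."""
    pooled = [x for d in differences_by_participant for x in d]
    return one_sample_t(pooled)


# Made-up data: 250 A - B differences for each of two participants, with
# mean effects of 1.0 and 0.2 and identical within-participant noise.
noise = [0.1 * ((i % 5) - 2) for i in range(250)]
participants = [[1.0 + e for e in noise], [0.2 + e for e in noise]]

t_random, df_random = random_effects_t(participants)
t_fixed, df_fixed = fixed_effects_t(participants)
print(f"random effects: t({df_random}) = {t_random:.2f}")
print(f"fixed effects:  t({df_fixed}) = {t_fixed:.2f}")
```

In this sketch the pooled analysis reports a very large t statistic on 499 degrees of freedom, whereas the analysis on per-participant contrasts has a single degree of freedom, which reflects how little two participants can say about a population.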
Essentials of Functional Neuroimaging
Full mixed-effects analyses use iterative techniques (such as the Expectation-Maximization [EM] algorithm) to obtain separate estimates of measurement noise and individual differences. They are implemented in popular packages such as Hierarchical Linear Modeling (HLM; Raudenbush & Bryk, 2002), R, and MLwiN (Rasbash, 2002). Neuroimaging data-friendly mixed-effects models are implemented in FSL (Beckmann, Jenkinson, & Smith, 2003; Woolrich, Behrens, Beckmann, Jenkinson, & Smith, 2004) and FMRISTAT (Worsley, Taylor, Tomaiuolo, & Lerch, 2004) software and are potentially implementable in SPM5.

Thresholding and Multiple Comparisons

The results of neuroimaging studies are often summarized as a set of “activated regions,” such as those shown in Figure 9.4. Such summaries describe brain activation by color-coding voxels whose t-values or comparable statistics (z or F) exceed a certain statistical threshold for significance. The implication is that these voxels are activated by the experimental task. A crucial decision is the choice of threshold to use in deciding whether voxels are active. In many fields, test statistics whose p-values are below .05 are considered sufficient evidence to reject the null hypothesis, with an acceptable false positive rate (alpha) of .05. In brain imaging, we often perform on the order of 100,000 hypothesis tests (one for each voxel) at a single time. Hence, using a voxel-wise alpha of .05 means that 5% of the voxels on average will show false positive results when there is no true activation. This implies that we actually expect on the order of 5,000 false positive results. Thus, even if an experiment produces no true activation, there is a good chance that without a more conservative correction for multiple comparisons, the activation map will show numerous activated regions, which would lead to erroneous conclusions.
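The arithmetic behind the figure of 5,000 can be written out as a short Python sketch. The second function adds an assumption that the text does not make, namely that the tests are independent, so its result should be read only as a rough indication of how quickly the chance of at least one false positive approaches certainty.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more false positives, assuming independent tests
    (an assumption added for this illustration)."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_voxels = 100_000
print(expected_false_positives(n_voxels, 0.05))          # about 5,000
print(prob_at_least_one_false_positive(n_voxels, 0.05))  # very close to 1
```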
The traditional way to deal with this problem of multiple comparisons is to adjust the threshold so that the probability of obtaining a false positive is simultaneously controlled for every voxel (statistical test) in the brain. In neuroimaging, a variety of approaches toward controlling the false positive rate are commonly used; we discuss them in detail later. The fundamental difference among these methods is whether they control the familywise error rate (FWER) or the false discovery rate (FDR). The FWER is the probability of obtaining any false positives in the brain, whereas the FDR is the expected proportion of false positives among all rejected tests. To illustrate the difference between FWER and FDR, imagine that we conduct a study on 100,000 brain voxels at alpha = .001 uncorrected, and we find 300 significant voxels. According to theory, we would expect that
100 (or 33%) of our significant discoveries would be false positives, but we cannot tell which ones. Since 33% is a substantial proportion of all active voxels, we may have low confidence that the activated regions are true results. Thus, it may be advantageous to set a threshold that limits the expected proportion of false positives among the significant voxels to 5%. This is referred to as FDR control at the .05 level. In this case, we might argue that most of the results are likely to be true activations; however, we still cannot tell which voxels are truly activated and which are false positives.

FWER control, by contrast, is a more stringent way of controlling false positives. Controlling the FWER at 5% implies that we set a threshold so that, if we were to repeat the previously mentioned experiment 100 times, on average only 5 out of the 100 experiments would result in one or more false positive voxels. Therefore, when controlling the FWER at 5%, we can be fairly certain that all voxels that are deemed active are truly active. The thresholds will typically be quite conservative, leading to problems with false negatives, or truly active voxels that are now deemed inactive. In our example, perhaps only 50 out of the 200 truly active voxels will give significant results. While we can be fairly confident that all 50 are true activations, we have still lost 150 active voxels, most of the true activity, which may distort our inferences and the usefulness of the experiment (see Figure 9.5).

Most published PET and fMRI studies do not use either of these corrections; instead, they use arbitrary uncorrected thresholds, as shown in Figure 9.7, with a modal threshold of p < .001. A likely reason is that with the sample sizes typically available, corrected thresholds are so high that power is extremely low. This is extremely problematic when interpreting conclusions from individual studies, as many of the activated regions may simply be false positives.
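The numbers in the example above follow from two simple ratios, sketched here in Python. The function names are hypothetical, and the expected count of false positives is taken to be the number of tests times alpha, as in the text's example.

```python
def expected_false_discovery_proportion(n_tests, alpha, n_significant):
    """Expected false positives (n_tests * alpha) as a share of all
    significant tests, following the worked example in the text."""
    return (n_tests * alpha) / n_significant


def sensitivity(n_detected_true, n_truly_active):
    """Share of truly active voxels that survive the threshold."""
    return n_detected_true / n_truly_active


fdp = expected_false_discovery_proportion(100_000, 0.001, 300)
print(f"{fdp:.0%}")                   # 33%: 100 of 300 significant voxels
print(f"{sensitivity(50, 200):.0%}")  # 25%: 50 of 200 truly active voxels
```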
Imposing an arbitrary extent threshold for reporting based on the number of contiguous activated voxels does not necessarily correct the problem because imaging data are spatially smooth, and thus corrected thresholds should be reported whenever possible. Figure 9.6 shows the same activation map with spatially correlated noise thresholded at three different P-value levels. Due to the smoothness, the false-positive activation blobs (outside the squares) are contiguous regions of multiple voxels. Because achieving sufficient power is often not possible, it makes sense to report results at an uncorrected threshold and use meta-analysis or a comparable replication strategy to identify consistent results (T. D. Wager, Lindquist, & Kaplan, 2007), with the caveat that uncorrected results from individual studies cannot be strongly interpreted. Ideally, a study would report both corrected results and results at a reasonable uncorrected threshold (e.g., p < .001 and 10 contiguous voxels) for archival purposes.
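Why an extent threshold offers little protection when noise is smooth can be explored with a small one-dimensional simulation. This is only a sketch under stated assumptions: the "image" is a line of 10,000 voxels of pure noise, smoothing is a simple moving average with wrap-around, and both versions are thresholded so that the same fraction of voxels (1%) is declared active. All names are hypothetical.

```python
import random


def moving_average(values, width):
    """Smooth with a boxcar kernel of odd width, wrapping around the ends."""
    n = len(values)
    half = width // 2
    return [sum(values[(i + k) % n] for k in range(-half, half + 1)) / width
            for i in range(n)]


def top_fraction_mask(values, fraction):
    """Mark the highest `fraction` of values as 'active'."""
    n_keep = max(1, int(len(values) * fraction))
    cutoff = sorted(values, reverse=True)[n_keep - 1]
    return [v >= cutoff for v in values]


def longest_run(mask):
    """Length of the longest stretch of contiguous active voxels."""
    best = current = 0
    for active in mask:
        current = current + 1 if active else 0
        best = max(best, current)
    return best


rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
smooth_noise = moving_average(white_noise, 9)

# Every "activation" here is a false positive, because there is no signal.
print("largest cluster, unsmoothed:", longest_run(top_fraction_mask(white_noise, 0.01)))
print("largest cluster, smoothed:  ", longest_run(top_fraction_mask(smooth_noise, 0.01)))
```

With smoothing, false positives tend to form runs of neighboring voxels, so a rule such as "report only clusters of several contiguous voxels" does not by itself rule them out.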
From Data to Psychological Inference
[Figure 9.5 panels. Top: α = 0.10, no correction (proportion of false positives listed under each image). Middle: FWER control at 10% (occurrence of a false positive). Bottom: FDR control at 10% (proportion of active voxels that are false positives listed under each image).]

Figure 9.5 An overview of the effects of various approaches toward dealing with multiple comparisons. Note: (Top) Ten simulated t-maps were analyzed using an uncorrected threshold p < .10. True positives are indicated by white regions inside the gray squares. False positives are white pixels outside of the gray square. The proportion of false positives is listed under each image. They average 10%, as expected. (Middle) The same images with the threshold designed to control the familywise error rate (FWER) at 10% using Bonferroni correction. There is only one false positive in the 10 images, at the cost of a significant increase in the number of false negatives. (Bottom) Similar results obtained using an FDR controlling procedure at the 10% level. The proportion of active voxels that are false positives is listed under each image. They average 10% as expected.

[Figure 9.6 panels: α = 0.10, α = 0.01, α = 0.001.]

Figure 9.6 Multiple comparisons in the presence of spatially correlated noise. This figure shows simulated t-maps as in Figure 9.5, but with spatially “smooth” noise as is typical in actual fMRI experiments. In this case, imposing an arbitrary “extent threshold” based on the number of contiguous activated voxels does not necessarily solve the problem of false positives. The same activation map, with spatially correlated noise, is thresholded at three different P-value levels. Due to the smoothness, the false-positive activation blobs (outside of the squares) are contiguous regions of multiple voxels, which can easily be misinterpreted as regions of activity.
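The two kinds of correction contrasted in Figure 9.5 can be sketched in a few lines of Python. The FDR function assumes the step-up procedure of Benjamini and Hochberg (1995); the p-values are made up, and the function names are hypothetical rather than taken from any package.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a test only if its p-value is at most alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, q=0.05):
    """Step-up FDR procedure: find the largest rank k such that the k-th
    smallest p-value is at most (k / m) * q, then reject ranks 1 through k."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            largest_rank = rank
    reject = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            reject[index] = True
    return reject


# Made-up p-values for ten tests.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print(sum(bonferroni_reject(p)))          # 1 test rejected
print(sum(benjamini_hochberg_reject(p)))  # 2 tests rejected
```

Because the FDR procedure works only on p-values, the same function can be applied to the output of any valid test; the price of the extra rejections is that some fraction of them is expected to be false.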
Familywise Error Rate Correction

The simplest way of controlling the FWER is to use Bonferroni correction. Here the alpha value is divided by the total number of statistical tests performed (voxels). If there is spatial dependence in the data (which is almost always the case, because the natural resolution and applied smoothing both lead to spatial smoothness in imaging data), this is an unnecessarily conservative correction that leads to a decrease in power to detect truly active voxels. Gaussian Random Field Theory (RFT; Worsley et al., 2004), used in SPM, FMRISTAT, and BRAINSTAT software (Taylor & Worsley, 2006), is another (more theoretically complicated) approach toward controlling the FWER. If the image is smooth and the number of subjects is relatively high (around 20), RFT is less conservative and provides control closer to the true false positive rate
than the Bonferroni method. With small samples, RFT is often more conservative than the Bonferroni method. It is acceptable to use the more lenient of the two, which is what SPM currently does, because both control the FWER. In addition, RFT is used to assess the probability that k contiguous voxels exceed the threshold under the null hypothesis, leading to a “cluster-level” correction. Nichols and Hayasaka (2003) provide an excellent review of FWER correction methods, and they find that while RFT is overly conservative at the voxel level, it is somewhat liberal at the cluster level with small sample sizes.

Both methods previously described for controlling the FWER assume that the error values are normally distributed, and that the variance of the errors is equal across all values of the predictors. As an alternative, nonparametric methods instead use the data to find the appropriate distribution. Using such methods can provide substantial improvements in power and validity, particularly with small sample sizes, and we regard them as the gold standard for use in imaging analyses. These tests can also be used to verify the validity of the less computationally expensive parametric approaches. A popular package for doing nonparametric tests in group analyses, “Statistical Non-Parametric Mapping” (SnPM; Nichols & Holmes, 2002), is based on permutation tests.

[Figure 9.7 panels. Left: “Long-term Memory,” a midline sagittal slice. Right: “P-value thresholds used,” with number of maps on the vertical axis and thresholds of 0.0001, 0.0005, 0.001, 0.005, 0.01, and 0.05 (uncorrected) and 0.05 (corrected) on the horizontal axis.]

Figure 9.7 Common thresholds used in neuroimaging experiments. Note: A midline sagittal slice (left) shows the peak activations reported in 195 separate studies of long-term memory. The frequencies of P-value thresholds used for all statistical parametric maps in these studies are shown to the right. The most common threshold is P < .001, uncorrected for multiple comparisons. Corr: Corrected threshold. From “Meta-Analysis of Functional Neuroimaging Data: Current and Future Directions,” by T. D. Wager, M. Lindquist, and L. Kaplan, 2007, Social, Cognitive, and Affective Neuroscience, 2, pp. 150–158.

False Discovery Rate Control

The false discovery rate (FDR) is an approach to the multiple comparisons problem developed by Benjamini and Hochberg (1995). While the FWER controls the probability of any false positives, the FDR controls the expected proportion of false positives among all rejected tests. The FDR controlling procedure is adaptive in the sense that the larger the signal, the lower the threshold. If all of the null hypotheses are true, the FDR will be equivalent to the FWER. Any procedure that controls the FWER will also control the FDR. Hence, any procedure that controls the FDR can only be
less stringent and lead to increased power. A major advantage is that since FDR controlling procedures work only on the p-values and not on the actual test statistics, they can be applied to any valid statistical test.

Regions of Interest Analysis

Because of the difficulty in preserving both false positive control and power in experiments with few subjects, researchers often specify regions of interest (ROIs) in which activation is expected before the study is conducted. ROI analyses are conducted variously over the average signal within a region, the peak activation voxel within a region, or preferably on individually defined anatomical or functional ROIs. Another technique involves testing every voxel within an ROI (e.g., the amygdala) and correcting for the number of voxels in the search volume. This is often referred to as a “small volume correction.”

Two important cautions must be mentioned. First, conducting multiple ROI analyses increases the false positive rate. While it may be philosophically sound to independently test a small number of areas in which activation is expected, testing many such regions violates the spirit of a priori ROI specification and leads to an increased false positive rate. Small volume corrections in multiple ROIs also do not preserve the false positive rate across ROIs. Second, although activated regions can be used as ROIs for subsequent tests, the test used to define the region must be independent of the test conducted in that region. Acceptable examples include defining a region based on a main effect and then testing to see whether activity in that region is correlated with performance, or using the main effect of (A + B) to define a region and then testing for a difference (A – B). Problematic examples are defining a region activating in older subjects and then testing to see if its activity is reduced in younger subjects or defining a region based on activity in the first run of an experiment
and then testing whether it shows less activity in subsequent runs. Neither of these is a valid test, because neither controls for regression to the mean.

Functional Localization and Atlases

Accurately identifying the anatomical locations of activated regions is critical to making inferences about the meaning of brain imaging data. Knowing where activated areas lie permits comparisons with animal and human lesion and electrophysiology studies. It is also critical for accumulating knowledge across many neuroimaging studies. Localization is challenging for several reasons: First among them is the problem of variety; each brain is different, and it is not always possible to identify the same piece of brain tissue across different individuals (Thompson, Schwartz, Lin, Khan, & Toga, 1996; Vogt, Nimchinsky, Vogt, & Hof, 1995). Likewise, names for the same structures vary: The same section of the inferior frontal gyrus (IFG) can be referred to as IFG, inferior frontal convexity, Brodmann’s Area 47, ventrolateral prefrontal cortex, the pars orbitalis, or simply the lateral frontal cortex. Standard anatomical atlas brains differ as well, as do the algorithms used to match brains to these atlases.

There is currently a wide and expanding array of available tools for localization and analysis. A database of tools is available from the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC; Table 9.3), and another useful list can be found at www.cma.mgh.harvard.edu/iatr/. The most accurate way to localize brain activity is to overlay functional activations on a co-registered, high-resolution individual anatomical image. Many groups avoid issues of variability by defining anatomical regions of interest (ROIs) within individual participants and testing averaged activity in each ROI.
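Averaging activity within individually defined ROIs amounts to applying each participant's own mask to that participant's image. The Python sketch below uses made-up contrast values flattened to one dimension and hypothetical names; real images are three-dimensional and would be read with an imaging library.

```python
def roi_mean(image, roi_mask):
    """Average the voxel values that fall inside a participant's own ROI."""
    inside = [value for value, in_roi in zip(image, roi_mask) if in_roi]
    if not inside:
        raise ValueError("ROI mask selects no voxels")
    return sum(inside) / len(inside)


# Made-up contrast images and individually defined masks for two participants.
images = {
    "participant_1": [0.2, 1.4, 1.0, -0.1],
    "participant_2": [0.9, 0.7, 0.1, 0.3],
}
masks = {
    "participant_1": [False, True, True, False],
    "participant_2": [True, True, False, False],
}

roi_averages = {name: roi_mean(images[name], masks[name]) for name in images}
print(roi_averages)
```

The resulting per-participant averages can then be carried into a group analysis, one value per participant and ROI.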
Functional localizers (separate tasks or contrasts designed to locate functional regions in individuals) are also widely used, and functional and structural localizers can be combined to yield individualized ROIs. Structural ROIs are often used in detailed analyses of medial temporal regions in memory research, and retinotopic mapping, a functional localization procedure used to define individual visual-processing regions (V1, V2, V4, etc.), is now standard in research on the visual system (Tootell, Dale, Sereno, & Malach, 1996). However, the vast majority of studies are analyzed using voxel-wise analysis over much of the brain. In most applications, precise locations are difficult to define a priori within individuals, and often many regions as well as their connectivity are of interest. In such cases, atlas-based localization is used. Such localization can be performed using paper-based atlases (Duvernoy, 1995; Haines, 2000; Mai, Assheuer, & Paxinos, 2004), and there
is no substitute for a deep knowledge of neuroanatomy. Automated atlases and digital tools are becoming increasingly integrated with analysis software. Some of the major ones are described next. Early approaches to atlas-based localization were based on the Talairach atlas (Talairach & Tournoux, 1988), a hand-drawn illustration of major structures and Brodmann’s Areas (BAs)—cortical regions demarcated according to their cytoarchitecture by Brodmann in 1909—from the left hemisphere of an elderly French woman. The brain is superimposed on a 3-D Cartesian reference grid whose origin is located at the anterior commissure. This allows brain structures to be identified by their coordinate locations. This stereotactic convention remains a standard today. Peak or center-of-mass coordinates from neuroimaging activations are reported in left to right (x), posterior to anterior (y), and inferior to superior (z) dimensions. Negative values on each dimension indicate locations at left, posterior, and inferior positions, respectively. The Talairach region labels were digitized, and a popular software program, the Talairach Daemon (Lancaster et al., 2000), allows researchers to map neuroimaging results onto Talairach’s labels. In addition, at least two popular software packages, AFNI (Cox, 1996) and BrainVoyager (Brain Innovation, Maastricht, Netherlands), allow researchers to align brains from neuroimaging studies to “Talairach space” using a few key landmarks identified on the brain and on the atlas. The alignment is performed by estimating 12 linear transformation parameters that include translation, rotation, zooms, and shears. 
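A 12-parameter linear alignment of this kind can be written as a single 4 x 4 matrix acting on (x, y, z) coordinates. The Python sketch below builds such a matrix from three translations, three rotations, three zooms, and three shears. The order in which the pieces are multiplied is an assumption made for illustration, since packages differ on this convention, and the function names and example values are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 4 x 4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def affine_from_parameters(translation=(0, 0, 0), rotation=(0, 0, 0),
                           zoom=(1, 1, 1), shear=(0, 0, 0)):
    """Build a 4 x 4 affine matrix from 12 parameters (rotations in radians).

    Assumed order: translation * rot_x * rot_y * rot_z * zoom * shear.
    """
    tx, ty, tz = translation
    rx, ry, rz = rotation
    zx, zy, zz = zoom
    sxy, sxz, syz = shear
    t = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    rot_x = [[1, 0, 0, 0],
             [0, math.cos(rx), -math.sin(rx), 0],
             [0, math.sin(rx), math.cos(rx), 0],
             [0, 0, 0, 1]]
    rot_y = [[math.cos(ry), 0, math.sin(ry), 0],
             [0, 1, 0, 0],
             [-math.sin(ry), 0, math.cos(ry), 0],
             [0, 0, 0, 1]]
    rot_z = [[math.cos(rz), -math.sin(rz), 0, 0],
             [math.sin(rz), math.cos(rz), 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]
    z = [[zx, 0, 0, 0], [0, zy, 0, 0], [0, 0, zz, 0], [0, 0, 0, 1]]
    s = [[1, sxy, sxz, 0], [0, 1, syz, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    matrix = t
    for step in (rot_x, rot_y, rot_z, z, s):
        matrix = matmul(matrix, step)
    return matrix


def apply_affine(matrix, xyz):
    """Map an (x, y, z) coordinate through the affine matrix."""
    v = (xyz[0], xyz[1], xyz[2], 1)
    return tuple(sum(matrix[i][j] * v[j] for j in range(4)) for i in range(3))


# Hypothetical example: shift and rescale a left-hemisphere coordinate.
m = affine_from_parameters(translation=(2, -4, 1), zoom=(1.1, 1.0, 0.9))
print(apply_affine(m, (-30, 20, 10)))
```

Estimating the 12 parameters from landmarks or image intensities is a separate optimization problem that this sketch does not attempt.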
Because the Talairach brain is not representative of any population and is not complete (only the left hemisphere was studied, and no histology was performed to accurately map BAs), Talairach coordinates and their corresponding BA labels should not be used (see Brett, Johnsrude, & Owen, 2002; Devlin & Poldrack, 2007, for discussion), as better alternatives are now available. Modern digital atlases based on group-averaged anatomy have largely replaced the Talairach brain. A current standard in the field is the Montreal Neurologic Institute’s (MNI’s) 305-brain average¹ (Collins, Neelin, Peters, & Evans, 1994), shown in Figure 9.8A, which is the standard reference brain for two of the most popular software packages, SPM and FSL (Smith et al., 2004), and for the International Consortium for Brain Mapping project.

¹ Called avg305T1 in SPM software. A higher-resolution template in the same space, called the ICBM-152 and named avg152T1 in SPM, is also available. It was created from the average of the 152 most prototypical images in the 305-brain set.
Table 9.3 Current web sites for key resources.

Software Registries
  Neuroimaging Informatics Tools and Resources    www.nitrc.org
  Internet Analysis Tools Registry                www.cma.mgh.harvard.edu/iatr/

Software Packages
  SPM                                             www.fil.ion.ucl.ac.uk/spm/software/
  FSL                                             www.fmrib.ox.ac.uk/fsl/
  AFNI                                            http://afni.nimh.nih.gov/
  BrainVoyager                                    www.brainvoyager.com
  FMRISTAT                                        www.math.mcgill.ca/keith/fmristat/
  VoxBo                                           www.voxbo.org
  FIASCO                                          www.stat.cmu.edu/~fiasco/index.php?ref=FIASCO_home.shtml

Analysis Toolboxes
  SnPM, nonparametric analysis                    www.sph.umich.edu/ni-stat/SnPM/
  SPMd, image diagnostics                         www.sph.umich.edu/ni-stat/SPMd/
  Robust regression toolbox                       www.columbia.edu/cu/psychology/tor/software.htm
  Mediation analysis toolbox                      www.columbia.edu/cu/psychology/tor/software.htm
  GIFT (Group ICA)                                http://icatb.sourceforge.net
  MVPA toolbox: Classification                    www.csbmb.princeton.edu/mvpa/
  Netlab: Pattern classification                  www.ncrg.aston.ac.uk/netlab/
  Inverse logit HRF model                         www.columbia.edu/cu/psychology/tor/software.htm

Atlases and Databases
  BrainMap                                        http://brainmap.org
  ICBM                                            http://www.loni.ucla.edu/ICBM/
  SUMS DB                                         http://sumsdb.wustl.edu:8081/sums/index.jsp
  SPM Anatomy Toolbox                             www.fz-juelich.de/inb/inb-3//spm_anatomy_toolbox
  Wager lab meta-analyses                         www.columbia.edu/cu/psychology/tor/MetaAnalysis.htm

Surface-Based Normalization/Warping
  FreeSurfer                                      http://surfer.nmr.mgh.harvard.edu
  Caret/SureFit                                   http://brainmap.wustl.edu/caret/

Design Optimization
  Genetic Algorithm for fMRI                      www.columbia.edu/cu/psychology/tor/software.htm
  M-sequence toolbox                              http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=3083

Digital atlases, including the MNI-305 template (not the Talairach template), permit fine-grained nonlinear warping of brain images to the template and can (if data quality is adequate) match the locations of gyri, sulci, and other local features across brains. A popular approach implemented in SPM software is intensity-based normalization. In this process, intensity values in a brain image are matched to a reference atlas image (template) by deforming the brain image in linear or nonlinear ways and using search algorithms to find the deformations that yield the best match. A preferred intensity-based method is the “unified segmentation
and normalization” algorithm in SPM5 (Ashburner & Friston, 2005). A recent and promising alternative to intensity-based approaches is surface-based normalization, in which brain surfaces are reconstructed from segmented gray-matter maps and inflated to a spherical shape or flattened (reviewed in Van Essen & Dierker, 2007). Features (e.g., gyri and sulci) are identified on structurally simpler 2-D or spherical brains, and the inflated brain is warped to an average spherical atlas brain. This approach has yielded better matches across individuals in comparison studies
(Fischl, Sereno, Tootell, & Dale, 1999; Van Essen & Dierker, 2007). Several free packages implement surface-based normalization to templates registered to MNI space, including FreeSurfer (Table 9.3), Caret/SureFit software (Van Essen et al., 2001), and BrainVoyager. AFNI, using SUMA software (Saad, Reynolds, Argall, Japee, & Cox, 2004), and FSL have facilities for viewing and analyzing surface-based data with FreeSurfer and SureFit. Surface-based add-ons in these packages permit surface-based registration to be performed after gross registration to the Talairach landmarks.

[Figure 9.8 panels: (A) MNI 305 Average; (B) ICBM LBPA40 (SPM5); (C) SPM Anatomy Toolbox; (D) Wager lab N18 (SPM5); (E) ICBM MNI452 (Poly); (F) surface renderings labeled Anterior, Posterior, Medial, Lateral, and Insula.]

Figure 9.8 (Figure C.2 in color section) Examples of atlas template images and a group-averaged, normalized structural image from a study. Note: A: The Montreal Neurologic Institute (MNI) 305-brain average as included with Statistical Parametric Mapping (SPM99, SPM2, or SPM5) software. The atlas brain is the same across software versions, though the algorithms for normalizing brains to the template have changed. B: The International Consortium for Brain Mapping (ICBM) LBPA atlas, based on manually labeled regions from the Center for Morphometric Analysis at Harvard University. Each color represents a gross brain structure based on a consensus among 40 individually labeled brains. C: The single-subject T1 brain co-registered with MNI space (the “colin brain,” based on an average of 27 images of one individual), with overlaid consensus regions based on probabilistic cytoarchitecture. The probabilistic maps represented here are available in the SPM Anatomy toolbox, V1.5, and represent data from a series of studies on cytoarchitectural mapping of postmortem brains registered to the single-subject MNI template (Amunts et al., 2005; Eickhoff, Amunts, Mohlberg, & Zilles, 2006); see Table 9.3. Because the underlay brain is only a single brain, it may not be representative of anatomical locations in a study sample (compare the midbrain with that in D) and thus is not an ideal underlay image for localization of new study data. D: A trimmed average of 18 subjects’ T1 images initially warped to the MNI template using SPM and refined using a genetic algorithm based on custom code. This average brain shows good structural definition, indicating good intersubject registration, and is a suitable underlay image for functional activations. E: The ICBM 452-brain MNI average, with 5th-order polynomial warping to the standard space. The structural definition is excellent for the average of many brains, but the space is different from the MNI 305 space (e.g., the brain stem is much more anterior in the 452 brain), illustrating the need to report the specific atlas and procedures used in neuroimaging studies. F: Activations from a task-switching paradigm (yellow; Wager, Jonides, Smith & Nichols, 2005a, 2005) superimposed on results from a meta-analysis of executive working memory (blue; Wager & Smith, 2003). Surface reconstruction was done with Caret software (Table 9.3) and shows a partially inflated left hemisphere (left) and a flattened cortical map of that hemisphere (right). Red and green arrows show the medial frontal gyrus and inferior frontal junction on each rendering.

Because the original BAs were not precisely or rigorously defined in a group, reporting of BAs using the Talairach atlas is not recommended (Devlin & Poldrack, 2007). Modern probabilistic cytoarchitectural atlases are being developed (Amunts, Schleicher, & Zilles, 2007), and some of these are
available digitally either from the researchers or within FSL and SPM (as part of the SPM Anatomy Toolbox; Eickhoff et al., 2005; Figure 9.8B and C). In addition, software packages increasingly provide tools for visualizing activations relative to known functional and structural landmarks. Caret software allows study results to be mapped to a variety of atlases, including atlas brains included with SPM2, SPM99, and the Van Essen lab’s surface-based PALS atlas (see Figure 9.8F). Brain sections, surfaces, and flattened maps can be visualized, and digital overlays include probabilistic maps of visuotopic regions, orbitofrontal regions from a recent anatomical study (Ongur, Ferry, & Price, 2003), structural and functional landmarks, and a database of previous studies and reported peaks. The associated SumsDB database is a repository for study maps and peak coordinates (Table 9.3).
Another way to localize functional activations is to compare them with the results of meta-analyses of other neuroimaging studies. Comparison with meta-analytic results can help identify functional landmarks and provide information on the kinds of tasks that have produced similar activation patterns. Whereas it was typical in early neuroimaging studies to claim consistency with previous studies based on activation in the same gross anatomical regions (e.g., activation of the anterior cingulate cortex), it is now recognized that many such regions are very large, and more precise correspondence is required to establish consistency across studies. Quantitative meta-analyses identify the precise locations that are most consistently activated across studies, and they thus provide excellent functional landmarks. Some meta-analysis maps are available on the SumsDB and BrainMap databases (Table 9.3), and a number are available on the web from individual researchers. Our lab currently has images from meta-analyses available on the web (Table 9.3), and these can be loaded into SPM, FSL, BrainVoyager, Caret, or other packages for visualization.

The variety and heterogeneity of tools that are currently available is both a strength and an obstacle to effective localization. A few guidelines may aid in the process. First, it is preferable to overlay functional activations on an average of the actual anatomical brains from the study sample, after normalization (registration and/or warping) to a chosen template, instead of relying solely on an atlas brain (see Figure 9.8D). Normalization cannot be achieved perfectly in every region, and showing results on the subjects’ actual anatomy is more accurate than assuming the template is a perfect representation. In addition, viewing the average warped brain can be informative about whether the normalization process yielded high co-registration of anatomical landmarks across participants, and can help identify problem areas.
Single-subject atlases should not be taken as precise indicators of activation location in a study sample, and while they make attractive underlay images for activations, they should not be used for this purpose. Second, it is important to remember that atlas brains are different, and different algorithms used with the same atlas produce different results. Therefore, it is important to report which algorithm and which atlas were used. Also, it would be highly misleading to use a probabilistic atlas such as those in the SPM anatomy toolbox if the study brains were normalized to a different template (or with different procedures) than the one used to create the atlas (e.g., the SPM anatomy toolbox should not be used when normalizing to the ICBM-452 atlas; see Figure 9.8E). Regardless of the tools used, identifying functional activations on individual and group-averaged anatomy, collaborating with neuroanatomists when possible, and using print atlases to identify activations relative
to structural landmarks are all essential components of the localization and interpretation process.
EXPERIMENTAL DESIGN FOR NEUROIMAGING EXPERIMENTS

Types of Experimental Designs

Designing a neuroimaging study involves a trade-off between experimental power and the ability to make strong inferences from the results. Some designs, such as the blocked design, typically yield high experimental power, but provide imprecise information about the particular psychological processes that activate a brain region. Event-related designs allow brain activation to be related more precisely to the particular cognitive processes engaged in certain types of trials, but suffer from decreased power. Researchers may also choose to focus intensively on testing one comparison of interest, and maximizing the power to detect this particular effect, or they may test multiple conditions to draw inferences about the generality of a brain region’s involvement in a class of similar psychological processes. In the following subsections, we describe several experimental designs and provide some discussion of the applications for which they are best suited.

Blocked Designs

Because long intervals of time (30 seconds or more) are required to obtain good PET images, the standard experimental design used in PET studies is the blocked design. In this design, different conditions in the experiment are presented as separate blocks of trials. To image a briefly occurring psychological process (e.g., the activation due to attention switching) using a blocked design, you might repeat the process of interest during an experimental block (A) and have the subject rest during a control block (B). The A – B (A minus B) comparison is the most basic type of contrast for this design. The blocked structure of PET designs (and blocked fMRI designs) imposes limitations on the interpretability of results.
While activations related to slowly changing factors such as task-set or general motivation are well captured by blocked designs, such designs are not well suited to imaging the neural responses to individual stimuli. In addition, the A – B contrast does not allow researchers to determine whether a region is activated solely in A, deactivated solely in B, or shows some combination of both effects. Multiple controls and comparison conditions can ameliorate this problem to some degree. The main advantage of a blocked design is that it typically offers increased statistical power to detect a change. Under ideal conditions, blocked designs can be over 6 times
as efficient as randomized event-related designs (T. D. Wager & Nichols, 2003). Generally, theory and simulations designed to assess experimental power in fMRI designs point to a 16 s to 18 s task / 16 to 18 s control alternating-block design as being optimal with respect to statistical power (Liu, 2004; Skudlarski, Constable, & Gore, 1999; T. D. Wager & Nichols, 2003). However, this is not always true as the relative power of a blocked design depends on whether the target mental process is engaged continuously in A and not at all in B, and whether imposing a block structure changes the task.
Event-Related Functional Magnetic Resonance Imaging
Event-related fMRI designs take advantage of the rapid data-acquisition capabilities of fMRI. They provide the ability to estimate the fMRI response evoked by specific stimuli or cognitive events within a trial (Rosen, Buckner, & Dale, 1998). In fMRI, the whole brain can be measured every 2 to 3 seconds (the “TR,” or repetition time of image acquisition), depending on the type of data acquisition and the spatial resolution of the images. The limiting factor in the temporal resolution of fMRI is generally not the speed of data acquisition, but rather the speed of the underlying evoked hemodynamic response to a neural event, referred to as the hemodynamic response function (HRF). A typical HRF begins within a second after neural activity occurs and peaks 5 to 8 seconds after that neural activity has peaked (Aguirre, Zarahn, & D’Esposito, 1998; Friston, Frith, Turner, & Frackowiak, 1995). Figure 9.9 shows the canonical HRF used in SPM software. While event-related designs are attractive because of their flexibility and the information they provide about individual responses, they rely more strongly on assumptions about the time course of both evoked neural activity and the HRF. It is common to assume a near-instantaneous neural response for brief events and a canonical HRF shape to generate linear models for statistical analyses (Figure 9.9; see also Data Analysis: Implementation). The canonical estimates typically come from studies of brief visual and motor events. In practice, the timing and shape of the HRF are known to vary across the brain, within an individual and across individuals (Aguirre et al., 1998; Schacter, Buckner, Koutstaal, Dale, & Rosen, 1997; Summerfield et al., 2006). Part of the variability is due to the underlying configuration of the vascular bed, which may cause differences in the HRF across brain regions in the same task for purely physiological reasons (Vazquez et al., 2006). Another source of variability is differences in the pattern of evoked neural activity in regions performing different functions related to the same task. Blocked designs are less sensitive to the variability of the HRF because they depend on the total activation caused by a train of stimulus events that makes the overall predicted response less sensitive to variations in the shape of responses to individual events. Predicted responses in block designs may still be inaccurate if the HRF model is inaccurate or if the density and time course of neural activity is not appropriately modeled (Price, Veltman, Ashburner, Josephs, & Friston, 1999).
[Figure 9.9 panels: Indicator functions (A-D); Assumed HRF (basis function); Design matrix (X); Image plot of X (columns A-D). Horizontal axes: Time (s).]
Figure 9.9 Construction of an event-related fMRI design matrix with four different event types (A-D), using the canonical SPM HRF.
Note: Indicator functions corresponding to the four event types are convolved with the canonical HRF to create the regressors that make up the design matrix. An image of the design matrix is shown to the far right.
In a single-trial event-related design, events are spaced at least 20 to 30 seconds apart in time. fMRI signal can be observed on single trials if the eliciting stimulus is very strong (Duann et al., 2002), permitting the possibility of fitting models at the level of an individual trial (Rissman, Gazzaley, & D’Esposito, 2004). This promising technique enables the testing of relationships between brain activity and trial-level performance measures such as reaction time and emotion ratings for particular stimuli (Phan et al., 2004). Early studies frequently employed selective averaging of activity following onsets of a particular type (Aguirre, Singh, & D’Esposito, 1999; Buckner et al., 1998; Menon et al., 1998). However, even brief events (e.g., a 125 ms visual checkerboard display) have been shown to affect fMRI signal more than 30 s later (T. D. Wager, Vazquez, et al., 2005). Because the selective averaging procedure does not take the stimulus history into account, it must be used with caution when responses to different events overlap in time. Because of this, the majority of analyses, including those that estimate the shapes of HRFs, are currently done within the GLM framework (see Data Analysis: Implementation). Reports that the fMRI BOLD response is linear with respect to stimulus history (Boynton, Engel, Glover, & Heeger, 1996) encouraged the use of more rapidly paced trials (Zarahn et al., 1997), spaced less than 1 s apart in the most extreme cases (Burock, Buckner, Woldorff, Rosen, & Dale, 1998; Dale & Buckner, 1997). Here linearity implies that the magnitude and shape of the HRF does not change depending on the preceding stimuli. Studies have found that nonlinear effects in rapid sequences (1 or 2 s) can be quite large (Birn, Saad, & Bandettini, 2001; Friston, Mechelli, Turner, & Price, 2000; Vazquez & Noll, 1998; T. D. Wager, Vazquez, et al., 2005), but that responses are roughly linear if events are spaced at least 4 s to 5 s apart (Miezin et al., 2000). If rapid designs are properly designed, they still allow us to discriminate the effects of different conditions. Key is incorporating “jitter,” or a variable interstimulus interval (ISI) between events, which is critical for comparing event-related responses with an implicit resting baseline to determine whether the events are “activations” or “deactivations” relative to rest. With a randomized and jittered design, sometimes several trials of a single type will occur in a row, and because the hemodynamic response to closely spaced events sums in a roughly linear fashion, the expected response to that trial-type will build to a high peak. Introducing jitter allows peaks and valleys in activation to develop that are specific to particular experimental conditions.
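As a rough numerical companion to Figure 9.9 and to the point about closely spaced events, the following Python sketch convolves event indicator functions with a double-gamma HRF to form design-matrix columns. It is a minimal illustration under stated assumptions, not the SPM implementation: the function names (`canonical_hrf`, `make_regressor`), the gamma shape values (6 and 16), the 1/6 undershoot ratio, the TR of 2 s, and the event onsets are all chosen here for illustration.

```python
import math


def gamma_pdf(x, shape):
    """Gamma density with unit scale, evaluated at x > 0."""
    return x ** (shape - 1) * math.exp(-x) / math.gamma(shape)


def canonical_hrf(t):
    """Double-gamma HRF: an early positive response minus a late undershoot.

    The shape values (6 and 16) and the 1/6 undershoot ratio are assumed here.
    """
    if t <= 0:
        return 0.0
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def make_regressor(onsets_s, n_scans, tr=2.0, hrf_length_s=32.0):
    """Convolve an event indicator function with the HRF, one sample per TR."""
    indicator = [0.0] * n_scans
    for onset in onsets_s:
        indicator[int(round(onset / tr))] += 1.0
    kernel = [canonical_hrf(k * tr) for k in range(int(hrf_length_s / tr) + 1)]
    return [
        sum(kernel[k] * indicator[i - k] for k in range(min(i, len(kernel) - 1) + 1))
        for i in range(n_scans)
    ]


n_scans = 100  # 200 s of scanning at the assumed TR of 2 s
single = make_regressor([20.0], n_scans)              # one isolated event
train = make_regressor([20.0, 22.0, 24.0], n_scans)   # three events 2 s apart

# Design matrix X: one column per event type plus a constant (baseline) column.
# The onsets are hypothetical and irregularly spaced (jittered).
onsets = {"A": [10.0, 16.0, 40.0, 84.0], "B": [24.0, 52.0, 58.0, 70.0]}
columns = [make_regressor(onsets[name], n_scans) for name in ("A", "B")]
X = [[col[i] for col in columns] + [1.0] for i in range(n_scans)]

print(round(max(single), 3), round(max(train), 3))
```

Under the linearity assumption built into the convolution, the three-event train reaches a higher peak than the single event, which is the build-up described above for runs of trials of one type.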
If we care only about comparing event types (e.g., A – B), randomizing the order of events creates optimal rise and fall without additionally jittering the ISI. However, jittered ISIs are critical for comparing events to baseline activity and thus determining whether events activate or deactivate a voxel relative to that baseline (Josephs & Henson, 1999; Wager & Nichols, 2003). Suppose you have a rapid sequence with two types of trial—say, attention switch trials (S) and no-switch trials (N) as in the task-switching experiment described earlier (Figure 9.4). Randomly intermixing the trials with an ISI of 2 s will allow you to estimate the difference S – N. However, you cannot tell whether S and N activate or deactivate relative to some other baseline. If you vary the interstimulus intervals randomly between 2 and 16 s, you can compare S – N (with less power because there are fewer trials), but you also can test whether S and N show positive or negative activation responses. This ability comes from the inclusion of
intertrial rest intervals against which to compare S and N, and the relatively unique signature of predicted responses to both S and N afforded by the random variation in ISIs. The advantages of rapid pacing, including faster trials and sometimes increased statistical efficiency, must be weighed against potential problems with nonlinearity, multicollinearity, and model misfitting. A current popular choice is to use jittered designs with interstimulus intervals of at least 4 s, with exponentially decreasing frequencies of delays up to 16 s.
Optimized Experimental Designs
What constitutes an optimal experimental design depends on the psychological nature of the task as well as on the ability of the fMRI signal to track changes introduced by the task manipulations over time. It also depends on the specific comparisons (contrasts) of interest in the study. To make matters worse, the delay and shape of the BOLD response (and ASL signals, and other blood flow-based methods), scanner drift, and nuisance factors such as physiological noise conspire to make experimental design for fMRI more complicated than for experiments that measure behavior alone. Not all designs with the same number of trials of a given set of conditions are equal, and the spacing and ordering of events are critical. Some intuitions and tests of design optimality follow from a deeper understanding of the statistical analysis of fMRI data and are elaborated on later. For a full treatment, there are several excellent papers (Josephs & Henson, 1999; Liu, 2004; Smith, Jenkinson, Beckmann, Miller, & Woolrich, 2007; T. D. Wager & Nichols, 2003).
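One intuition behind design optimization can be sketched for the simplest case of a single task regressor plus a constant. In that case the variance of the estimated task effect is inversely proportional to the summed squared deviation of the predicted response from its mean, so that sum can serve as a relative efficiency score. The Python sketch below compares three hypothetical designs on this score. The HRF parameters, the TR of 2 s, and all function and variable names are assumptions made for illustration; the papers cited above treat the general case with multiple regressors and contrasts.

```python
import math


def hrf(t):
    """Double-gamma HRF with assumed shape values (6, 16) and 1/6 undershoot."""
    if t <= 0:
        return 0.0
    g = lambda x, a: x ** (a - 1) * math.exp(-x) / math.gamma(a)
    return g(t, 6.0) - g(t, 16.0) / 6.0


def predicted_response(stimulus, tr=2.0, hrf_length_s=32.0):
    """Convolve a per-scan stimulus indicator with the HRF."""
    kernel = [hrf(k * tr) for k in range(int(hrf_length_s / tr) + 1)]
    return [
        sum(kernel[k] * stimulus[i - k] for k in range(min(i, len(kernel) - 1) + 1))
        for i in range(len(stimulus))
    ]


def efficiency(stimulus):
    """Relative efficiency for one task regressor plus an intercept.

    For y = b0 + b1 * x + error, var(b1) is proportional to
    1 / sum((x - mean(x)) ** 2), so that sum is used as the score.
    """
    x = predicted_response(stimulus)
    mean_x = sum(x) / len(x)
    return sum((v - mean_x) ** 2 for v in x)


n_scans = 128  # 256 s at the assumed TR of 2 s
blocked = [1.0 if (i // 8) % 2 == 0 else 0.0 for i in range(n_scans)]  # 16 s on, 16 s off
fixed_rapid = [1.0] * n_scans                                          # an event on every scan
slow_fixed = [1.0 if i % 8 == 0 else 0.0 for i in range(n_scans)]      # one event every 16 s

for name, design in [("blocked", blocked), ("fixed_rapid", fixed_rapid), ("slow_fixed", slow_fixed)]:
    print(name, round(efficiency(design), 3))
```

With these assumed settings, the alternating 16 s blocks score well above either fixed-interval event sequence, in line with the power advantage of blocked designs. The score says nothing about how interpretable each design is psychologically.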
Several computer algorithms are available for constructing statistically optimized designs, including an approach based on m-sequences (mathematical sequences that are near-optimal for certain types of designs; Buracas & Boynton, 2002) and one based on a genetic algorithm (Wager & Nichols, 2003) that incorporates m-sequence designs as a starting point and considers the relative importance of various contrasts to the study goals in calculating optimality.
Design Strategies for Enhanced Psychological Inference
Thus far, we have alluded to a simple contrast between two conditions, the subtraction of a control condition (B) from an experimental one (A), or [A – B]. Such contrasts are critical because any task, performed alone, produces activation in huge portions of the brain. Though contrasts in event-related designs can usually be more readily interpreted as being evoked by specific psychological or physical events than those in blocked designs, a single contrast
leaves much room for incorrect inference. This is because there may be multiple psychological and physical differences between task conditions A and B. Imagine a study that compares a difficult version of a working memory task (A) with an easy one (B). Not only does the more difficult task require greater use of working memory, it may also elicit increases in heart rate, more frustration, more error-detection and correction processes, and more monitoring and adjustment of performance. The result is that the [A – B] contrast does not reveal activations associated only with working memory demand.
Parametric Modulation Designs
One way to constrain interpretation and strengthen the credibility of subtraction logic is to incrementally vary a parameter of interest (e.g., working memory demand) across several levels, and perform multiple subtractions or linear contrasts across levels. An example is a study of the Tower of London task (Dagher, Owen, Boecker, & Brooks, 1999), which requires subjects to make a sequence of moves to transfer a stack of colored balls from one post to another in the correct order. The experimenters varied the number of moves incrementally from 1 to 6. Their results showed linear increases in activity in dorsolateral prefrontal cortex across all 6 conditions, suggesting that this area subserved the planning operations critical for good performance.
Multiple Control Conditions and Conjunctions
Another fruitful approach is to include multiple control conditions matched for various aspects of a target task of interest. In our working memory example, this might amount to including a control condition that produces a comparable increase in heart rate without involving working memory, and another that is frustrating without involving working memory, and so on. If a brain region is more activated in the working memory task than in each of the control tasks, the case that the region subserves working memory is strengthened.
A productive line of research using this approach is that of Kanwisher and colleagues in the study of face recognition (Kanwisher, McDermott, & Chun, 1997). In a long series of studies, they identified an area in the fusiform gyrus that responded to pictures of faces and drawings of faces, but not to houses, scrambled faces, partial faces, facial features, animal faces, and other control stimuli. By presenting a large number of control stimuli of various types, Kanwisher et al. ruled out many confounding variables and inferred that the brain area they studied, which they called the fusiform face area (FFA), was specifically activated by the perception of faces. Though the interpretation of these results as evidencing a face-selective module
in the cortex is still being debated, this line of research is an excellent example of using multiple control conditions to rule out alternative hypotheses for the cause of activation of a region. The fact that the ultimate implications for neuroscience are debated is a testament to the difficulty of conceptualizing and ruling out all the plausible confounds, and of making reverse inferences in general. A natural way of making comparisons using multiple control conditions is to use conjunction analysis, which is a logical “and” operator across multiple contrasts. You might want to identify voxels active in a [task A – task B] contrast and in a [task A – task C] contrast. In general, this question is approached by first calculating a statistical map for each contrast of interest, and then selecting those voxels that meet a chosen statistical threshold in both (or all) maps. In effect, the minimum statistic is compared with the conjunction null hypothesis, which specifies that all the contrasts must have significant effects for the conjunction to hold (T. Nichols, Brett, Andersson, Wager, & Poline, 2005). This logic holds generally for all kinds of conjunctions, for example, [A-B] and [C-D] and [E-F], whether or not they are independent. Care must be taken when considering the selection of a significance threshold for a conjunction of contrasts [A-B] and [A-C]: Earlier versions of conjunction analysis in SPM99 and SPM2 software (Price & Friston, 1997) tested the global null hypothesis that none of the effects are truly present. Rejecting this hypothesis implies a true effect in at least one contrast, which is actually an “or” rule: a significant conjunction result in this case implies true activation for contrast [A-B] or contrast [A-C] (T. Nichols et al., 2005). 
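The minimum-statistic logic can be written in a few lines. The Python sketch below marks a voxel as part of the conjunction only if every contrast exceeds the chosen threshold, which is the same as comparing the minimum statistic with that threshold. The t values, the threshold, and the function name `conjunction_map` are hypothetical and serve only to show the logical "and".

```python
def conjunction_map(t_maps, threshold):
    """Return a boolean map that is True where every contrast exceeds threshold.

    Requiring min(t) > threshold is the minimum-statistic test against the
    conjunction null (a logical "and" across contrasts).
    """
    n_voxels = len(t_maps[0])
    return [min(t_map[v] for t_map in t_maps) > threshold for v in range(n_voxels)]


# Hypothetical t statistics for five voxels in two contrasts.
t_a_minus_b = [4.1, 3.6, 0.8, 5.0, 2.9]
t_a_minus_c = [3.9, 1.2, 4.4, 3.3, 3.0]
threshold = 3.1  # assumed per-contrast threshold, chosen only for illustration

print(conjunction_map([t_a_minus_b, t_a_minus_c], threshold))
# prints [True, False, False, True, False]
```

Only the first and fourth voxels survive. The second and third each exceed the threshold in one contrast but not the other, which is the situation that an "or" rule would not exclude.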
The current version of SPM offers the user a choice of which null hypothesis to test and also offers a range of intermediate alternatives, for example, the hypothesis that 2 or fewer of a series of contrasts have true effects (Friston, Penny, & Glaser, 2005). Unlike the other tests described, this hypothesis requires the assumption of independence among the contrasts, which is clearly violated in our example conjunction with two control conditions [A-B] and [A-C] because they share a baseline. Overall, if you want to test for the intersection (logical and) of multiple effects, then the conjunction null is the proper null hypothesis. In reporting results, the precise procedures and null hypothesis should always be stated; as with other aspects of data analysis, it is not sufficient to merely state that you performed a conjunction analysis with a particular software package.
A Note on Baselines
Whether a task produces activation or deactivation depends on the baseline condition with which it is compared. Over the past decade or so, Raichle and colleagues have argued for the idea that a quiet resting
state provides a natural baseline condition against which to evaluate task-related activation (Gusnard, Raichle, & Raichle, 2001; Raichle et al., 2001). A source of support is that the oxygen extraction fraction, the ratio of oxygen use to oxygen supplied by blood, is relatively constant across the resting brain. The argument is that this ratio is one that we are equipped to maintain over long time periods, so it provides a natural physiological baseline. Due in large part to the evidence that Raichle has garnered, many researchers compare tasks with an open-eyed fixation or closed-eye resting baseline condition. The intertrial intervals in an event-related design, if enough rest and temporal jitter are provided, can provide an estimate of task-evoked activation relative to baseline activity (though the baseline level itself cannot be quantified with BOLD fMRI); however, tasks may also elicit sustained activity during the intertrial intervals (Visscher et al., 2003). Others argue that the baseline state is just another type of cognitive state, albeit one that is poorly experimentally controlled or characterized. Stark and Squire (2001) found that activity in the medial temporal lobes was substantially higher during rest than during some low-level cognitive tasks. Whether a task of interest activated or deactivated the medial temporal lobes depended on the choice of baseline, raising the question of exactly what kind of mnemonic or other cognitive activity is happening during “rest.” Thus, some researchers choose to compare tasks of interest with low-level baseline tasks during which mental activity can be more precisely experimentally controlled (Johnson et al., 2005). Ultimately, the comparison between task states, including rest, is a comparison of activity evoked by different kinds of mental representations. These comparisons can only be psychologically meaningful if the mental processes involved in each task can be specified.
This does not preclude the resting state as a baseline condition of interest. Proponents of the baseline state recognize it as an active state, and theories of mental activity during rest include simulation of situations, contingencies, and associated thoughts and feelings generally focused on the self (and likely involving memory retrieval and medial temporal lobe activation; Gusnard et al., 2001). Each investigator must consider these issues in relation to the particular goals of the study when designing the tasks and comparisons.
Factorial Designs
Another extension of subtraction logic is the factorial design. The study of task switching presented in the introduction to this chapter serves as an example (T. D. Wager, Jonides, et al., 2005). A subset of conditions in the study compared switch versus nonswitch trials for each of two different types: switches among object attributes and switches among objects. This design is a simple 2 × 2 factorial, with two types of trials (switch versus nonswitch) crossed with two types of judgments (object/attribute). This design permits the testing of three contrasts: (1) a main effect of switch versus nonswitch; (2) a main effect of task type; and (3) the interaction between the two, which tests whether the switch versus nonswitch difference is larger for one task-type than the other. Factors whose measurements and statistical comparisons are made within subjects, such as those described previously, are within-subjects factors; and those whose levels contain data from different individuals (e.g., depressed patients versus controls) are between-subjects factors. Within-subjects factors generally offer substantially more power and have fewer confounding issues (e.g., differences in brain structure and HRF shapes) than between-subjects factors. Factorial designs allow us to investigate the effects of several variables on brain activations. They also permit a more detailed characterization of the range of processes that activate a particular brain region (e.g., attention switching in general, or switching more for one task-type than the other). Factorial designs also permit us to discover double dissociations of functions within a single experiment. In our example (Figure 9.4), a factorial design was required to infer that a manipulation (e.g., object-switching) affected dorsolateral prefrontal cortex, but a second manipulation (e.g., attribute switching) did not. Factorial designs can also be used to test for violations of the critical assumption of pure insertion, and for a number of other processes. If the baseline process (e.g., task difficulty) can be manipulated independently of the target process (task-switching requirement), then researchers can test for interactions between task difficulty and switching, and test the notion that the switch process produces an additive increase in activation beyond the processes involved in the basic task.
DATA ANALYSIS: IMPLEMENTATION
Data Preprocessing
Artifacts, Assumptions, and the Need for Preprocessing
PET and fMRI studies yield data in a format that requires substantial preprocessing before statistical analysis and inference can be performed in a valid and optimal way. The goals of preprocessing are (a) to minimize the influence of data acquisition and physiological artifacts; (b) to check statistical assumptions and transform data to meet the assumptions; and (c) to standardize the locations of brain
regions across subjects to achieve validity and sensitivity in group analysis. Most analyses are based, first, on the assumption that all the voxels in any given image were acquired at the same time. Second, it is assumed that each data point in the time series from a given voxel was collected from that voxel only (the participant did not move in between measurements). Third, it is assumed that the residual variance will be constant over time and have a white noise distribution. Additionally, when performing group analysis and making population inference, all individual brains are assumed to be in register, so that each voxel is located in the same anatomical region for all subjects. Without any preprocessing, none of these assumptions hold and statistical analysis would not yield valid or interpretable results. In addition, as noted earlier (see Limitations of PET and fMRI: Acquisition Artifacts in this chapter), neuroimaging data contain artifacts that arise from many sources, including head movement, brain movement, and vascular effects related to periodic physiological fluctuations, and reconstruction and interpolation processes. fMRI data in particular often contain transient spike artifacts and slow drift over time related to a variety of sources, including
magnetic gradient instability, RF interference, and movement-induced inhomogeneities in the magnetic field. An example of transient artifacts as visualized in AFNI is shown in Figure 9.10. Spikes in the data during isolated volume acquisitions are apparent in some entire slices but not others, as shown by the bright bands in the sagittal slices at the bottom of Figure 9.10. This pattern suggests that gradient performance was affected during acquisition of some echo-planar images, which were acquired slice by slice in interleaved order in this experiment. These artifacts likely constitute a violation of the assumptions of normally and identically distributed errors; unless they are dealt with, the consequences include reduced power in group analysis and potentially increased false positives in single-subject inference. A first line of defense is, as with any kind of data analysis, to examine the data—in as raw a form as possible—and diagnose problems. This can be challenging given the massive proportions of neuroimaging data, and different packages provide different ways of looking at the data. As shown in Figure 9.10, AFNI provides an excellent facility for viewing time courses and images from one or more voxels (see Table 9.3 for a list of packages and web sites). Spike artifacts are often identified and problematic images
Figure 9.10 Transient spike artifacts as visualized in the software package AFNI.
Note: Spikes in the data during isolated volume acquisitions are apparent in certain slices, as shown by the bright bands in the sagittal slices (bottom). This suggests that gradient performance was affected during acquisition of some echo-planar images, which were acquired slice by slice in interleaved order in this experiment.
removed prior to or in the course of analysis, or minimized using trimming procedures, as in FIASCO software. VoxBo software also has good data-surfing capabilities. A popular approach implemented in FSL, FMRISTAT, and specialized packages such as GIFT (see later in this chapter) is to extract principal components or independent components from the whole-brain time series and visualize them. These components are increasingly used for artifact removal (Nakamura et al., 2006; Tohka et al., 2007), though if single-subject inference is desired, care must be taken not to bias the results by removing variance from the data without accounting for it in the statistical analysis. Apart from using the procedures described here, the effects of slow drift, the problem of intersubject registration, and some other artifacts can be minimized using preprocessing and analysis techniques to be described. In the rest of this chapter, we focus on fMRI analysis and briefly describe common preprocessing steps. Other neuroimaging methods, including PET, require different steps than those described here.
Preprocessing Steps for Functional Magnetic Resonance Imaging
The major steps in fMRI preprocessing are reconstruction, slice acquisition timing correction, realignment, coregistration of structural and functional images, registration or nonlinear warping to a template (also called normalization), and smoothing. Single-subject analyses do not require the warping step, which introduces spatial uncertainty in anatomical locations, and thus can provide much higher anatomical resolution. Group studies largely preclude false positives due to fMRI time series artifacts and permit population inference. Some group studies forgo smoothing in order to preserve spatial resolution.
Reconstruction
Images must first be reconstructed from the raw MR signal.
Raw and reconstructed data are stored in a variety of formats, but reconstructed images are generally composed of a 3-D matrix of data, containing the signal intensity at each voxel or cube of brain tissue sampled in an evenly spaced grid, and a header that contains information about the dimensionality, voxel size, and other image parameters. A popular format is Analyze, also known as AVW, which uses a separate header file and image file for each brain volume acquired. Other formats, such as NIFTI, are also gaining popularity. A series of images describes the pattern of activity over the course of the experiment. It is also common to store images in a 4-D matrix, where the fourth dimension is time.
Slice Timing
Statistical analysis using a single hemodynamic reference function assumes that all the voxels in
an image are acquired simultaneously. In reality, the data from different slices are shifted in time relative to each other because most BOLD pulse sequences collect data slice by slice; some slices are collected later during the volume acquisition than others. Thus, we need to estimate the signal intensity in all voxels at the same moment in the acquisition period. This can be done by interpolating the signal intensity at the chosen time point from the same voxel in previous and subsequent acquisitions. A number of interpolation techniques exist, from bilinear to sinc interpolations, with varying degrees of accuracy and speed. Sinc interpolation is the slowest, but generally the most accurate. Some researchers do not use slice timing correction, as it adds interpolation error to the data, and instead use more flexible hemodynamic models to account for variations in acquisition time.
Realignment
A major problem in most time-series experiments is movement of the subject’s head during acquisition of the time series. When this happens, the signal intensity of each voxel is contaminated by signal from its neighbors. Thus, we must rotate and translate each individual image to compensate for the subject’s movements. Realignment is typically performed by choosing a reference image (popular choices are the first image or the mean image) and using a rigid body transformation of all the other images in the time series to match it, which allows the image to be translated (shifted in the x, y, and z directions) and rotated (altered roll, pitch, and yaw) to match the reference. The transformation can be expressed as a premultiplication of the spatial coordinates of the “target” image to be altered by a 4 × 4 affine matrix in homogeneous coordinates. The elements of this matrix are determined by the six rigid-body parameters to be estimated, and an iterative algorithm is used to search for the parameter estimates that provide the best match between a target image and the reference image.
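To make the rigid body transformation concrete, the Python sketch below builds a transformation matrix from three translations and three rotations and applies it to a coordinate. It assumes homogeneous coordinates (a 4 × 4 matrix) so that rotation and translation fit into a single matrix product. The rotation order, the mapping of pitch, roll, and yaw onto the x, y, and z axes, and the function names are conventions chosen here for illustration and differ between software packages. Estimating the six parameters from image data is a separate optimization step that is not shown.

```python
import math


def matmul(a, b):
    """Multiply two 4 x 4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def rigid_body_matrix(tx, ty, tz, pitch, roll, yaw):
    """Build a 4 x 4 rigid-body matrix from 3 translations (mm) and 3 rotations (radians).

    Assumed convention: rotate about x (pitch), then y (roll), then z (yaw),
    and translate last.
    """
    cx, sx = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(roll), math.sin(roll)
    cz, sz = math.cos(yaw), math.sin(yaw)
    translate = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    rot_x = [[1, 0, 0, 0], [0, cx, -sx, 0], [0, sx, cx, 0], [0, 0, 0, 1]]
    rot_y = [[cy, 0, sy, 0], [0, 1, 0, 0], [-sy, 0, cy, 0], [0, 0, 0, 1]]
    rot_z = [[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    return matmul(translate, matmul(rot_z, matmul(rot_y, rot_x)))


def transform_point(matrix, point):
    """Premultiply a coordinate (x, y, z) by the matrix, in homogeneous form."""
    x, y, z = point
    vec = [x, y, z, 1.0]
    return tuple(sum(matrix[i][k] * vec[k] for k in range(4)) for i in range(3))


# A 90-degree yaw plus a 2 mm shift along x moves (10, 0, 0) to about (2, 10, 0).
m = rigid_body_matrix(2.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2)
moved = transform_point(m, (10.0, 0.0, 0.0))
print([round(c, 6) for c in moved])
```

Because the transformation is rigid, distances between any two points are unchanged; only position and orientation differ, which is why six parameters per image suffice.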
Usually, the matching process is done by minimizing sums of squared differences between the two images. Realignment corrects adequately for small movements of the head, but it does not correct for the more complex spin-history artifacts created by the motion. The parameters at each time point are saved for later inspection and are often included in the analysis as covariates of no interest; however, even this additional step does not completely remove the artifacts created by head motion. Residual artifacts remain in the data and contribute to noise. Sometimes this noise is correlated with task contrasts of interest, which poses a problem, and can create false results in single-subject analyses. However, because these artifacts are expected to (and typically do) differ in sign and magnitude across subjects, group analysis is valid. Group analyses are usually robust to such artifacts in terms of false positives, but power can be severely compromised if large movement artifacts are present.
Because of these issues, it is typical to exclude subjects who move their heads substantially during the scan. Subject motion in each of the 6 directions can be estimated using the magnitudes of the transformation required for each image during the realignment process, and time series of displacements are standard output for realignment algorithms. There are no hard-and-fast rules for how much movement is too much, but more than 1.5 mm displacement within a scanning session (while the scanner is running continuously) is typically considered problematic, and can usually be avoided with proper instructions to subjects and head restraints.
Warping to Atlas (Normalization)
For group analysis, each voxel must lie within the same brain structure in each individual subject. Individual brains have different shapes and features, but there are regularities shared by every nonpathological brain, and normalization attempts to register each subject’s anatomy with a standardized atlas space defined by a template brain (see Figure 9.11). Normalization can be linear, involving simple registration of the gross shape of the brain, or nonlinear, involving warping to match local features. In intensity-based normalization, matching is done using image intensities corresponding to gray/white matter/fluid tissue classes. Surface-based normalization uses extracted features such as gyral and sulcal boundaries explicitly (see Thresholding and Multiple Comparisons, earlier in this chapter). Here, we describe nonlinear intensity-based normalization as implemented in SPM software. Whereas the realignment and coregistration procedures perform a rigid body transformation, normalization can stretch and shrink different regions of the image to achieve the closest match. This warping consists of shifting the locations of voxels by different amounts depending on their original location.
The function that describes how much to shift the voxels is unknown, but can be described by a set of cosine basis functions. The task is then to search for a set of coefficients (weights of each basis function) that minimize the least squares difference between the transformed image and the template. How closely the algorithm attempts to match the local features of the template depends on the number and spatial frequency of basis functions used. Often, warping that is too flexible (using many basis functions) can produce gross distortions in the brain, as local features are matched at the expense of getting the right overall shape, as shown in Figure 9.11B. This happens essentially because the problem space is too complex, and the algorithm can settle into a “local minimum” solution that is not close to the global optimal solution. Surface-based warping uses similar principles, but matches features on extracted cortical surface representations instead of image intensities.
Intersubject registration is one of the largest sources of error in group analysis. Thus, it is important to inspect each normalized brain and, if necessary, take remedial measures. These include manually improving the initial alignment, using a mask to exclude problematic regions of atrophy or abnormality (e.g., a lesion), altering the number of basis functions and other fitting parameters, and in some cases developing specialized template brains (e.g., for children). Figure 9.11C shows a process of checking normalization for one subject. We have identified control points on the MNI ICBM152 template brain (left) that correspond to easily identifiable features. Then, we have taken those points and overlaid them on the subject’s normalized T1 image. For this subject, unlike the pathological case in Figure 9.11B, each of the control points matches the corresponding anatomical feature on the subject’s brain quite well. Such checking can be done in several ways, and though there are no hard-and-fast rules for how to check and how much error is too much, each lab should develop a set of standardized procedures.

Smoothing

Many investigators apply a spatial smoothing kernel to the functional data, blurring the image intensities in space. This is ironic, given the push for higher spatial resolutions and smaller voxels, so why does anyone do it? One reason is to improve intersubject registration. A second reason is that Gaussian random field theory, a popular multiple-comparisons correction procedure, assumes that the variations across space are continuous and normally distributed. However, images are sampled on a grid of voxels, and neither assumption is likely to hold; smoothing can help meet these assumptions. Smoothing typically involves convolution with a Gaussian kernel, which is a 3-D normal probability density function often described by the full width of the kernel at half its maximum height (FWHM) in mm.
One estimate of the smoothing required to meet the assumption is a FWHM of 3 times the voxel size (e.g., 9 mm for 3 mm voxels). An important consideration is that acquiring an image with large voxels is not the same thing as acquiring it with small voxels and then smoothing. The signal-to-noise ratio during acquisition increases as the square of the voxel volume, so acquiring small voxels means that much signal is lost that can never be recovered. It is optimal in terms of sensitivity to acquire images at the desired resolution and not employ smoothing. Some recent acquisition schemes acquire images at the final functional resolution desired, which also permits much more rapid image acquisition as time is not spent acquiring information that would be discarded in analysis (M. Lindquist, Glover, & Shepp, in press).
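As a rough illustration of Gaussian smoothing specified by its FWHM, the sketch below converts FWHM to the standard deviation of the kernel (FWHM = 2·sqrt(2·ln 2)·sigma) and applies a separable 1-D kernel along each axis. The function names, the random test volume, and the truncation of the kernel at four standard deviations are assumptions made for the example, not the behavior of any particular package.

```python
import numpy as np

def fwhm_to_sigma(fwhm):
    """Convert full width at half maximum to the Gaussian standard deviation."""
    return fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

def gaussian_kernel(fwhm_mm, voxel_mm):
    """1-D Gaussian kernel in voxel units, truncated at 4 sigma and normalized to sum to 1."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    radius = int(np.ceil(4 * sigma_vox))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma_vox) ** 2)
    return k / k.sum()

def smooth_3d(volume, fwhm_mm, voxel_mm):
    """Smooth a 3-D volume by convolving with the 1-D kernel along each axis in turn."""
    k = gaussian_kernel(fwhm_mm, voxel_mm)
    out = volume.astype(float)
    for axis in range(3):
        out = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), axis, out)
    return out

rng = np.random.default_rng(1)
vol = rng.normal(size=(20, 20, 20))                    # toy volume of independent noise
smoothed = smooth_3d(vol, fwhm_mm=9.0, voxel_mm=3.0)   # 9 mm FWHM on 3 mm voxels
```

A 9 mm FWHM corresponds to a standard deviation of roughly 3.8 mm, or about 1.3 voxels at 3 mm resolution, which gives a sense of how strongly neighboring voxels are blended.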
Figure 9.11 Normalization attempts to register each subject’s anatomy with a standardized template brain using an intensity-based warping procedure. Note: A: A schematic overview of the warping process. High-resolution T1 images are warped onto a T1 template to give normalized images in a standard space. B: Incorrect warping (“normalization gone wrong”) can produce gross distortions in the brain, as local features are matched at the expense of getting the correct overall shape. C: The normalization procedure can be checked by identifying control points on the MNI ICBM152 template brain (left) that correspond to easily identifiable features and overlaying them on the subject’s normalized T1 image. For this subject, each of the control points matches well with the corresponding anatomical feature.

Previously, many investigators applied temporal smoothing to the data as well as spatial smoothing. This procedure is another form of filtering like the high-pass filtering done in the course of model estimation; it removes high-frequency signals from the data, whereas high-pass filtering removes low-frequency signals. This procedure was implemented in SPM99 software (Table 9.3) primarily to facilitate accurate estimation of the degrees of freedom, which was assumed after smoothing to equal that implied by the kernel. This approach has largely been replaced by more standard time series models (e.g., autoregressive modeling). There is no expected benefit to temporal smoothing on sensitivity, as it further decreases the temporal resolution of the data, and it is not recommended.

Co-registration

Often, high-resolution structural images (T1 and/or T2) are used for warping and localization.
The same transformations (warps) are applied to the functional images, which produce the activation statistics, so accurate registration of structural and functional images is critical. Co-registration aligns structural and functional images, or in general, different types of images of the same brain. Because functional and structural images are collected with different sequences and different tissue classes have different average intensities, using a least squares difference method to match images is often not appropriate. The signal intensities in gray matter (G), white matter (W), and ventricles (V) are ordered W > G > V in functional T2* images, and V > G > W in structural T2 images (Figure 9.1). In such cases, an affine transformation matrix can be estimated by maximizing the mutual information between the two images, that is, the degree to which knowing the intensity of one can be used to predict the intensity of the other (Cover & Thomas, 1991). Typically, a single structural image is co-registered to the first or mean functional image.
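To make the mutual information criterion concrete, the following sketch estimates it from a joint intensity histogram. The toy images, the 32-bin histogram, and the function name are assumptions for illustration; the "functional" image is built with inverted contrast relative to the "structural" one, which is the situation in which a least squares criterion would fail.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Plug-in estimate of mutual information (in nats) from a joint intensity histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    p_ab = joint / joint.sum()                     # joint probability of intensity pairs
    p_a = p_ab.sum(axis=1, keepdims=True)          # marginal for image A
    p_b = p_ab.sum(axis=0, keepdims=True)          # marginal for image B
    nonzero = p_ab > 0
    return float(np.sum(p_ab[nonzero] * np.log(p_ab[nonzero] / (p_a @ p_b)[nonzero])))

rng = np.random.default_rng(2)
structural = rng.uniform(size=(64, 64))
# Inverted contrast plus a little noise: bright structural voxels are dark here
functional = 1.0 - structural + 0.05 * rng.normal(size=structural.shape)
misaligned = np.roll(functional, 5, axis=0)        # simulate a registration error

mi_aligned = mutual_information(structural, functional)
mi_misaligned = mutual_information(structural, misaligned)
```

Mutual information is high when the images are aligned even though their intensities are negatively related, and it drops when one image is shifted, which is why a registration routine can search for the transformation that maximizes it.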
Localizing Task-Related Activations with the GLM

The GLM is the most common statistical method for assessing task–brain activity relationships in neuroimaging (Worsley & Friston, 1995). The GLM is a linear analysis method that subsumes many basic analysis techniques, including t-tests, ANOVA, and multiple regression. It can be used to estimate whether the brain responds to a single type of event, to compare different types of events, to assess correlations between brain activity and behavioral performance or other psychological variables, and for other tests. The GLM is appropriate when multiple predictor variables (which together constitute a simplified model of the sources of variability in a set of data) are used to explain variability in a single, continuously distributed outcome variable. In a typical neuroimaging experiment, the predictors are related to psychological events, and the outcome variable is signal in a brain voxel or region of interest. Analysis is typically “massively univariate,” meaning that the analyst performs a separate GLM analysis at every voxel in the brain, and summary statistics are saved in maps of statistic values across the brain.

Because of the hierarchical structure of the data, an appropriate analysis for multisubject PET and fMRI studies is the mixed-effects GLM. This is often approximated by fitting a GLM for each subject and using the resulting activation parameter estimates in a second-level group analysis. We refer to this as the unweighted summary statistic approach. FSL software currently performs a mixed-effects analysis, whereas the most typical analysis in SPM, AFNI, BrainVoyager, VoxBo, and other packages is the unweighted summary statistic approach. We describe the mechanics of a single-subject analysis and then the mixed-effects approach in the following subsections.
Single-Subject GLM Model Basics

For a single subject, the fMRI time course or series of PET values from one voxel is the outcome variable ($y$). Activity is modeled as the sum of a series of independent predictors ($x$ variables, that is, $x_1$, $x_2$, and so on) related to task conditions and other nuisance covariates of no interest (e.g., head movement estimates). In fMRI analysis, for each task condition or event type of interest, a time series of the predicted shape of the signal response is constructed, usually using prior information about the shape of the vascular response to a brief impulse of neural activity. The vectors of predicted time series values for each task condition are collated into the columns of the design matrix, $X$, which contains a row for each of $n$ observations collected (observations over time) and a column for each of $k$ predictors. The GLM fitting procedure estimates the best-fitting amplitude (scaling factor) for each column of $X$, so that the sums of fitted values across predictors best fit the data. These amplitudes are regression slopes, and are denoted with the variable $\hat{\beta}$ (the “hat” denotes an estimate of a theoretical constant value). It also estimates a time series of error values, $\hat{\varepsilon}$, that cannot be explained by the model. The model is thus described by the equation:

$$y = X\beta + \varepsilon \tag{9.1}$$
where $\beta$ is a $k \times 1$ vector of regression slopes, $X$ is an $n \times k$ model matrix, $y$ is an $n \times 1$ vector containing the observed data, and $\varepsilon$ is an $n \times 1$ vector of unexplained error values. The equation is in matrix notation, so that $X\beta$ indicates the rise and fall in the data explained by the model, or the sum of the columns of $X$, each multiplied by the corresponding element of $\beta$. Error values are assumed to be independent and to follow a normal distribution with mean 0 and standard deviation $\sigma$. The estimated $\hat{\beta}$ values correspond to the estimated magnitude of activation for each psychological condition described in the columns of $X$. One advantage of the GLM is that there exists an algebraic solution for $\hat{\beta}$ that minimizes the squared error:

$$\hat{\beta} = (X^T X)^{-1} X^T y \tag{9.2}$$
where $T$ indicates the transpose operator. Inference is generally conducted by calculating a t-statistic, which equals each $\hat{\beta}$ divided by its standard error, and obtaining p-values using classical inference. The standard errors of the estimates are the diagonal elements of the matrix:

$$\mathrm{se}(\hat{\beta}) = (X^T X)^{-1} \hat{\sigma} \tag{9.3}$$
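A minimal numerical sketch of Equations 9.1 to 9.3 follows, using simulated data for one voxel. The boxcar predictor, the true slopes, and the noise level are hypothetical. The standard errors are computed in the conventional way, as the square roots of the diagonal of the residual variance estimate times $(X^T X)^{-1}$, which combines the same two ingredients (design and residual error) that appear in Equation 9.3.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
task = (np.arange(n) % 40 < 20).astype(float)      # hypothetical boxcar task predictor
X = np.column_stack([task, np.ones(n)])            # design matrix: task column plus intercept
true_beta = np.array([2.0, 100.0])                 # assumed "true" slopes for the simulation
y = X @ true_beta + rng.normal(scale=1.0, size=n)  # Equation 9.1 with independent normal errors

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                       # Equation 9.2
residuals = y - X @ beta_hat                       # estimated error time series
dof = n - np.linalg.matrix_rank(X)                 # residual degrees of freedom
sigma2_hat = residuals @ residuals / dof           # residual variance estimate
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))        # standard error of each slope
t_stats = beta_hat / se                            # t-statistic for each predictor
```

In a whole-brain analysis the same computation is simply repeated at every voxel with the same $X$, which is why $(X^T X)^{-1} X^T$ is usually computed once and reused.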
Notably, the error term comprises two separate terms from different sources. $\sigma$ is the residual error variance, which depends on many factors, including scanner noise. $(X^T X)^{-1}$ depends on the design matrix itself, and reflects both the variability in the predicted signal and covariance among predictors (multicollinearity). Design optimization algorithms, described earlier, work on minimizing the design-related component of the standard error: $(X^T X)^{-1}$. An important additional feature of the data requires a further extension of the model. fMRI data are autocorrelated (signals are correlated with versions of themselves shifted in time and are not independent), and the autocorrelation must be removed for valid single-subject inference. This is typically done by estimating the autocorrelation in the residuals, after model fitting, and then removing the autocorrelation by “prewhitening.” Prewhitening works
by premultiplying both sides of the general linear model equation (Equation 9.1) by the square root of a filtering matrix $W$ that counteracts the autocorrelation structure, creating a new design matrix $W^{1/2}X$ and whitened data $W^{1/2}y$. This process is incorporated into what is known as the generalized least-squares solution, so that:

$$\hat{\beta} = (X^T W X)^{-1} X^T W y \tag{9.4}$$
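The iterative estimation behind Equation 9.4 can be sketched for the simplest case. The example assumes an AR(1) error structure, takes $W$ to be the inverse of the AR(1) correlation matrix with entries $\rho^{|i-j|}$, and alternates between estimating $\rho$ from the residuals and re-estimating the slopes. The function names, the fixed number of iterations, and the simulated data are illustrative assumptions, not the procedure of any specific package.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares slopes (Equation 9.2)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ar1_weights(rho, n):
    """W = inverse of the AR(1) correlation matrix V[i, j] = rho ** |i - j|."""
    idx = np.arange(n)
    V = rho ** np.abs(idx[:, None] - idx[None, :])
    return np.linalg.inv(V)

def gls_ar1(X, y, n_iter=5):
    """Alternate between estimating rho from residuals and solving Equation 9.4."""
    beta = ols(X, y)
    rho = 0.0
    for _ in range(n_iter):
        r = y - X @ beta
        rho = float(r[1:] @ r[:-1] / (r @ r))              # lag-1 autocorrelation of residuals
        W = ar1_weights(rho, len(y))
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # Equation 9.4
    return beta, rho

# Simulated voxel time series with AR(1) noise (all values hypothetical)
rng = np.random.default_rng(4)
n, true_rho = 300, 0.5
innovations = rng.normal(size=n)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = true_rho * noise[t - 1] + innovations[t]
task = (np.arange(n) % 40 < 20).astype(float)
X = np.column_stack([task, np.ones(n)])
y = X @ np.array([1.5, 10.0]) + noise

beta_gls, rho_hat = gls_ar1(X, y)
```

Building and inverting an n-by-n matrix is fine for a short toy series; practical implementations exploit the structure of the autocorrelation model instead of forming $W$ explicitly.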
Note that the standard errors and degrees of freedom change as well due to the whitening process. Because the estimation of $W$ depends on $\hat{\beta}$, and vice versa, a one-step algebraic solution is not available, and the parameters are estimated using an iterative algorithm. There are many ways of designing $W$, ranging from estimates that make strong simplifying assumptions about the form of the data, such as the one-parameter autoregressive AR(1) model, to empirical estimates that use many parameters. As with any model-fitting procedure, a trade-off exists between using few and many parameters. Many-parameter models generally produce close fits to the observed data. However, models with few parameters, if they are chosen carefully, can produce more accurate estimates of the underlying true function because they are less susceptible to fitting random noise patterns in the data.

Contrasts

Contrasts across conditions can be easily handled within the GLM framework. Mathematically, a contrast is a linear combination of predictors. The contrast (e.g., A - B in a simple comparison, or A + B - C - D for a main effect in a 2 × 2 factorial design) is coded as a $k \times 1$ vector of contrast weights, which we denote with the letter $c$. For example, the contrast weights for a simple subtraction are $c = [1 \;\, {-1}]^T$, while a single contrast for a linear effect across four conditions might be $c = [{-3} \;\, {-1} \;\, 1 \;\, 3]^T$. Concatenating multiple contrasts into a matrix can simultaneously test a whole set. Thus, the main effects and interaction contrasts in a 2 × 2 factorial design can be specified with the following matrix:

$$C = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & -1 \\ -1 & 1 & -1 \\ -1 & -1 & 1 \end{bmatrix}$$
Columns 1 and 2 test main effects, and the third tests their interaction. To test contrast values against a null hypothesis of zero—the most typical inferential procedure—contrast weights must sum to zero. If the weights do not sum to zero, then the contrast values partially reflect overall scanner signal intensity, and the resulting t-statistics are invalid. The analyst must take care to specify
contrasts correctly, as contrast weights in neuroimaging analysis packages are often specified by the analyst, rather than being created automatically as in SPSS, SAS, and other popular statistical packages. The true contrast values $C^T\beta$ can be estimated using $C^T\hat{\beta}$, where $\hat{\beta}$ is obtained using Equation 9.2. The standard errors of each contrast are the diagonals of:

$$\mathrm{se}(C^T\hat{\beta}) = C^T (X^T X)^{-1} C\,\hat{\sigma} \tag{9.5}$$
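As an illustration of contrast estimation, the sketch below simulates a 2 × 2 factorial design with one indicator column per cell and applies the three-column contrast matrix for the two main effects and the interaction. The design, the cell means, and the noise level are hypothetical. As in conventional practice, the standard error of each contrast is computed as the square root of the residual variance estimate times the corresponding diagonal element of $C^T (X^T X)^{-1} C$.

```python
import numpy as np

rng = np.random.default_rng(5)
n_per_cell, k = 50, 4
# Hypothetical design: one indicator column per cell (A1B1, A1B2, A2B1, A2B2)
X = np.kron(np.eye(k), np.ones((n_per_cell, 1)))
true_beta = np.array([3.0, 3.0, 1.0, 1.0])         # simulated main effect of factor A only
y = X @ true_beta + rng.normal(size=X.shape[0])

# Columns: main effect of A, main effect of B, A x B interaction
C = np.array([[ 1,  1,  1],
              [ 1, -1, -1],
              [-1,  1, -1],
              [-1, -1,  1]], dtype=float)
weights_sum_to_zero = bool(np.allclose(C.sum(axis=0), 0.0))

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                       # Equation 9.2
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (X.shape[0] - k)      # residual variance estimate
contrast_values = C.T @ beta_hat                   # estimated contrast values
contrast_se = np.sqrt(sigma2_hat * np.diag(C.T @ XtX_inv @ C))
t_stats = contrast_values / contrast_se            # one t-statistic per contrast
```

Only the first contrast should show a large t-statistic here, because the simulated cell means differ across levels of factor A but not across levels of B.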
The whitening process is omitted here for simplicity, but can readily be incorporated. Most imaging statistics packages write a series of images to disk containing the betas for each condition throughout the brain, and another set of contrast images containing the values of $C^T\hat{\beta}$ throughout the brain. Contrast images are typically used in a group analysis. A third set of images contains t-statistics, or the ratio of contrast estimates to their standard errors.

Assumptions

The model-fitting procedure assumes that the effects due to each of the predictors add linearly and do not change over time (the system is linear and time-invariant). The inferential process assumes that the observations are independent, that they all come from the same distribution, and that the residuals are distributed normally and with equal variance across the range of predicted values. All these assumptions are violated to a degree in at least some brain regions in a typical imaging experiment, which has prompted the development of important extensions, including diagnostic tools and robust model-fitting procedures (Loh, in press; Luo & Nichols, 2003; T. D. Wager, Keller, et al., 2005). Violations of the assumptions are not merely a theoretical nuisance. They can make the difference between a valid finding and a false positive result, or between finding meaningful activations in the brain and wasting substantial time and money. Diagnostic tools have been developed for exploring the data, looking for artifacts, and checking a number of assumptions about the data and model (Loh, 2008; Luo & Nichols, 2003), and like many tools developed by members of the neuroimaging community, they are freely available on the Internet.
The quantity of data (e.g., 100,000 separate regressions on 1,000 data points per subject × 20 subjects) and the software and data structures that support its analysis make it difficult to examine assumptions and check the data, which makes such diagnostic tools all the more important. Another active area of research concerns strategies for dealing with some known violations of assumptions, described later. Violations of independence can be handled in a limited way using generalized least squares. Violations of equal variance and normality can be dealt with by using
nonparametric permutation tests to make statistical inferences (Nichols & Holmes, 2002), or, if they result from the presence of outliers, by robust regression techniques (T. D. Wager, Keller, et al., 2005). Free implementations of each of these extensions are available (Table 9.3).

GLM Model-Building in fMRI

Perhaps the most challenging task in linear regression analysis is the creation of realistic predictions of task-related signals for the columns of $X$. PET images integrate across many psychological events, obviating the need for accurate models but also limiting the specificity with which activation can be linked to specific events or time periods. As discussed earlier, a popular method of forming predicted BOLD time series is to use a canonical HRF. The process is shown in Figure 9.9. To build the model, researchers start with an indicator vector representing the neuronal activity for each condition sampled at the resolution of the fMRI experiment, shown at the left of Figure 9.9 for four hypothetical event types (A to D). This vector has zero value except during hypothesized neural activation periods, when the signal is assigned a value of 1. Each indicator vector is convolved with the HRF to yield a predicted time course related to that event, which forms a column of $X$. The rightmost panel shows $X$ in image form, a common format for presentation in papers. If the canonical HRF fits the shape of the BOLD response to psychological events, then using the canonical HRF simplifies the analysis and has great sensitivity to detect differences. Consider two psychological events A and B that both activate a voxel, but with different amplitudes, as shown in the top left panel of Figure 9.12. Empirical time courses are shown in light lines, and the fitted responses (model fits) with the canonical HRF are shown in dark lines. The [A - B] contrast will appropriately reflect the different response amplitudes. The canonical HRF is a double-edged sword.
If the canonical HRF does not fit, there is at best a drop in power, and at worst false positives and misinterpretation of results (Lindquist & Wager, 2007). Consider an example in which two conditions A and B produce responses of equivalent amplitude, but at different delays. This is shown in the top center panel of Figure 9.12, where the response to B is delayed by 3 s. Since the HRF shape is fixed, any difference in model fits will produce a difference in the only free parameter, amplitude. In this example, the estimated amplitude for A will be greater than for B. Without additional diagnostic tests, one might falsely infer that A activates the brain region more than B. This example illustrates the importance of visualizing the data and fits, rather than simply interpreting a statistically significant result at face value.
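The delay example can be reproduced in a small noise-free simulation. The sketch builds a predictor by convolving an indicator vector with an illustrative double-gamma HRF, then fits that predictor to two simulated responses of equal amplitude, one on time and one delayed by 3 s. The HRF parameters (gamma shapes of 6 and 16, undershoot ratio of 1/6), the 1 s sampling interval, and the event onsets are assumptions chosen for the example, not values taken from the text.

```python
import math
import numpy as np

def gamma_pdf(t, shape):
    """Gamma density with unit scale, zero for t <= 0."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    pos = t > 0
    out[pos] = t[pos] ** (shape - 1) * np.exp(-t[pos]) / math.gamma(shape)
    return out

def double_gamma_hrf(t, delay=0.0):
    """Illustrative HRF: positive lobe peaking near 5 s minus a smaller, later undershoot."""
    s = np.asarray(t, dtype=float) - delay
    return gamma_pdf(s, 6.0) - gamma_pdf(s, 16.0) / 6.0

n = 200                                            # number of scans, one per second
t = np.arange(0.0, 32.0, 1.0)
indicator = np.zeros(n)
indicator[np.arange(10, n - 40, 40)] = 1.0         # hypothetical event onsets

canonical = double_gamma_hrf(t)
delayed = double_gamma_hrf(t, delay=3.0)

x = np.convolve(indicator, canonical)[:n]          # predicted time course: a column of X
X = np.column_stack([x, np.ones(n)])

y_a = np.convolve(indicator, canonical)[:n]        # condition A: on-time response, amplitude 1
y_b = np.convolve(indicator, delayed)[:n]          # condition B: same amplitude, 3 s late

amp_a = np.linalg.lstsq(X, y_a, rcond=None)[0][0]  # fitted amplitude for A
amp_b = np.linalg.lstsq(X, y_b, rcond=None)[0][0]  # fitted amplitude for B
```

Even without noise, the fitted amplitude for B comes out smaller than for A although the simulated responses have identical height, because amplitude is the only parameter the fixed-shape model can adjust.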
Comparing groups of individuals (e.g., older versus younger adults, or patients and normal controls) can be especially problematic. If you find [A - B] amplitude differences, are those differences caused by differences in neural activity amplitude or by the timing and shape of the vascular component of the BOLD response? Elderly subjects have reduced and more variable shapes of their HRFs compared with younger subjects (D’Esposito, Zarahn, Aguirre, & Rypma, 1999), making direct comparisons with a canonical HRF problematic. Alternate approaches include (a) measuring HRFs in visual and motor cortex for each individual subject using a separate task (Aguirre et al., 1998) or (b) using a more flexible model of the HRF by using a basis set.

Basis Sets

In the previous discussion, conditions are modeled by a single linear regressor that allows us to estimate only the amplitude of the predicted response ($\hat{\beta}$) or contrast ($C^T\hat{\beta}$). Alternatively, the same neural indicator vector can be convolved with multiple canonical waveforms and entered into multiple columns of $X$ for a single event type. These reference waveforms are basis functions, and the predictors for an event type constructed using different basis functions can combine linearly to better fit the evoked BOLD responses. An example is shown in the second row of Figure 9.12, in which a linear combination of the canonical HRF and its temporal derivative provides better fits to responses that look similar to the HRF (left panel), are shifted in time (center panel), or have extended activation durations (right panel). This basis set is the most popular current alternative to the canonical HRF alone among users of SPM software (Friston, Glaser, et al., 2002; Friston, Josephs, Rees, & Turner, 1998). Notice that the fits are better, but changes in delay and duration are far from perfectly modeled.
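A small sketch shows why adding the temporal derivative helps with shifted responses. It fits a response delayed by 1 s, first with the HRF alone and then with the HRF plus its numerical derivative. The double-gamma parameters, the 1 s delay, and the direct fit to a single response (rather than to a convolved time series) are simplifying assumptions made for illustration.

```python
import math
import numpy as np

def hrf(t):
    """Illustrative double-gamma HRF; the parameter values are assumptions."""
    t = np.clip(np.asarray(t, dtype=float), 0.0, None)   # response is zero before onset
    lobe = lambda a: t ** (a - 1) * np.exp(-t) / math.gamma(a)
    return lobe(6.0) - lobe(16.0) / 6.0

dt = 0.1
t = np.arange(0.0, 32.0, dt)
canonical = hrf(t)
derivative = np.gradient(canonical, dt)       # temporal derivative basis function
response = hrf(t - 1.0)                       # simulated response, delayed by 1 s

def fit(X, y):
    """Least squares fit; returns the sum of squared errors and the slopes."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(np.sum((y - X @ b) ** 2)), b

sse_canonical, b_single = fit(np.column_stack([canonical]), response)
sse_with_derivative, b_pair = fit(np.column_stack([canonical, derivative]), response)
```

The derivative term receives a negative weight for a delayed response, consistent with a first-order Taylor approximation of a time shift, and the residual error drops without reaching zero, which matches the observation that delays are captured only approximately.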
The ability of a basis set to capture variations in hemodynamic responses such as those depicted in Figure 9.12 depends on both the number and shape of the reference waveforms. There is a fundamental trade-off between flexibility to model variations and power. This is because each parameter is estimated with error, and flexible models can tend to model noise and thus produce noisier parameter estimates. One of the most flexible models, a finite impulse response (FIR) basis set, contains one free parameter for every time point following stimulation in every cognitive event type that is modeled (Glover, 1999; Goutte, Nielsen, & Hansen, 2000; Ollinger, Shulman, & Corbetta, 2001). Using such a model makes minimal assumptions about the shape of the HRF because the $\hat{\beta}$ values estimate the average response at each time point following the onset of an event. The FIR model is a preferred way to estimate and visualize
the shape of BOLD responses, and it is implemented in major software packages including AFNI, SPM, and FSL. An example of model fits using a smooth FIR model, which is constrained to produce smooth response functions, is shown in the third row of Figure 9.12. The model fits (dark black lines) fit the data reasonably accurately in all conditions, including those shifted in time (center) and extended in duration (right). Other choices of basis sets include those composed of principal components (Aguirre et al., 1998; Woolrich, Behrens, & Smith, 2004), cosine functions (Zarahn, 2002), radial basis functions (Riera et al., 2004), spectral basis sets (Liao et al., 2002), and other functions. The bottom row in Figure 9.12 shows fitted responses from a basis set recently developed in our lab that uses three superimposed inverse logit functions to model the rise, fall, and undershoot of the BOLD response (Lindquist & Wager, 2007). The model can handle both delays and variations in duration, making a single model appropriate for both brief events and prolonged epochs of stimulation. In addition, fits are as accurate as the FIR model fits for these data, and simulations showed that the model compares favorably with a range of other models in terms of statistical power. The model is freely available (see Table 9.3).

Figure 9.12 Basis sets differ in the flexibility they provide in modeling different HRF shapes. Note: Each column in the figure shows HRF estimates for an experiment where two conditions A and B produce responses of: (left) different amplitudes; (center) equivalent amplitude, but at different delays; and (right) equivalent amplitude, but different durations. The ability of four different basis sets to estimate the hemodynamic response function (HRF) corresponding to each condition is shown in the four rows. The basis sets that were used are SPM software’s canonical HRF, the canonical HRF + its temporal derivative, a smooth finite impulse response (FIR) model and the inverse logit model. The x-axis shows peri-stimulus time. From “Validity and Power in Hemodynamic Response Modeling: A Comparison Study and a New Approach,” by M. A. Lindquist and T. D. Wager, 2007, Human Brain Mapping, 28, p. 776. Adapted with permission.

Basis sets offer a major advantage (more accurate modeling of the HRF across subjects and across the brain), but they pose additional technical difficulties that make their use less common than perhaps it should be. First, it is not
straightforward to calculate contrasts across conditions when there are multiple parameter estimates per condition. Leaving out some basis functions when calculating contrasts, though it is often done, is not generally advised. An alternative is to calculate one contrast per basis function for each contrast of interest. Group analysis can then be done using repeated measures analyses at the second level (in group analysis) rather than the usual one-sample t-test. However, there is a cost in power when basis functions are added, and in general whenever more parameter estimates are compared.

Physiological Noise and Covariates of No Interest

In both PET and fMRI designs, additional predictors are typically added to account for known sources of noise in the data. These nuisance covariates are included to reduce noise and to prevent signal changes related to head movement and physiological (e.g., respiration) artifacts from influencing the contrast estimates. In addition, covariates that implement high-pass filtering, or removal of signal frequencies below a specified cutoff, can also be added at this stage; this is the standard approach in SPM software. In PET, a common covariate is the global (whole-brain) mean signal value for each subject, included to control for differences in amount of radioactive tracer in circulation. In fMRI, the signal typically drifts slowly over time, so that the most power is in the lowest temporal frequencies. This characteristic has prompted the widespread use of
high-pass filters that remove fluctuations below a specified frequency cutoff from the data. High-pass filtering is often performed in the GLM analysis by adding covariates of no interest (e.g., low-frequency cosines). Of course, care must be taken to ensure that the fluctuations induced by the task design are not in the range of frequencies removed by the filter. Design optimization algorithms can take this into account when constructing trial sequences (T. D. Wager & Nichols, 2003). Much of the autocorrelated noise and other noise variance in fMRI may come from aliased physiological artifacts (Lund, Madsen, Sidaros, Luo, & Nichols, 2006). Thus, it is increasingly popular to measure heartbeat and respiration during scanning and to use preprocessing algorithms for removing signals related to measured physiological fluctuations from the data prior to analysis (Glover, Li, & Ress, 2000). Programs for doing this are typically available from authors of research articles, but have not yet been incorporated as standard tools in neuroimaging analysis packages.

Group Analysis

The analysis described so far has been, for fMRI datasets, an analysis of data from a single subject. However, researchers are often interested in making inferences about a population, not just about a single subject or even a set of individual subjects, which requires a group analysis. Both PET and fMRI studies nearly always involve collecting more than one image per subject, and testing for the significance of effects in a group of subjects. In fMRI, typically, separate GLM analyses are conducted on the time series data for each subject at each voxel in the brain to estimate the magnitude of activation evoked by the task. This is called a “first level” analysis. These estimates are carried forward and tested for reliability across subjects in a “second level” group analysis.
In PET, the first level analysis often consists of simple image subtractions, followed by the same type of second level analysis as for fMRI. The unweighted summary statistics approach referred to earlier consists of a simple one-sample t-test across contrast estimates for each subject. This analysis, like others discussed so far, is repeated at each voxel. It can be specified in the GLM framework, so Equations 9.1 to 9.3 hold, and independence is typically assumed across subjects so no prewhitening is needed. The one-sample t-test for overall activation corresponds to a test of the model intercept in a GLM. Additional covariates across subjects (e.g., average performance scores) can be specified and tested in simple or multiple regression. Two-sample and ANOVA designs to compare groups and related GLM variants can also be specified. Including covariates can improve statistical power for the test of overall activation, though
care must be taken: The significance of the intercept can only be assessed if all other covariates are transformed to have a mean of zero. The unweighted summary statistic approach is valid if the contrast standard error is the same across all subjects, which implies identical design matrices and residual variances. This is rarely if ever true in practice, though the cost is mostly in the statistical power of the analysis and it is still widely used. Full mixed-effects models relax those stringent assumptions by considering the standard errors within each subject as well as contrast estimates. Mixed-effects analyses are standard in FSL and FMRISTAT software (see Mixed versus Fixed Effects, earlier in this chapter, and Table 9.3). Mixed-effects analyses essentially weight subjects when calculating group statistics. The larger a subject’s standard error, the less reliable their estimate, and the less that subject should contribute to the group results. This requires estimating variance components: One component is variance related to within-subject measurement error and model misfitting ($\sigma^2_W$), and another component is variance related to true interindividual differences among subjects ($\sigma^2_B$). Accurate estimation of the relative contribution of error within- and between-subjects allows for appropriate weighting. Restricted maximum likelihood (ReML) is a popular method for estimating variance components based on the residuals. Since variance estimates and model fits ($\hat{\beta}$ values) are interdependent, iterative algorithms such as expectation-maximization (EM) are used to estimate ReML variance components.

Statistical Power and Sample Size

Statistical power depends on having either a large effect size (high contrast values) or a small standard error. The standard error in a group analysis is determined by both $\sigma^2_W$ and $\sigma^2_B$.
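The weighting idea behind mixed-effects analysis, and the way both variance components enter the group-level standard error, can be sketched with simulated contrast estimates at a single voxel. To keep the example short, the between-subject standard deviation and each subject's first-level standard error are treated as known; this is an assumption made purely for illustration, since in practice these components are estimated, for instance with ReML.

```python
import numpy as np

rng = np.random.default_rng(6)
n_sub = 20
true_effect = 1.0
sigma_b = 0.5                                        # between-subject SD (assumed known here)
se_within = rng.uniform(0.2, 2.0, size=n_sub)        # per-subject first-level SEs (assumed known)

subject_effects = true_effect + sigma_b * rng.normal(size=n_sub)
contrast_est = subject_effects + se_within * rng.normal(size=n_sub)

# Unweighted summary statistic approach: one-sample t-test on the contrast estimates
mean_unweighted = contrast_est.mean()
t_unweighted = mean_unweighted / (contrast_est.std(ddof=1) / np.sqrt(n_sub))

# Weighted alternative: each subject's total variance is sigma_B^2 + sigma_W^2,
# and less reliable subjects receive smaller weights
weights = 1.0 / (sigma_b ** 2 + se_within ** 2)
mean_weighted = np.sum(weights * contrast_est) / np.sum(weights)
se_weighted = np.sqrt(1.0 / np.sum(weights))
```

When all subjects have the same within-subject standard error the weights are equal and the two group means coincide, which is the condition under which the unweighted approach loses nothing.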
At the group level, power can be increased by increasing the sample size, and $\sigma^2_B$ can be reduced by more accurate normalization or more informed ROI selection and by increased control of the strategies used and of individual psychological responses to the task. $\sigma^2_W$ can be reduced by improving modeling procedures and reducing acquisition-related scanner noise and physiological noise. A key question when designing a group study is determining an adequate sample size. The answer to this question depends on the effect size in the group, the amount of scanner noise and signal optimization, and it will be different for each task and each brain voxel (Desmond & Glover, 2002; Zarahn & Slifstein, 2001). Power analysis is difficult in fMRI because power depends on so many factors relating to psychology, task design and analysis, and hardware; however, by referring to standard effect sizes, you can obtain estimates of what sample sizes are needed in a group analysis.
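A rough power calculation of this kind can be sketched as follows. The example assumes a one-sided one-sample t-test, an effect size expressed as a standardized mean (mean divided by standard deviation), and a Bonferroni correction over a hypothetical number of tests; it relies on SciPy's central and noncentral t distributions. The function names and the choice of a one-sided test are assumptions, so the numbers it produces are indicative only.

```python
import numpy as np
from scipy import stats

def one_sample_power(d, n, alpha=0.05, n_tests=1):
    """Power of a one-sided one-sample t-test at a Bonferroni-corrected threshold."""
    df = n - 1
    t_crit = stats.t.isf(alpha / n_tests, df)      # critical t for the corrected alpha
    noncentrality = d * np.sqrt(n)                 # noncentrality for standardized effect d
    return float(stats.nct.sf(t_crit, df, noncentrality))

def min_sample_size(d, target=0.8, max_n=500, **kwargs):
    """Smallest n (searched up to max_n) whose power reaches the target, or None."""
    for n in range(3, max_n + 1):
        if one_sample_power(d, n, **kwargs) >= target:
            return n
    return None

# Hypothetical whole-brain search with 200,000 tests and a standardized effect of 1
sample_sizes = (10, 20, 40, 80)
power_curve = [one_sample_power(1.0, n, n_tests=200_000) for n in sample_sizes]
n_needed = min_sample_size(1.0, n_tests=200_000)
```

Changing `n_tests` shows how strongly the correction for multiple comparisons drives the required sample size, which is one reason less conservative thresholding methods can reduce the number of subjects needed.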
Figure 9.13 shows plots of power (y-axes) as a function of sample size (x-axes) for three effect sizes in two kinds of analysis. The effect sizes are Cohen’s d values, defined as the mean activation magnitude divided by its standard deviation, for a simple one-sample t-test in group analysis. In the behavioral sciences, d = 0.2, 0.5, and 0.8 are conventionally considered small, medium, and large effect sizes, respectively. Most activations reported in neuroimaging have effect sizes that are substantially larger (d = 2 or more). However, this is partly because voxel-wise mapping capitalizes on chance due to selection bias: Voxels in which chance favors the evidence for activation have large effect sizes and tend to be reported. Observed effect sizes in published reports are usually overestimated because of selection bias, and the problem is exacerbated when many tests are performed. We show power curves here for effect sizes of 0.5, 1, and 2. Figure 9.13A shows results for a whole-brain search with 200,000 voxels, a typical number depending on acquisition and analysis choices, and FWE correction at p < .05 using the Bonferroni method. To achieve 80% power with a reasonable sample size, the effect size must be larger than 0.5, and around 40 subjects are required for d = 1 and 18 subjects for d = 2. Figure 9.13B shows the same results using nonparametric permutation testing, which takes into account the spatial smoothness in the data. We used nonparametric thresholds from 10 analyses from various studies reported in Nichols and Hayasaka (2003) to estimate the effective number of independent
comparisons and thus power. With nonparametric analysis, around 25 subjects for d = 1 and 11 subjects for d = 2 provide 80% power. Design optimization procedures can be employed before data are ever collected to increase the effect size. For a fixed effect size and sample size, power depends on the within-subject standard error, $\mathrm{se}(C^{T}\beta)$, which depends on both the design matrix, X, and the residual standard deviation, σ (Equation 9.5). The latter can be reduced by optimizing data collection (e.g., pulse sequences and hardware) and, in the study design, by ensuring that subjects are engaged in the tasks. Error related to X can be minimized during experimental design by carefully choosing the number, sequence, and spacing of events to minimize the design-related component of the standard error, $C^{T}(X^{T}X)^{-1}C$. Effective minimization increases predictor variance and reduces predictor covariance (multicollinearity), and is particularly critical in event-related fMRI. It is possible to build an event-related fMRI design in which even large neuronal effects cannot be detected. For this reason, computer-aided design optimization can be very useful (Buracas & Boynton, 2002; T. D. Wager & Nichols, 2003). Finally, both theory and simulations show that there is a substantial trade-off in power between detecting activation differences between conditions using an assumed HRF shape and estimating the shape of evoked activations with a more flexible model (Liu, Frank, Wong, & Buxton,
[Figure 9.13: two panels plotting power at p < .05 corrected (y-axis, 0 to 1) against sample size (x-axis, 0 to 80) for d = 0.5, 1, and 2. Panel A: power with Bonferroni correction. Panel B: power with nonparametric thresholding.]

Figure 9.13 A: Power curves, calculated for effect sizes of 0.5, 1, and 2, for a whole-brain search with 200,000 voxels and FWE correction at p < .05 using the Bonferroni method. Note: The number of voxels would be typical of a whole-brain search through gray and white matter with a 2 × 2 × 2 mm sampling resolution. B: The same power curves calculated based on the results of nonparametric permutation testing, which takes into account the spatial smoothness in the data. Based on the smoothness reported in Nichols and Hayasaka (2003) for 10 different statistic maps, we calculated an average of approximately 750 effective independent comparisons. Correction across this number of comparisons was used in calculating power.
2001). This trade-off is shown in Figure 9.14, in which shape-estimation power is shown on the x-axis and contrast-detection power is shown on the y-axis. The points in the figure represent designs with different sequences and timing of events. Blocked designs have the highest [A - B] contrast detection power when the canonical HRF is used, but provide little information about the shape of the HRF. M-sequences, or sequences that are orthogonal to themselves shifted in time, provide optimal shape estimation power (the nonoptimality in the figure arises because the m-sequences are truncated and therefore not perfect), but low detection power (Buracas & Boynton, 2002). Random event-related designs fall somewhere in between. As the figure shows, designs optimized with a genetic algorithm (T. D. Wager & Nichols, 2003) can produce substantially better results than random designs on both measures.

Bayesian Inference

Bayesian methods have received a great deal of attention in the fMRI literature. These inferential methods are now key
components in several major fMRI analysis software packages (e.g., SPM and FSL). A full treatment of Bayesian methods is beyond the scope of this chapter, but an excellent overview can be found in Gelman et al. (2004). A key difference from the frequentist approach discussed previously (which subsumes classical inference in the GLM and its extensions) is that Bayesian analysis combines evidence from the data with priors (beliefs about the data specified as probabilities prior to data collection) to yield posterior probability values. This can be a big advantage in that estimates from data (e.g., of HRF shapes) can be easily regularized based on known information from other studies. Such prior constraints are also possible in frequentist analyses, though they require modifications and special procedures; lasso, ridge regression, and robust regression are examples. If you do not want to impose strong prior beliefs, it is possible to use noninformative priors, as implemented in the Bayesian approach in FSL software (Woolrich, Behrens, Beckmann, et al., 2004). For the single-level model, this leads to parameter estimates that are equivalent to those obtained using classical inference. Another way to choose prior beliefs is to estimate them from the data. This is the “empirical Bayes” approach. It is a hybrid between classical and Bayesian inference that can provide some regularization without biasing the results of hypothesis tests, and it is used in SPM software (Friston, Glaser, et al., 2002; Friston, Penny, et al., 2002).
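The effect of a prior on a single estimate can be illustrated with the textbook normal-normal case, sketched below. This is only a toy example under the assumption of a normal likelihood and a normal prior with known variances; it is not the algorithm implemented in SPM or FSL, and the function name `normal_posterior` is hypothetical.

```python
def normal_posterior(estimate, estimate_var, prior_mean, prior_var):
    """Posterior mean and variance for a normal likelihood and a normal prior."""
    data_precision = 1.0 / estimate_var
    prior_precision = 1.0 / prior_var
    post_var = 1.0 / (data_precision + prior_precision)
    post_mean = post_var * (data_precision * estimate + prior_precision * prior_mean)
    return post_mean, post_var

# An informative prior centered on zero pulls a noisy estimate toward zero.
shrunk_mean, shrunk_var = normal_posterior(2.0, 1.0, 0.0, 1.0)

# A very diffuse prior leaves the classical estimate essentially unchanged.
flat_mean, flat_var = normal_posterior(2.0, 1.0, 0.0, 1e12)
```

The posterior mean is a precision-weighted average of the data estimate and the prior mean, which is why a nearly flat prior reproduces the classical estimate while an informative prior regularizes it.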
Assessing Brain Connectivity
Figure 9.14 The trade-off between contrast detection (y-axis) and hemodynamic response function (HRF) shape estimation power (x-axis), and the performance of different types of designs on each. Note: Power on each axis is expressed here in terms of z-scores in a simulated group analysis (n = 10, effect sizes estimated from visual cortex data in Wager et al., 2005). The double-circle shows a block design with roughly optimal task alternation frequency (16 s/task). The dark circles show power for a number of randomized event-related designs with roughly optimal parameters under linear modeling assumptions (randomized sequences with a stimulus every 2 s). The dark squares show truncated m-sequence designs with the same parameters as the randomized design. The open circles show results for genetic algorithm (GA) optimized designs with the same parameters. Each circle represents the results of one run of the optimization routine with different user-specified detection/shape estimation trade-off settings.
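The detection-power side of the comparison in Figure 9.14 rests on the design-related variance term $C^{T}(X^{T}X)^{-1}C$, which can be evaluated for any candidate design before data are collected. The sketch below does so for a block design (16 s per task) and a randomized event-related design with one event every 2 s. The gamma-shaped response in `gamma_hrf`, the function names, and the run length are illustrative assumptions, not the settings used to generate the figure.

```python
import numpy as np

def gamma_hrf(t):
    """Gamma-shaped response peaking near 5 s; an illustrative stand-in for a canonical HRF."""
    return t**5 * np.exp(-t) / 120.0

def contrast_design_variance(is_a, tr=2.0):
    """Return c^T (X^T X)^{-1} c for the A - B contrast.

    is_a : boolean array with one entry per scan; True marks an A event and
           False marks a B event (one event per TR).
    """
    is_a = np.asarray(is_a, dtype=bool)
    n_scans = is_a.size
    kernel = gamma_hrf(np.arange(0.0, 30.0, tr))
    reg_a = np.convolve(is_a.astype(float), kernel)[:n_scans]
    reg_b = np.convolve((~is_a).astype(float), kernel)[:n_scans]
    X = np.column_stack([reg_a, reg_b, np.ones(n_scans)])  # two regressors plus intercept
    c = np.array([1.0, -1.0, 0.0])
    return float(c @ np.linalg.inv(X.T @ X) @ c)

n_scans = 160
# Block design: 8 scans (16 s) of A alternating with 8 scans of B.
block = (np.arange(n_scans) // 8) % 2 == 0
# Randomized event-related design with the same number of A and B events.
rng = np.random.default_rng(0)
random_order = rng.permutation(block)

var_block = contrast_design_variance(block)
var_random = contrast_design_variance(random_order)
```

A smaller value means a smaller standard error for the [A - B] contrast at a given noise level, so comparing `var_block` with `var_random` gives a quick check of which sequence favors contrast detection.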
Human brain mapping has been primarily used to provide maps that show which regions of the brain are activated by specific tasks. There has been an increased interest in augmenting this type of analysis with connectivity studies that describe how various brain regions interact and how these interactions depend on experimental conditions. It is common practice in the analysis of neuroimaging data to make the distinction between functional and effective connectivity (Friston, 1994). Functional connectivity is defined as the undirected association between two or more fMRI time series, while effective connectivity is the directed influence of one brain region on the physiological activity recorded in other brain regions; it implies both causality and directness. It implies causality because the models used to assess effective connectivity are usually directional, and directness in the sense that effective connectivity measures attempt to partial out indirect influences from other regions. Functional connectivity is a statement about observed associations among regions and other performance and physiological variables such as the correlation between time
series in two regions (bivariate connectivity). Simple functional connectivity analyses usually compare correlations between ROIs, sometimes in a task-dependent fashion, or between a “seed” region of interest and voxels throughout the brain. Multivariate analysis methods are also used to reveal networks of multiple interconnected regions. Popular methods include Principal Components Analysis (PCA; Andersen & Avison, 1999), Partial Least Squares (PLS; McIntosh, Bookstein, Haxby, & Grady, 1996) and Independent Components Analysis (ICA; Calhoun, Adali, Pearlson, & Pekar, 2001a,b; McKeown & Makeig, 1998). Connectivity between two or more regions may result from direct influences (functional links between regions) or indirect effects due to common input from a third variable. None of these methods can address issues of causality or the common influences of other variables. Functional connectivity methods can be applied at different levels of analysis, with different interpretations at each level (see Figure 9.15). Connectivity across time series data can reveal networks that are dynamically coactivated over time (either intrinsically, regardless of task state, or in a task-dependent fashion), and is closest to the concept of communication among regions, though it does
not conclusively demonstrate that. Connectivity across single-trial response estimates (Rissman et al., 2004) can identify coherent networks of task-related activations. Whereas these levels are only accessible to fMRI and EEG/MEG, which provide relatively rich time series data, other levels of analysis may be examined in PET studies. Connectivity across subjects can reveal patterns of coherent individual differences, which may result from communication among regions but also from differences in strategy use or other genetically determined or learned differences among individuals. Finally, connectivity across studies can reveal tendencies for studies to coactivate within sets of regions, which may be influenced by any of the factors previously mentioned, and also differences among tasks or other study-level variables. An example is the finding that studies in which posttraumatic stress disorder (PTSD) patients showed increased amygdala activity tended to be the same studies in which patients showed decreased activation of the medial frontal cortex (Etkin & Wager, 2007). Regardless of the level of analysis, functional connectivity analyses can be useful for understanding that brain activations are part of coherent patterns that are separate, independent effects of task manipulations.
Figure 9.15 Functional connectivity methods can be applied at different levels of analysis, with different interpretations at each level. Note: (Left) Connectivity across time series data can reveal networks that are dynamically coactivated over time. The solid, dotted, and dashed lines indicate activation time series from three different subjects on the left, and average activation magnitudes for the same subjects (shown by hemodynamic response function [HRF] curves) at the right. Alternatively, measures of single-trial activation amplitude (black dots) can be extracted and used to estimate connectivity, which avoids some ambiguity with respect to the source of connectivity (task-related versus spontaneous). However, artifactual influences can make interpretation of both types of connectivity difficult (see list at the bottom). Distributed artifacts tend to create positive covariance, whereas neurovascular coupling differences, and the resulting differences in HRF shapes between regions, tend to weaken covariance estimates. (Right) At the subject level, you can correlate magnitudes within condition or differences across conditions. This analysis is conducted on individual differences, rather than on time series data, and results may have different interpretations than time series connectivity data. Again, special care needs to be taken to limit the influence of artifacts, which are likely to be largely related to factors that create individual differences in model fits across the brain (see list).

Lists shown in the figure. Time series/trial levels: artifacts (gradient drift, shot artifacts [spikes], physiological effects, movement-related artifacts, arousal fluctuations) and neurovascular coupling differences. Subject level, individual differences in: artifacts (hematocrit, CO2, vascular response [hemodynamic model fit], gray-matter density, alertness) and brain system recruitment (strategy, performance, genetics).
Generally, activation is only informative if it is restricted to specific brain regions (e.g., activation of the insula means little if every other brain region is activated to the same degree). Likewise, demonstrating that connectivity is greater within a set of regions than among other regions (e.g., for the cognitive control network of Cole & Schneider, 2007, or for demonstrating two or more separable sets of interconnected regions such as the multiple separate networks of coherent opioid release reported by Wager, Scott, & Zubieta, 2007) can provide valuable information about how brain regions function together. Demonstrating specificity of functional connectivity to a particular task state, as the psychophysiological interaction (PPI)/moderation analysis to be described later is designed to do, can be informative about how functional connectivity relates to psychological states. Reporting reciprocal activity (negative correlations) between ventromedial PFC and amygdala may be of limited usefulness if such correlations can be found in any task state; in that case, they may be a general feature of BOLD physiology or vasculature rather than an interesting instance of communication among brain regions.

In contrast, effective connectivity analysis is model-dependent. Typically, a small set of regions and a proposed set of connections are specified a priori, and tests of fit are used to compare a small number of alternative models and assess the statistical significance of individual connections. Because connections may be specified directionally (with hypothesized causal influences of one area on another), the model implies causal relationships. Because there are many possible models, the choice of regions and connections must be anatomically motivated. Most effective connectivity analyses depend on two models: a neuroanatomical model that describes which areas are connected, and a mathematical model that describes how areas are connected.
Common methods include Structural Equation Modeling (SEM; McIntosh & Gonzalez-Lima, 1994) and Dynamic Causal Modeling (DCM; Friston, Harrison, & Penny, 2003). While effective connectivity methods have become increasingly popular, it is important to keep in mind that the conclusions about direct influences and causality obtained using these models are only as good as the specified models. Any misspecification of the underlying model will almost certainly lead to erroneous conclusions. In particular, the exclusion of important lurking variables (brain regions involved in the network but not included in the model) can completely change the fit of the model and thereby affect both the direction and strength of the connections. Great care always needs to be taken when interpreting the results of these methods. The distinction between functional and effective connectivity is not entirely clear (Horwitz, 2003). If the discriminating
features are (a) a directional model in which causal influences are specified; and (b) the willingness to make claims about direct versus indirect connections, then many analyses, including multiple regression, might count as effective connectivity. Indeed, the PPI analysis referred to is typically described as an effective connectivity model, but it tests an interaction effect using linear regression (whether the slope of the linear association between two variables depends on the level of a third, moderating variable). The three-variable PPI model is actually a simple SEM, though the criterion of assessing direct effects is not met, since no common indirect influences are accounted for. In the end, the difference between this model and more complicated SEMs is one of scale, and direct effects in any SEM can only be properly assessed if all relevant “third variables” have been included in the model and their connections modeled appropriately. While many researchers use both SEM and DCM with the goal of ascribing causality between different brain regions, the tests performed in both techniques are based on model fit rather than on the causality of the effect. Similarly, Granger causality (Roebroeck, Formisano, & Goebel, 2005) is another approach that is typically considered to test effective connectivity, though neither causal influences nor direct versus indirect effects are tested within the basic model framework. Causality is tested strictly in the sense of temporal relationships, rather than in terms of whether activity in one brain region is necessary or sufficient for activity in another. In the end, it is not the label of “functional” or “effective” that is important, but the specific assumptions and the robustness and validity of inference afforded by each method. When performing connectivity and correlation studies, it is tempting to make statements about causal links between different brain regions.
The idea of causality is a very deep and important philosophical issue (Pearl, 2000; Rubin, 1974). A cavalier attitude frequently is taken in attributing causal effects, and the differentiation between explanation and causation is often blurred. Properly randomized experimental designs permit causal inferences of task manipulations on brain activity. In neuroimaging and EEG/MEG studies, all the brain variables are observed, and none are manipulated. We do not recommend making strong conclusions about causality and direct influences among brain regions using these methods because it is difficult to verify the validity of such conclusions. The combination of neuroimaging and TMS or related forms of brain stimulation (Bohning et al., 1997) may provide more reliable causal inferences about the effects of activating one brain region on another. By stimulating the brain, experimental manipulation of one brain area can be achieved and its causal effects on other brain regions thus examined. The problem
remains of assessing which effects are direct as opposed to mediated by other intervening regions.

Bivariate Connectivity

Functional connectivity is a statement about the observed associations among regions and other performance and physiological variables. The simplest approach to functional connectivity is to calculate the cross-correlation between time series from two separate brain regions. The results can be used to determine whether the changes in activity in these regions are related to each other in a linear manner. This idea is expanded in seed analysis (Cordes et al., 2000; Della-Maggiore et al., 2000), where the cross-correlation between the time course from a predetermined region or cluster (the seed region) and all other regions of the brain is calculated. This allows researchers to search the brain for other regions that are positively (or negatively) correlated with the activity pattern found in the seed region. In addition to standard statistical assumptions, time series connectivity typically assumes that the connectivity is instantaneous, meaning that the time constants for neuronal and vascular effects are the same for each pair of regions, and the impulse response functions are thus the same. This assumption is often likely to be violated, and several approaches have been taken to account for variability in the coupling between neuronal activity and the fMRI signal, such as multivariate autoregressive modeling (Harrison, Penny, & Friston, 2003; Kim, Zhu, Chang, Bentler, & Ernst, 2007). Granger causality, a kind of autoregressive model discussed in more detail later, is a promising approach toward relaxing this assumption. Whatever method is used, functional connectivity is meaningful only to the degree that it is not driven by artifacts related to image acquisition and physiological noise; some artifactual influences are listed in Figure 9.15.
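A seed analysis of the kind described above can be sketched in a few lines. The example assumes the data are already arranged as a time points × voxels array and uses zero-lag Pearson correlation only; the function name `seed_correlation_map` and the synthetic data are hypothetical, and no correction for artifacts or autocorrelation is attempted.

```python
import numpy as np

def seed_correlation_map(data, seed_ts):
    """Zero-lag Pearson correlation of a seed time series with each voxel's time series.

    data    : array of shape (time points, voxels)
    seed_ts : array of shape (time points,)
    """
    data = np.asarray(data, dtype=float)
    seed = np.asarray(seed_ts, dtype=float)
    data_z = (data - data.mean(axis=0)) / data.std(axis=0)
    seed_z = (seed - seed.mean()) / seed.std()
    return data_z.T @ seed_z / seed.size

# Synthetic example: 200 time points, 3 voxels.
rng = np.random.default_rng(1)
seed = rng.standard_normal(200)
voxels = np.column_stack([
    seed + 0.5 * rng.standard_normal(200),    # positively coupled with the seed
    -seed + 0.5 * rng.standard_normal(200),   # negatively coupled with the seed
    rng.standard_normal(200),                 # unrelated noise
])
r_map = seed_correlation_map(voxels, seed)
```

Because the correlation is computed at zero lag, this sketch inherits the assumption of instantaneous connectivity discussed above.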
Another approach that helps minimize issues of interregion neurovascular coupling differences and artifacts (but does not eliminate them) is the beta series approach (Rissman et al., 2004). In this technique, correlations are not estimated directly from the time series data. Instead, you obtain trial-by-trial estimates of event-related activity within the standard GLM framework. These trial-level activation parameter estimates (called beta values) are correlated across regions to obtain a measure of functional connectivity during each of the individual task components.

Component Analysis: Principal Components Analysis, Independent Components Analysis, and Partial Least Squares

Multivariate methods model brain imaging data by decomposing a large dataset (e.g., 1,000 time points × 100,000
voxels × 20 subjects) into a smaller set of components and a series of weights. The components may be canonical patterns of activity across time and the weights their distribution across brain space, or vice versa. Principal components analysis (PCA), independent components analysis (ICA), and partial least squares (PLS) are variations on this theme. These and related multivariate methods, including canonical variates analysis (CVA), factor analysis, ordinal trends analysis (Habeck et al., 2005), and the multivariate linear model (MLM; Kherif et al., 2002), are becoming an increasingly important part of the neuroimaging analyst’s toolbox. They all share the common core idea of decomposing the data into simpler components that maximize the variability explained by the model. The approaches differ in the criteria used to select components, and in whether the experimental design is included as part of the data to be modeled (inclusion is a defining feature of PLS). Each technique described in this section involves decomposing a data matrix, Y, into a set of spatial and temporal components. Let us define Y to be a t × v matrix, where t is the number of time points and v the number of voxels. Each column of Y is therefore a time series corresponding to one voxel in the brain, and each row is the collection of voxels that make up an image at a specific time point. Principal Components Analysis (PCA) decomposes the data matrix, Y, by finding linear combinations of time series, each of which makes up a column in a matrix U (also of dimension t × v), such that each column of U is uncorrelated with every other column of U. The columns of U, called components, are arranged in order of variance explained: The first component explains the most variance possible in Y, the second component explains the maximal amount of remaining variance, and so forth.
Together with their associated spatial maps and variances (to be described), these v components perfectly reproduce the data, but most of the total variance is usually captured in just the first few components of U. Thus, the first components can be considered a compressed representation of the data. Because each component is a weighted sum across time series of different voxels, another matrix V (of dimension voxel × component [v × v]) contains columns of voxel weights to create each component in U. The first column of V shows how to weight each of the v voxel time series to capture the most variance in Y and represents the spatial distribution of the first component. Thus, the columns of U are the temporal components (the “canonical” time series) and those of V are the spatial components (the maps across brain voxels) of these time series. In neuroimaging, the components are usually calculated through singular value decomposition (SVD) of the centered (mean-zero) data. SVD is a numerical technique that
decomposes a data matrix, Y, into three simpler matrices (zeros make up at least half of the new matrices), while still representing the original data. In the case of neuroimaging data, these matrices can be interpreted as temporal components U and spatial components V such that:

$$Y = U S V^{T} \tag{9.6}$$
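Equation 9.6 can be checked numerically on a small example. The sketch below uses NumPy's `np.linalg.svd` on a random, column-centered matrix; the toy dimensions and variable names are arbitrary choices for illustration, not values from any dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
t, v = 50, 8                        # time points, voxels (toy sizes)
Y = rng.standard_normal((t, v))
Y = Y - Y.mean(axis=0)              # center each voxel's time series

# Economy-size SVD: U is t x v, s holds the singular values, Vt is V transposed.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)

reconstruction = U @ np.diag(s) @ Vt          # equals Y up to rounding error
scores = U * s                                # component scores, equal to Y @ V
eigenvalues = s**2 / (t - 1)                  # variance of each component's scores
explained = eigenvalues / eigenvalues.sum()   # proportion of total variance
```

Inspecting `explained` shows how much of the total variance each successive component captures, which is the basis for treating the first few components as a compressed representation of the data.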
With centered (mean-zero) data, S is a diagonal matrix (only the diagonal elements are nonzero) whose entries are the “singular values”; their squares are the sums of squared deviations explained by each component. These are related to the eigenvalues such that $\lambda = S^{2}/(t-1)$. The columns of V are the eigenvectors, as in the eigendecomposition described earlier, and US are the component scores (components scaled by the amount of variability they explain), equal to YV in the eigendecomposition. The power of this technique lies in the fact that the eigenvectors are orthogonal to each other. By decomposing the data into its eigenvectors and eigenvalues, we obtain a set of components (whether temporal or spatial) that are uncorrelated with each other. Furthermore, we also obtain coefficients of how heavily those components are represented in the original data. A thorough treatment of eigenvectors, eigenvalues, and SVD is provided by Strang (1980).

Once you grasp the central idea of data decomposition into spatial and temporal components, you can understand many other techniques, such as ICA, as variations on this theme. Rather than maximizing the variance explained by each additional, orthogonal component, ICA components are chosen to maximize the statistical independence of the components in a more general sense. The components are not required to be orthogonal; rather, the constraint is that they be independent. The distribution of one component cannot be predicted from the values of the other, or more formally, the joint probability P(A,B) of components A and B is equal to P(A)P(B). In the Infomax variant of ICA, mutual information between components (a general measure of dependence that does not require the relationships between components to be linear or monotonic) is minimized (McKeown, 1998). ICA assumes that the data, Y, are a weighted sum of source signals (time series) contained in the source matrix X.
The data Y is a linear mixture of these source components described by the weighting or mixing matrix of spatial weights M:

$$Y = M X \tag{9.7}$$
Because M and X are both unknown, there is no algebraic solution, so iterative search algorithms are used to estimate both M and X. An alternative decomposition is to transpose the data matrix and treat the spatial components as
sources and the temporal components as mixing weights. (For more details, see McKeown & Sejnowski, 1998; Petersson, Nichols, Poline, & Holmes, 1999). At first glance, it appears close to impossible to solve Equation 9.7 for both M and X simultaneously. However, ICA makes crucial assumptions that allow you to obtain a solution. The main assumptions are that the data set consists of p statistically independent components, of which at most one is Gaussian. The independence assumption entails that the activations do not have a systematic overlap in time or space, while the non-Gaussianity assumption is required for the problem to be well defined. In addition, it is assumed that the mixing matrix, M, is both square and invertible, which implies that the independent components can be expressed as a linear combination of the data matrix.

Both PCA and ICA reduce the data to a simpler (lower-dimension than that of the v voxels) space by capturing the most prominent variations across the set of voxels. The components may reflect signals of interest or they may alternatively be dominated by artifacts, and it is up to the user to determine which are of interest (e.g., task-related). Both ICA and PCA assume all variability results from signal, as noise is not included in the model formulation. One issue involved with interpreting the results of an ICA analysis is that the sign of the independent components cannot be determined. In addition, the order of importance of the independent components cannot be determined. Therefore, it is necessary to sift through all the components to search for ones that are task-related or otherwise of interest. There is also no guarantee that a specific number of components can be used to explain most of the variation, as is the case in PCA. A popular variant in the social sciences literature is factor analysis, which additionally fits a parameter for the noise variance at each voxel.
A disadvantage of factor analysis is that the solution is rotationally indeterminate, and thus a number of combinations of spatial and temporal components can explain the same variability in the data. While neither ICA nor PCA is rotationally indeterminate, there is some question as to what the “right” rotation is (in PCA it is determined by the amount of variance explained, which is not an index of meaningfulness since artifacts can create much variance). Interpreting thresholded component maps, as is commonly done, depends critically on establishing a rotation that is meaningful and reliable across studies.

Multisubject Extensions

As described so far, these techniques model only a single subject’s data. In a group study, there is the additional complexity of making population inferences. It is not correct to treat all the data as coming from one supersubject and decompose the group data matrix, for the same reasons that fixed effects
analyses in the GLM are not appropriate. One approach is to decompose the group matrix, and subsequently “back-reconstruct” or estimate spatial weights for each subject for a component of interest (Calhoun, Adali, Pearlson, & Pekar, 2001a). The spatial weights at each voxel across subjects are treated as random variables, and a one-sample t-test is conducted to test whether that voxel loaded significantly on that component in the group. This approach is implemented in the Group ICA of fMRI Toolbox (GIFT; Table 9.3). Another approach, called tensor ICA, is to use a three-way data decomposition with the group data to estimate temporal components and weights for each subject and each voxel (Beckmann & Smith, 2005). The subject weights at each voxel are then tested for significance. This approach is similar to the related PCA-based techniques of PARAFAC (Bro, 1997) and INDSCAL/ALSCAL (Young, Takane, & Lewyckyj, 1978). It is implemented in the ICA tool (called MELODIC) in FSL software (Table 9.3).

Structural Equation Modeling

Structural equation modeling (SEM) has a rich history in the social sciences literature (Bollen, 1989). It was first applied to imaging data by McIntosh and Gonzalez-Lima (1994). In SEM, the emphasis lies on explaining the variance-covariance structure of the data. While SEM allows for the inclusion of latent variables (which is one of its major selling points in the social sciences), this option is not typically used by the neuroimaging community. An SEM without latent variables is typically called path analysis, but in this chapter we refer to the methodology by the name structural equation modeling, as this is the common practice in the neuroimaging literature. Structural equation models comprise a set of a priori determined regions and directed connections between these regions. A causal relationship is attributed a priori to the connections, where an arrow from A to B implies that A causes B.
Further, a path coefficient is defined for each link; it represents the expected change in activity of one region given a unit change in the region influencing it. The path coefficient indicates the average influence across the time interval measured. Algebraically, we can express an SEM as

$$Y = MY + \varepsilon \tag{9.8}$$

where $Y$ is the data matrix, $M$ is a matrix of coefficients that reflect the linear relationships between regions, and $\varepsilon$ is independent and identically distributed normal noise. Typically, this model is rewritten as

$$Y = (I - M)^{-1}\varepsilon \tag{9.9}$$
where $I$ represents the identity matrix. The unknown coefficients in $M$ are estimated from the empirical covariance matrix of $Y$. As with ICA, solving this model is not straightforward, and users typically resort to iterative techniques. The covariance of the data represents how the activities in two or more regions are related. In SEM, the parameters of the model are adjusted to minimize the difference between the observed covariance matrix and the one implied by the structure of the model.

All inferences about the path coefficients rest on nested or stacked models. A hypothesis test on a single path coefficient may be performed by comparing the full model, with all path coefficients estimated, with a nested model in which the coefficient of interest is constrained to be zero.² The two models are compared using a likelihood ratio test (LRT), a statistical test of the relative goodness of fit of two models, to test whether a nonzero coefficient results in a significantly better model fit, and thus whether the coefficient is reliably different from zero. The LRT is valid only if it is used to compare nested models; that is, the more complex model must differ from the simpler model only by the addition of one or more parameters.

A similar approach can be taken when making inferences about changes in connectivity between different experimental conditions. This is done by first partitioning the data according to the different experimental conditions. Next, two models are specified. In the null model, path coefficients are constrained to be equal across conditions, and in the alternative model, coefficients of interest are allowed to vary. The LRT is used to test whether there is any significant difference between the models. If a significant difference exists, we reject the hypothesis that the path coefficients are equal in both conditions, and a condition-dependent effect is declared.

SEM makes some assumptions in setting up the model formulation. The data are assumed to be normally distributed and independent from sample to sample. An important consequence of these assumptions is that SEM discounts temporal information. Consequently, a data set whose time points have been permuted yields the same path coefficients as the original data, which is a weakness of the method. The assumption of independence is violated in the analysis of a single subject, whose successive samples are temporally correlated. At the level of individual differences, this assumption is more reasonable.
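The nested-model comparison described above can be illustrated with a small simulation. The following Python code is a minimal sketch under stated assumptions, not the procedure used by any particular SEM package: the three regions, the coefficient values, and the function names (`fit_path_model`, `lrt_pvalue`) are hypothetical, and the model is assumed to be recursive (no feedback loops) with uncorrelated errors. Under those assumptions, per-equation least squares gives the maximum likelihood estimates, so no iterative search is needed here.

```python
import math
import numpy as np


def fit_path_model(Y, mask):
    """Fit a recursive path model by per-equation least squares.

    Y    : array (regions, samples), one row per region.
    mask : boolean array (regions, regions); mask[i, j] is True when the
           model contains a directed path from region j to region i.
    Returns the coefficient matrix M and the ML discrepancy F between the
    observed covariance S and the covariance implied by the model.
    """
    p, n = Y.shape
    Yc = Y - Y.mean(axis=1, keepdims=True)
    S = Yc @ Yc.T / n                      # observed covariance
    M = np.zeros((p, p))
    psi = np.zeros(p)                      # residual (noise) variances
    for i in range(p):
        parents = np.flatnonzero(mask[i])
        resid = Yc[i]
        if parents.size:
            X = Yc[parents].T
            coef, *_ = np.linalg.lstsq(X, Yc[i], rcond=None)
            M[i, parents] = coef
            resid = Yc[i] - X @ coef
        psi[i] = resid @ resid / n
    A = np.linalg.inv(np.eye(p) - M)       # from Y = (I - M)^{-1} eps
    Sigma = A @ np.diag(psi) @ A.T         # model-implied covariance
    F = (np.linalg.slogdet(Sigma)[1] + np.trace(S @ np.linalg.inv(Sigma))
         - np.linalg.slogdet(S)[1] - p)
    return M, F


def lrt_pvalue(F_null, F_full, n):
    """LRT for one constrained coefficient (chi-square, 1 degree of freedom)."""
    stat = max(n * (F_null - F_full), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Simulated example (hypothetical values): A -> B -> C plus a direct path A -> C.
rng = np.random.default_rng(0)
M_true = np.array([[0.0, 0.0, 0.0],
                   [0.8, 0.0, 0.0],
                   [0.5, 0.6, 0.0]])
n = 400
eps = rng.standard_normal((3, n))
Y = np.linalg.solve(np.eye(3) - M_true, eps)   # Equation 9.9

full = M_true != 0                 # all three paths estimated
null = full.copy()
null[2, 0] = False                 # direct path A -> C constrained to zero

M_full, F_full = fit_path_model(Y, full)
M_null, F_null = fit_path_model(Y, null)
stat, p_value = lrt_pvalue(F_null, F_full, n)

# Shuffling the time points leaves the covariance, and hence the fit, unchanged.
M_perm, _ = fit_path_model(Y[:, rng.permutation(n)], full)
print(f"A->C estimate {M_full[2, 0]:.2f}, LRT {stat:.1f}, p = {p_value:.2g}")
print("same coefficients after permutation:", np.allclose(M_perm, M_full))
```

The last two lines of the example illustrate the weakness noted in the text: because the fit depends on the data only through the covariance matrix, reordering the samples in time does not change the estimated path coefficients.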
² Or another test value of interest.
Data Analysis: Implementation
Dynamic Causal Modeling

The measurements used in each of the connectivity approaches described so far are hemodynamic, and this limits the scope of the interpretation that can be made at the neuronal level. Dynamic causal modeling (DCM; Friston et al., 2003) is an attempt to move the connectivity analysis from the hemodynamic to the neuronal level. DCM uses standard linear systems analysis techniques, namely state-space design (Franklin, Workman, & Powell, 1997), and treats the brain as a deterministic nonlinear dynamic system that is subject to inputs and produces outputs. It makes inferences about the coupling among brain areas and how the coupling is influenced by changes in experimental context. DCM models interactions at the neuronal rather than the hemodynamic level and is therefore more biologically accurate than many other models. However, the hemodynamic properties of the system must also be taken into account, as they can confound the measurements (e.g., a vascular delay could be interpreted as a neuronal delay).

DCM is based on a neuronal model of interacting cortical regions, supplemented with a forward model describing how neuronal activity is transformed into the measured hemodynamic response. Effective connectivity is parameterized in terms of the coupling among unobserved neuronal activity in different regions. We can estimate these parameters by perturbing the system and measuring the response. Experimental inputs cause changes in effective connectivity at the neuronal level that in turn cause changes in the observed hemodynamics. DCM uses a bilinear model for the neuronal level and an extended balloon model (Buxton, Wong, & Frank, 1998) for the hemodynamic level. In a DCM, the user specifies a set of experimental inputs (the stimuli) and a set of outputs (the activity in each region).
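The bilinear neuronal model mentioned above can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions: it uses the bilinear state equation dz/dt = (A + Σ_j u_j B_j) z + C u from the DCM literature, it covers the neuronal level only (no hemodynamic forward model and no Bayesian estimation), and the two-region network, the coupling values, and the function name `simulate_bilinear` are hypothetical.

```python
import numpy as np


def simulate_bilinear(A, B, C, u, dt):
    """Euler integration of dz/dt = (A + sum_j u_j(t) * B[j]) z + C u(t).

    A : (r, r) coupling among regions in the absence of input.
    B : (m, r, r) change in coupling induced by each of the m inputs.
    C : (r, m) direct (driving) influence of each input on each region.
    u : (T, m) experimental inputs sampled every dt seconds.
    Returns z, a (T, r) array of simulated neuronal states.
    """
    T, r = u.shape[0], A.shape[0]
    z = np.zeros((T, r))
    for t in range(1, T):
        coupling = A + np.tensordot(u[t - 1], B, axes=1)   # context-dependent coupling
        z[t] = z[t - 1] + dt * (coupling @ z[t - 1] + C @ u[t - 1])
    return z


# Hypothetical two-region network: a stimulus drives region 0, region 0
# influences region 1, and a context input strengthens that connection.
dt = 0.01
T = 4000                                   # 40 s of simulated time
A = np.array([[-1.0, 0.0],
              [0.3, -1.0]])
B = np.zeros((2, 2, 2))
B[1, 1, 0] = 0.6                           # input 1 modulates the 0 -> 1 connection
C = np.array([[1.0, 0.0],
              [0.0, 0.0]])

u = np.zeros((T, 2))
u[0:1000, 0] = 1.0                         # stimulus block without context
u[2000:3000, 0] = 1.0                      # stimulus block with context
u[2000:4000, 1] = 1.0                      # context on during the second block

z = simulate_bilinear(A, B, C, u, dt)
print("region 1 at end of block 1:", round(z[999, 1], 3))
print("region 1 at end of block 2:", round(z[2999, 1], 3))
```

With these illustrative values, the same stimulus produces a larger response in region 1 when the context input is on, because the context changes the strength of the connection from region 0 to region 1 rather than driving region 1 directly. This is the sense in which experimental inputs change effective connectivity.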
The task of the algorithm is then to estimate the parameters of the system, in this case, the “state variables.” Each region has five state variables; four correspond to the hemodynamic model and the fifth corresponds to neuronal activity. The estimation process is carried out using Bayesian statistics: Normal priors are placed on the model parameters, and an optimization scheme is used to estimate the parameters that maximize the posterior probability. The posterior density is then used to make inferences about the significance of the connections between various brain regions. DCM is computationally demanding and is limited to eight regions in the current implementation of SPM.

Granger Causality

The main problem with methods such as SEM and DCM is that any misspecification of the underlying model can lead to erroneous conclusions. Granger causality takes a different approach to the problem. The technique was originally developed in economics (Granger, 1969) and has recently been applied to connectivity studies (Roebroeck et al., 2005). The benefit of Granger causality is that it does not rely on any a priori specification of a structural model; rather, it is an approach for quantifying the usefulness of past values from various brain regions in predicting values in other regions. Granger causality provides information about the temporal precedence of relationships between two regions, but it is in some sense a misnomer because it does not actually provide information about causality. It is true that one variable (x) may precede a correlated variable (y) because x causes y. For example, hitting a baseball causes it to fly. However, there may be no causal relationship at all: A rooster may crow (x) every morning just before the sun rises (y), but it does not cause the sun to rise. For purposes of economic forecasting, for which the technique was developed, or for making predictions based on fMRI data, the actual causal relationships may not matter, and Granger “causality” may be sufficiently informative. However, it should not be taken as a measure of true causality.

To illustrate the method, let x and y be two time courses of length N extracted from two brain regions or voxels. Each time course is modeled using a linear autoregressive model³ of order M (where $M \le N - 1$), that is:

$$
\begin{aligned}
x[n] &= \sum_{m=1}^{M} a[m]\, x[n-m] + \varepsilon_x[n] \\
y[n] &= \sum_{m=1}^{M} b[m]\, y[n-m] + \varepsilon_y[n]
\end{aligned}
\tag{9.10}
$$
where both $\varepsilon_x$ and $\varepsilon_y$ are defined to be white noise. The vectors a and b contain coefficients that describe how the current value of each time course depends on its past, so it is clear from this formulation that each time course depends directly on its own past M values. As a second step of the analysis, we can expand each time course’s model using the autoregressive terms from the other signal. These additional autoregressive terms correspond to the directed influence (previous history) and not to the instantaneous signal; they can be written in the format:

value_now = self_history + other_history + error

More formally, the equations in our example can be expressed as:

$$
\begin{aligned}
x[n] &= \sum_{m=1}^{M} a[m]\, x[n-m] + \sum_{m=1}^{M} b[m]\, y[n-m] + \varepsilon_x[n] \\
y[n] &= \sum_{m=1}^{M} b[m]\, y[n-m] + \sum_{m=1}^{M} a[m]\, x[n-m] + \varepsilon_y[n]
\end{aligned}
\tag{9.11}
$$

³ Autoregressive models are used to represent processes whose “current” values can be written as a function of their own past values. The order of the model specifies how many steps back into the past the specified function goes.
In this formulation, the current value of each time course is assumed to depend not only on the past M values of its own time course but also on the past M values of the other time course. By fitting each of these models (Equations 9.10 and 9.11), we can test whether the previous history of x has predictive value for the time course y (and vice versa). If the model fit is significantly improved by the inclusion of the cross-autoregressive terms, it provides evidence that the history of one of the time courses can be used to predict the current value of the other, and a “Granger-causal” relationship is inferred.

To test the influence between the two regions, we compare the fits of the model for each time course with and without the additional “cross-autoregressive” terms (Roebroeck et al., 2005). The ratio of the error sums of squares obtained from these fits is used to define a measure of the linear directed influence from x to y, which is denoted $F_{x \to y}$. If past values of x improve the prediction of the current value of y, then $F_{x \to y}$ is large. A similar interpretation, but in the opposite direction, holds for $F_{y \to x}$, which is defined in an analogous manner. The difference between these two terms can be used to infer which region’s history is more influential on the other. This difference is referred to as “Granger causality.” From this definition, it is clear that the idea of temporal precedence is used to identify the direction and strength of causality from information in the data. While it can reasonably be argued that temporal precedence is a necessary condition for causation, it is certainly not a sufficient condition. Therefore, directly equating Granger causality with causality requires a large leap of faith.
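The comparison of model fits with and without the cross-autoregressive terms can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used by Roebroeck et al. (2005): the helper names (`lagged`, `residual_ss`, `directed_influence`), the simulated signals, and the specific choice of the log ratio of residual sums of squares as the influence measure are assumptions made for illustration.

```python
import numpy as np


def lagged(s, M):
    """Design matrix whose row for time n holds s[n-1], ..., s[n-M] (n = M..N-1)."""
    N = len(s)
    return np.column_stack([s[M - m:N - m] for m in range(1, M + 1)])


def residual_ss(target, design):
    """Residual sum of squares of a least-squares fit of target on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return resid @ resid


def directed_influence(source, target, M):
    """log(RSS without source history / RSS with source history).

    Larger values mean that the past of `source` improves the prediction
    of `target` beyond what the target's own past already provides.
    """
    t = target[M:]
    own = lagged(target, M)                           # own history only
    both = np.hstack([own, lagged(source, M)])        # plus cross-autoregressive terms
    return np.log(residual_ss(t, own) / residual_ss(t, both))


# Simulated example (hypothetical coefficients): x influences y with a
# one-sample delay, while y has no influence on x.
rng = np.random.default_rng(1)
N = 500
x = np.zeros(N)
y = np.zeros(N)
for n in range(1, N):
    x[n] = 0.5 * x[n - 1] + rng.standard_normal()
    y[n] = 0.3 * y[n - 1] + 0.6 * x[n - 1] + rng.standard_normal()

F_xy = directed_influence(x, y, M=2)
F_yx = directed_influence(y, x, M=2)
print(f"F(x->y) = {F_xy:.3f}, F(y->x) = {F_yx:.3f}, difference = {F_xy - F_yx:.3f}")
```

In this simulation the influence from x to y is clearly larger than the influence from y to x, which matches how the data were generated. As the text cautions, the same pattern would also arise from any process in which x merely precedes y, so the result indicates temporal precedence and not causation.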
SUMMARY

fMRI and PET are techniques for imaging the functioning human brain with increasingly precise temporal, spatial, and “neurochemical” resolution. Their major impact on psychology and neuroscience is to help establish a physical grounding in the brain for psychological concepts. However, inferences about psychological processes from brain activity are tricky to make. New kinds of analyses, such as pattern classification and meta-analysis, are helping researchers make inroads into this problem in limited ways.

A second advantage of neuroimaging is that it allows researchers to study brain processes directly in humans, allowing links to be forged across animal and human neuroscientific research. Ultimately, this promotes the ability to integrate across fields and take advantage of a wealth of knowledge available from animal systems by establishing parallels with human brain processes. An obstacle in this process is the complexity of managing and processing neuroimaging data. To overcome this obstacle, researchers have developed a wide variety of tools, most of which are freely available to the research community, and centralized resources for distributing these tools over the internet.

Finally, a third advantage of neuroimaging is that it allows full coverage of the brain, including (particularly with fMRI) dynamic measures of activity every few seconds or less. This combination of spatial and temporal coverage is not available using other techniques, including those commonly used in animal models. This makes fMRI a particularly good tool for studying functional brain connectivity and large-scale distributed networks. A wide and growing array of connectivity analyses is being developed to exploit this feature, and the combination of advances in data collection, analysis methods, and methods for making psychological inferences is making fMRI a uniquely useful tool for neuroscientists.
REFERENCES

Aguirre, G. K., Singh, R., & D’Esposito, M. (1999). Stimulus inversion and the responses of face and object-sensitive cortical areas. NeuroReport, 10, 189–194.
Aguirre, G. K., Zarahn, E., & D’Esposito, M. (1998). The variability of human, BOLD hemodynamic responses. NeuroImage, 8, 360–369.
Amunts, K., Kedo, O., Kindler, M., Pieperhoff, P., Mohlberg, H., Shah, N. J., et al. (2005). Cytoarchitectonic mapping of the human amygdala, hippocampal region and entorhinal cortex: Intersubject variability and probability maps. Anatomy and Embryology, 210, 343–352.
Amunts, K., Schleicher, A., & Zilles, K. (2007). Cytoarchitecture of the cerebral cortex: More than localization. NeuroImage, 37, 1061–1065.
Andersen, A. H., Gash, D. M., & Avison, M. J. (1999). Principal component analysis of the dynamic response measured by fMRI: A generalized linear systems framework. Magnetic Resonance in Medicine, 17, 785–815.
Andersson, J. L., Hutton, C., Ashburner, J., Turner, R., & Friston, K. (2001). Modeling geometric deformations in EPI time series. NeuroImage, 13, 903–919.
Aron, A., Fisher, H., Mashek, D. J., Strong, G., Li, H., & Brown, L. L. (2005). Reward, motivation, and emotion systems associated with early-stage intense romantic love. Journal of Neurophysiology, 94, 327–337.
Ashburner, J., & Friston, K. J. (2000). Voxel-based morphometry: The methods. NeuroImage, 11, 805–821.
Ashburner, J., & Friston, K. J. (2005). Unified segmentation. NeuroImage, 26, 839–851.
Beckmann, C. F., Jenkinson, M., & Smith, S. M. (2003). General multilevel linear modeling for group analysis in FMRI. NeuroImage, 20, 1052–1063.
Beckmann, C. F., & Smith, S. M. (2005). Tensorial extensions of independent component analysis for multisubject fMRI analysis. NeuroImage, 25, 294–311.
Behrens, T. E. J., Berg, H. J., Jbabdi, S., Rushworth, M. F. S., & Woolrich, M. W. (2007). Probabilistic diffusion tractography with multiple fibre orientations: What can we gain? NeuroImage, 34, 144–155.
Cheng, K., Waggoner, R. A., & Tanaka, K. (2001). Human ocular dominance columns as revealed by high-field functional magnetic resonance imaging. Neuron, 32, 359–374.
Bendriem, B., & Townsend, D. W. (1998). The theory and practice of 3D PET (Vol. 32). Boston: Kluwer Academic.
Cole, M. W., & Schneider, W. (2007). The cognitive control network: Integrated cortical regions with dissociable functions. NeuroImage, 37, 343–360.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57, 289–300.
Bernstein, M. A., King, K. F., & Zhou, Z. J. (2004). Handbook of MRI pulse sequences. Burlington, MA: Elsevier Academic Press.
Collins, D. L., Neelin, P., Peters, T. M., & Evans, A. C. (1994). Automatic 3D intersubject registration of MR volumetric data in standardized Talairach space. Journal of Computer Assisted Tomography, 18, 192–205.
Birn, R. M., Saad, Z. S., & Bandettini, P. A. (2001). Spatial heterogeneity of the nonlinear dynamics in the FMRI BOLD response. NeuroImage, 14, 817–826.
Constable, R. T., & Spencer, D. D. (1999). Composite image formation in z-shimmed functional MR imaging. Magnetic Resonance in Medicine, 42, 110–117.
Bohning, D. E., Pecheny, A. P., Epstein, C. M., Speer, A. M., Vincent, D. J., Dannels, W., et al. (1997). Mapping transcranial magnetic stimulation (TMS) fields in vivo with MRI. NeuroReport, 8, 2535–2538.
Cordes, D., Haughton, V. M., Arfanakis, K., Wendt, G. J., Turski, P. A., Moritz, C. H., et al. (2000). Mapping functionally related regions of brain with functional connectivity MR imaging. American Journal of Neuroradiology, 21, 1636–1644.
Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.
Boynton, G. M., Engel, S. A., Glover, G. H., & Heeger, D. J. (1996). Linear systems analysis of functional magnetic resonance imaging in human V1. Journal of Neuroscience, 16, 4207–4221.
Brett, M., Johnsrude, I. S., & Owen, A. M. (2002). The problem of functional localization in the human brain. Nature Reviews Neuroscience, 3, 243–249.
Bro, R. (1997). PARAFAC. Tutorial and applications. Chemometrics and Intelligent Laboratory Systems, 38, 149–171.
Buckner, R. L., Koutstaal, W., Schacter, D. L., Dale, A. M., Rotte, M., & Rosen, B. R. (1998). Functional-anatomic study of episodic retrieval: Pt. II. Selective averaging of event-related fMRI trials to test the retrieval success hypothesis. NeuroImage, 7, 163–175.
Buracas, G. T., & Boynton, G. M. (2002). Efficient design of event-related fMRI experiments using M-sequences. NeuroImage, 16, 801–813.
Burock, M. A., Buckner, R. L., Woldorff, M. G., Rosen, B. R., & Dale, A. M. (1998). Randomized event-related experimental designs allow for extremely rapid presentation rates using functional MRI. NeuroReport, 9, 3735–3739.
Bush, G., Luu, P., & Posner, M. I. (2000). Cognitive and emotional influences in anterior cingulate cortex. Trends in Cognitive Sciences, 4, 215–222.
Buxton, R. B., & Frank, L. R. (1997). A model for the coupling between cerebral blood flow and oxygen metabolism during neural stimulation. Journal of Cerebral Blood Flow and Metabolism, 17, 64–72.
Buxton, R. B., Uludag, K., Dubowitz, D. J., & Liu, T. T. (2004). Modeling the hemodynamic response to brain activation. NeuroImage, 23 (Suppl. 1), S220–S233.
Buxton, R. B., Wong, E. C., & Frank, L. R. (1998). Dynamics of blood flow and oxygenation changes during brain activation: The balloon model. Magnetic Resonance in Medicine, 39, 855–864.
Cacioppo, J. T., & Berntson, G. G. (in press). Integrative neuroscience for the behavioral sciences: Implications for inductive inference. In Handbook of neuroscience for the behavioral sciences. [This Volume]
Cacioppo, J. T., & Tassinary, L. G. (1990). Inferring psychological significance from physiological signals. American Psychologist, 45, 16–28.
Calhoun, V. D., Adali, T., Pearlson, G. D., & Pekar, J. J. (2001a). A method for making group inferences from functional MRI data using independent component analysis. Human Brain Mapping, 14, 140–151.
Calhoun, V. D., Adali, T., Pearlson, G. D., & Pekar, J. J. (2001b). Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related waveforms. Human Brain Mapping, 13, 43–53.
Cover, T. M., & Thomas, J. A. (2006). Elements of information theory. New York: Wiley-Interscience.
Cox, R. W. (1996). AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, 29, 162–173.
D’Esposito, M., Zarahn, E., Aguirre, G. K., & Rypma, B. (1999). The effect of normal aging on the coupling of neural activity to the BOLD hemodynamic response. NeuroImage, 10, 6–14.
Dagher, A., Owen, A. M., Boecker, H., & Brooks, D. J. (1999). Mapping the network for planning: A correlational PET activation study with the Tower of London task. Brain, 122, 1973–1987.
Dale, A. M., & Buckner, R. L. (1997). Selective averaging of rapidly presented individual trials using fMRI. Human Brain Mapping, 5, 329–340.
Dale, A. M., Liu, A. K., Fischl, B. R., Buckner, R. L., Belliveau, J. W., Lewine, J. D., et al. (2000). Dynamic statistical parametric mapping combining fMRI and MEG for high-resolution imaging of cortical activity. Neuron, 26, 55–67.
Della-Maggiore, V., Sekuler, A. B., Grady, C. L., Bennett, P. J., Sekuler, R., & McIntosh, A. R. (2000). Corticolimbic interactions associated with performance on a short-term memory task are modified by age. Journal of Neuroscience, 20, 8410–8416.
Le Bihan, D., Mangin, J. F., Poupon, C., Clark, C. A., Pappata, S., Molko, N., et al. (2001). Diffusion tensor imaging: Concepts and applications. Journal of Magnetic Resonance Imaging, 13, 534–546.
de Quervain, D. J., Fischbacher, U., Treyer, V., Schellhammer, M., Schnyder, U., Buck, A., et al. (2004, August 27). The neural basis of altruistic punishment. Science, 305, 1254–1258.
Devlin, J. T., & Poldrack, R. A. (2007). In praise of tedious anatomy. NeuroImage, 37, 1033–1041; discussion 1050–1058.
Disbrow, E. A., Slutsky, D. A., Roberts, T. P., & Krubitzer, L. A. (2000). Functional MRI at 1.5 tesla: A comparison of the blood oxygenation level-dependent signal and electrophysiology. Proceedings of the National Academy of Sciences, USA, 97, 9718–9723.
Duann, J. R., Jung, T. P., Kuo, W. J., Yeh, T. C., Makeig, S., Hsieh, J. C., et al. (2002). Single-trial variability in event-related BOLD signals. NeuroImage, 15, 823–835.
Duong, T. Q., Yacoub, E., Adriany, G., Hu, X., Ugurbil, K., Vaughan, J. T., et al. (2002). High-resolution, spin-echo BOLD, and CBF fMRI at 4 and 7 T. Magnetic Resonance in Medicine, 48, 589–593.
Duvernoy, H. M. (1995). The human brain stem and cerebellum: Surface, structure, vascularization, and three-dimensional sectional anatomy with MRI. Wien, Austria: Springer-Verlag.
Eickhoff, S. B., Amunts, K., Mohlberg, H., & Zilles, K. (2006). The human parietal operculum: Pt. II. Stereotaxic maps and correlation with functional imaging results. Cerebral Cortex, 16, 268–279.
Eickhoff, S. B., Stephan, K. E., Mohlberg, H., Grefkes, C., Fink, G. R., Amunts, K., et al. (2005). A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. NeuroImage, 25, 1325–1335. Eisenberger, N. I., Lieberman, M. D., & Williams, K. D. (2003, October 10). Does rejection hurt? An FMRI study of social exclusion. Science, 302, 290–292. Elster, A. D. (1994). Questions and answers in magnetic resonance imaging. St. Louis, MO: Mosby. Etkin, A., & Wager, T. D. (2007). Functional neuroimaging of anxiety: A meta-analysis of emotional processing in PTSD, social anxiety disorder, and specific phobia. American Journal of Psychiatry, 164, 1476–1488. Fabiani, M., Gratton, G., & Federmeier, K. D. (2007). Event-related brain potentials: Methods, theory, and applications. In J. T. Cacioppo, L. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (4th ed., pp. 85–119). Cambridge: Cambridge University Press.
Good, C. D., Johnsrude, I. S., Ashburner, J., Henson, R. N. A., Friston, K. J., & Frackowiak, R. S. J. (2001). A voxel-based morphometric study of ageing in 465 normal adult human brains. NeuroImage, 14, 21–36.
Goutte, C., Nielsen, F. A., & Hansen, L. K. (2000). Modeling the haemodynamic response in fMRI using smooth FIR filters. IEEE Transactions on Medical Imaging, 19, 1188–1201.
Granger, C. W. J. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37, 424–438.
Grill-Spector, K., & Malach, R. (2001). FMR-adaptation: A tool for studying the functional properties of human cortical neurons. Acta Psychologica, 107, 293–321.
Gusnard, D. A., & Raichle, M. E. (2001). Searching for a baseline: Functional imaging and the resting human brain. Nature Reviews Neuroscience, 2, 685–694.
Haacke, E. M. (1999). Magnetic resonance imaging: Physical principles and sequence design. New York: Wiley.
Fischl, B., Sereno, M. I., & Dale, A. M. (1999). Cortical surface-based analysis: Pt. II. Inflation, flattening, and a surface-based coordinate system. NeuroImage, 9, 195–207.
Habeck, C., Krakauer, J. W., Ghez, C., Sackeim, H. A., Eidelberg, D., Stern, Y., et al. (2005). A new approach to spatial covariance modeling of functional brain imaging data: Ordinal trend analysis. Neural Computation, 17, 1602–1645.
Fischl, B., Sereno, M. I., Tootell, R. B., & Dale, A. M. (1999). High-resolution intersubject averaging and a coordinate system for the cortical surface. Human Brain Mapping, 8, 272–284.
Haines, D. E. (2000). Neuroanatomy: An atlas of structures, sections, and systems. Philadelphia: Lippincott Williams & Wilkins.
Franklin, G. F., Workman, M. L., & Powell, D. (1997). Digital control of dynamic systems. Boston: Addison-Wesley Longman.
Frey, K. A. (1999). Positron emission tomography. In G. J. Siegel, B. W. Agranoff, R. W. Albers, S. K. Fisher, & M. D. Uhler (Eds.), Basic neurochemistry (6th ed., pp. 1109–1131). Philadelphia: Lippincott Williams & Wilkins.
Friston, K. J. (1994). Functional and effective connectivity in neuroimaging: A synthesis. Human Brain Mapping, 2, 56–78.
Friston, K. J., Frith, C. D., Turner, R., & Frackowiak, R. S. (1995). Characterizing evoked hemodynamics with fMRI. NeuroImage, 2, 157–165.
Friston, K. J., Glaser, D. E., Henson, R. N., Kiebel, S., Phillips, C., & Ashburner, J. (2002). Classical and Bayesian inference in neuroimaging: Applications. NeuroImage, 16, 484–512.
Friston, K. J., Harrison, L., & Penny, W. (2003). Dynamic causal modelling. NeuroImage, 19, 1273–1302.
Friston, K. J., Josephs, O., Rees, G., & Turner, R. (1998). Nonlinear event-related responses in fMRI. Magnetic Resonance in Medicine, 39, 41–52.
Friston, K. J., Mechelli, A., Turner, R., & Price, C. J. (2000). Nonlinear responses in fMRI: The balloon model, Volterra kernels, and other hemodynamics. NeuroImage, 12, 466–477.
Friston, K. J., Penny, W. D., & Glaser, D. E. (2005). Conjunction revisited. NeuroImage, 25, 661–667.
Friston, K. J., Penny, W., Phillips, C., Kiebel, S., Hinton, G., & Ashburner, J. (2002). Classical and Bayesian inference in neuroimaging: Theory. NeuroImage, 16, 465–483.
Glover, G. H. (1999). Deconvolution of impulse response in event-related BOLD fMRI. NeuroImage, 9, 416–429.
Glover, G. H., & Law, C. S. (2001). Spiral-in/out BOLD fMRI for increased SNR and reduced susceptibility artifacts. Magnetic Resonance in Medicine, 46, 515–522.
Glover, G. H., Li, T. Q., & Ress, D. (2000). Image-based method for retrospective correction of physiological motion effects in fMRI: RETROICOR. Magnetic Resonance in Medicine, 44, 162–167.
Goldman, R. I., Stern, J. M., Engel, J., Jr., & Cohen, M. S. (2000). Acquiring simultaneous EEG and functional MRI. Clinical Neurophysiology, 111, 1974–1980.
Hämäläinen, M., Hari, R., Ilmoniemi, R. J., Knuutila, J., & Lounasmaa, O. V. (1993). Magnetoencephalography: Theory, instrumentation, and applications to noninvasive studies of the working human brain. Reviews of Modern Physics, 65, 413–497.
Harrison, L., Penny, W. D., & Friston, K. (2003). Multivariate autoregressive modeling of fMRI time series. NeuroImage, 19, 1477–1491.
Heeger, D. J., & Ress, D. (2002). What does fMRI tell us about neuronal activity? Nature Reviews Neuroscience, 3, 142–151.
Henson, R. N. (2003). Neuroimaging studies of priming. Progress in Neurobiology, 70, 53–81.
Horwitz, B. (2003). The elusive concept of brain connectivity. NeuroImage, 19, 466–470.
Huettel, S. A., Song, A. W., & McCarthy, G. (2004). Functional magnetic resonance imaging. Sunderland, MA: Sinauer.
Johansen-Berg, H., & Behrens, T. E. (2006). Just pretty pictures? What diffusion tractography can add in clinical neuroscience. Current Opinion in Neurology and Neurosurgery, 19, 379–385.
Johansen-Berg, H., Behrens, T. E., Robson, M. D., Drobnjak, I., Rushworth, M. F., Brady, J. M., et al. (2004). Changes in connectivity profiles define functionally distinct regions in human medial frontal cortex. Proceedings of the National Academy of Sciences, USA, 101, 13335–13340.
Johnson, M. K., Raye, C. L., Mitchell, K. J., Greene, E. J., Cunningham, W. A., & Sanislow, C. A. (2005). Using fMRI to investigate a component process of reflection: Prefrontal correlates of refreshing a just-activated representation. Cognitive, Affective and Behavioral Neuroscience, 5, 339–361.
Josephs, O., & Henson, R. N. (1999). Event-related functional magnetic resonance imaging: Modelling, inference and optimization. Philosophical Transactions of the Royal Society of London. Series B, 354, 1215–1228.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.
Kastner, S., & Ungerleider, L. G. (2000). Mechanisms of visual attention in the human cortex. Annual Review of Neuroscience, 23, 315–341.
Kherif, F., Poline, J.-B., Flandin, G., Benali, H., Dehaene, S., & Worsley, K. J. (2002). Multivariate model specification for fMRI data. NeuroImage, 16, 795–815.
Kim, J., Zhu, W., Chang, L., Bentler, P. M., & Ernst, T. (2007). Unified structural equation modeling approach for the analysis of multisubject, multivariate functional MRI data. Human Brain Mapping, 28, 85–93.
Menon, R. S., Luknowsky, D. C., & Gati, J. S. (1998). Mental chronometry using latency-resolved functional MRI. Proceedings of the National Academy of Sciences, USA, 95, 10902–10907.
Kwong, K. K., Belliveau, J. W., Chesler, D. A., Goldberg, I. E., Weisskoff, R. M., Poncelet, B. P., et al. (1992). Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation. Proceedings of the National Academy of Sciences, USA, 89, 5675–5679.
Menon, V., Ford, J. M., Lim, K. O., Glover, G. H., & Pfefferbaum, A. (1997). Combined event-related fMRI and EEG evidence for temporal-parietal cortex activation during target detection. NeuroReport, 8, 3029–3037.
Lancaster, J. L., Woldorff, M. G., Parsons, L. M., Liotti, M., Freitas, C. S., Rainey, L., et al. (2000). Automated Talairach atlas labels for functional brain mapping. Human Brain Mapping, 10, 120–131.
Miezin, F. M., Maccotta, L., Ollinger, J. M., Petersen, S. E., & Buckner, R. L. (2000). Characterizing the hemodynamic response: Effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing. NeuroImage, 11, 735–759.
Liao, C. H., Worsley, K. J., Poline, J. B., Aston, J. A., Duncan, G. H., & Evans, A. C. (2002). Estimating the delay of the fMRI response. NeuroImage, 16, 593–606.
Lindquist, M. A., Zhang, C. H., Glover, G., & Shepp, L. (2008). Rapid three-dimensional functional magnetic resonance imaging of the initial negative BOLD response. Journal of Magnetic Resonance, 191, 100–111.
Lindquist, M. A., & Wager, T. D. (2007). Validity and power in hemodynamic response modeling: A comparison study and a new approach. Human Brain Mapping, 28, 764–784.
Lindquist, M., & Wager, T. D. (in press). Application of change-point theory to modeling state-related activity in fMRI. Applied Data Analytic Techniques for “Turning Points Research.” Lawrence Erlbaum/Taylor & Francis Group, Philadelphia.
Lindquist, M. A., Waugh, C., & Wager, T. D. (2007). Modeling state-related fMRI activity using change-point theory. NeuroImage, 35, 1125–1141.
Liu, T. T. (2004). Efficiency, power, and entropy in event-related fMRI with multiple trial types: Pt. II. Design of experiments. NeuroImage, 21, 401–413.
Liu, T. T., Frank, L. R., Wong, E. C., & Buxton, R. B. (2001). Detection power, estimation efficiency, and predictability in event-related fMRI. NeuroImage, 13, 759–773.
Logothetis, N. K., Pauls, J., Augath, M., Trinath, T., & Oeltermann, A. (2001, July 12). Neurophysiological investigation of the basis of the fMRI signal. Nature, 412, 150–157.
Loh, J. M., Lindquist, M. A., & Wager, T. D. (2008). Residual analysis for detecting mis-modeling in fMRI. Statistica Sinica, 18, 1421–1448.
Lund, T. E., Madsen, K. H., Sidaros, K., Luo, W. L., & Nichols, T. E. (2006). Non-white noise in fMRI: Does modelling have an impact? NeuroImage, 29, 54–66.
Luo, W. L., & Nichols, T. E. (2003). Diagnosis and exploration of massively univariate neuroimaging models. NeuroImage, 19, 1014–1032.
Maguire, E. A., Gadian, D. G., Johnsrude, I. S., Good, C. D., Ashburner, J., Frackowiak, R. S., et al. (2000). Navigation-related structural change in the hippocampi of taxi drivers. Proceedings of the National Academy of Sciences, USA, 97, 4398–4403.
Mai, J. K., Assheuer, J., & Paxinos, G. (2004). Atlas of the human brain (2nd ed.). San Diego, CA: Elsevier Academic Press.
McIntosh, A. R., Bookstein, F. L., Haxby, J. V., & Grady, C. L. (1996). Spatial pattern analysis of functional brain images using partial least squares. NeuroImage, 3, 143–157.
McIntosh, A. R., & Gonzalez-Lima, F. (1994). Structural equation modeling and its application to network analysis in functional brain imaging. Human Brain Mapping, 2, 2–22.
McKeown, M. J., Makeig, S., Brown, G. G., Jung, T. P., Kindermann, S. S., Bell, A. J., et al. (1998). Analysis of fMRI data by blind separation into independent spatial components. Human Brain Mapping, 6, 160–188.
c09.indd Sec5:195
Morawetz, C., Holz, P., Lange, C., Baudewig, J., Weniger, G., Irle, E., et al. (2008). Improved functional mapping of the human amygdala using a standard functional magnetic resonance imaging sequence with simple modifications. Magnetic Resonance Imaging, 26, 45–53. Nakamura, W., Anami, K., Mori, T., Saitoh, O., Cichocki, A., & Amari, S. (2006). Removal of ballistocardiogram artifacts from simultaneously recorded, EEG, and fMRI data using independent component analysis. IEEE Transactions of Biomedical Engineering, 53, 1294–1308. Nichols, T., Brett, M., Andersson, J., Wager, T., & Poline, J. B. (2005). Valid conjunction inference with the minimum statistic. NeuroImage, 25, 653–660. Nichols, T., & Hayasaka, S. (2003). Controlling the familywise error rate in functional neuroimaging: A comparative review. Statistical Methods in Medical Research, 12, 419–446. Nichols, T., & Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: A primer with examples. Human Brain Mapping, 15, 1–25. Noll, D. C., Fessler, J. A., & Sutton, B. P. (2005). Conjugate phase MRI reconstruction with spatially variant sample density correction. IEEE Transactions in Medical Imaging, 24, 325–336. Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424–430. Ogawa, S., Lee, T. M., Kay, A. R., & Tank, D. W. (1990). Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proceedings of the National Academy of Sciences, USA, 87, 9868–9872. Ogawa, S., Tank, D. W., Menon, R., Ellermann, J. M., Kim, S. G., Merkle, H., et al. (1992). Intrinsic signal changes accompanying sensory stimulation: Functional brain mapping with magnetic resonance imaging. Proceedings of the National Academy of Sciences, USA, 89, 5951–5955. Ollinger, J. M., Shulman, G. L., & Corbetta, M. (2001). Separating processes within a trial in event-related functional MRI. 
NeuroImage, 13, 210–217. Ongur, D., Ferry, A. T., & Price, J. L. (2003). Architectonic subdivision of the human orbital and medial prefrontal cortex. Journal of Comparative Neurology, 460, 425–449. Paton, J. J., Belova, M. A., Morrison, S. E., & Salzman, C. D. (2006, February 16). The primate amygdala represents the positive and negative value of visual stimuli during learning. Nature, 439, 865–870. Paus, T. (2001). Primate anterior cingulate cortex: Where motor control, drive and cognition interface. Natures Review Neuroscience, 2, 417–424. Pearl, J. (2000). Causality: Models, reasoning, and inference. New York: Cambridge University Press.
McKeown, M. J., & Sejnowski, T. J. (1998). Independent component analysis of fMRI data: examining the assumptions. Hum Brain Mapp, 6 (5–6), 368–372.
Petersson, K. M., Nichols, T. E., Poline, J. B., & Holmes, A. P. (1999). Statistical limitations in functional neuroimaging. I. Non-inferential methods and statistical models. Philos Trans R Soc Lond B Biol Sci, 354 (1387), 1239–1260.
Menon, R. S. (2002). Postacquisition suppression of large-vessel BOLD signals in high-resolution fMRI. Magnetic Resonance in Medicine, 47, 1–9.
Phan, K. L., Taylor, S. F., Welsh, R. C., Ho, S. H., Britton, J. C., & Liberzon, I. (2004). Neural correlates of individual ratings of emotional salience: A trial-related fMRI study. NeuroImage, 21, 768–780.
8/18/09 5:25:42 PM
196
Essentials of Functional Neuroimaging
Pizzagalli, D. A. (2007). Electroencephalography and high-density electrophysiological source localization. In J. T. Cacioppo, L. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (4th ed., pp. 56–84). Cambridge: Cambridge University Press. Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Science, 10, 59–63. Price, C. J., & Friston, K. J. (1997). Cognitive conjunction: A new approach to brain activation experiments. NeuroImage, 5, 261–270. Price, C. J., Veltman, D. J., Ashburner, J., Josephs, O., & Friston, K. J. (1999). The critical relationship between the timing of stimulus presentation and data acquisition in blocked designs with fMRI. NeuroImage, 10, 36–44. Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proceedings of the National Academy of Sciences, USA, 98, 676–682. Rasbash, J. (2002). A user ’s guide to MLwiN. London: University of London, Centre for Multilevel Modelling. Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis (2nd ed.). Newbury Park, CA: Sage. Reiman, E. M., Fusselman, M. J., Fox, P. T., & Raichle, M. E. (1989, February 24). Neuroanatomical correlates of anticipatory anxiety. Science, 243, 1071–1074. (Published erratum appears in Science, 256, June 19, 1992, p. 1696) Riera, J. J., Watanabe, J., Kazuki, I., Naoki, M., Aubert, E., Ozaki, T., et al. (2004). A state-space model of the hemodynamic approach: Nonlinear filtering of BOLD signals. NeuroImage, 21, 547–567. Rissman, J., Gazzaley, A., & D’Esposito, M. (2004). Measuring functional connectivity during distinct stages of a cognitive task. NeuroImage, 23, 752–763. Roebroeck, A., Formisano, E., & Goebel, R. (2005). Mapping directed influence over the brain using Granger causality and fMRI. NeuroImage, 25, 230–242. Rosen, B. R., Buckner, R. L., & Dale, A. M. (1998). 
Event-related functional MRI: Past, present, and future. Proceedings of the National Academy of Sciences, USA, 95, 773–780. Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66, 688–701. Saad, Z. S., Reynolds, R. C., Argall, B., Japee, S., & Cox, R. W. (2004). SUMA: An interface for surface-based intra- and inter-subject analysis with AFNI. Paper presented at the Biomedical Imaging: Nano to Macro, 2004. IEEE International Symposium. Sandler, M. P. (2003). Diagnostic nuclear medicine. Philadelphia: Lippincott Williams & Wilkins. Sarter, M., Berntson, G. G., & Cacioppo, J. T. (1996). Brain imaging and cognitive neuroscience. Toward strong inference in attributing function to structure. American Psychology, 51, 13–21. Sawamura, H., Orban, G. A., & Vogels, R. (2006). Selectivity of neuronal adaptation does not match response selectivity: A single-cell study of the FMRI adaptation paradigm. Neuron, 49, 307–318.
Skudlarski, P., Constable, R. T., & Gore, J. C. (1999). ROC analysis of statistical methods used in functional MRI: Individual subjects. NeuroImage, 9, 311–329. Smith, S. M., Jenkinson, M., Beckmann, C., Miller, K., & Woolrich, M. (2007). Meaningful design and contrast estimability in FMRI. NeuroImage, 34, 127–136. Smith, S. M., Jenkinson, M., Woolrich, M. W., Beckmann, C. F., Behrens, T. E., Johansen-Berg, H., et al. (2004). Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage, 23, S208–S219. Stark, C. E., & Squire, L. R. (2001). When zero is not zero: The problem of ambiguous baseline conditions in fMRI. Proceedings of the National Academy of Sciences, USA, 98, 12760–12766. Sternberg, S. (1969). Memory-scanning: Mental processes revealed by reaction-time experiments. American Scientist, 57, 421–457. Sternberg, S. (2001). Separate modifiability, mental modules, and the use of pure and composite measures to reveal them. Acta Psychologica, 106, 147–246. Strang, G. (1980). Linear algebra and its applications: Academic Press New York. Summerfield, C., Greene, M., Wager, T., Egner, T., Hirsch, J., & Mangels, J. (2006). Neocortical connectivity during episodic memory formation. PLoS Biology, 4, E128. Sylvester, C. Y., Wager, T. D., Lacey, S. C., Hernandez, L., Nichols, T. E., Smith, E. E., et al. (2003). Switching attention and resolving interference: FMRI measures of executive functions. Neuropsychologia, 41, 357–370. Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain: 3-dimensional proportional system: An approach to cerebral imaging. Stuttgart, New York: Thieme Medical. Taylor, J. E., & Worsley, K. J. (2006). Inference for magnitudes and delays of responses in the FIAC data using BRAINSTAT/FMRISTAT. Human Brain Mapping, 27, 434–441. Thompson, P. M., Schwartz, C., Lin, R. T., Khan, A. A., & Toga, A. W. (1996). Three-dimensional statistical analysis of sulcal variability in the human brain. 
Journal of Neuroscience, 16, 4261–4274. Tohka, J., Foerde, K., Aron, A. R., Tom, S. M., Toga, A. W., & Poldrack, R. A. (2008). Automatic independent component labeling for artifact removal in fMRI. NeuroImage, 39 (3), 1227–1245. Tootell, R. B. H., Dale, A. M., Sereno, M. I., & Malach, R. (1996). New images from human visual cortex. Trends in Neurosciences, 19, 481–489. Van Essen, D. C., & Dierker, D. L. (2007). Surface-based and probabilistic atlases of primate cerebral cortex. Neuron, 56, 209–225. Van Essen, D. C., Drury, H. A., Dickson, J., Harwell, J., Hanlon, D., & Anderson, C. H. (2001). An integrated software suite for surfacebased analyses of cerebral cortex. Journal of the American Medical Informatics Association, 8, 443–459.
Schacter, D. L., Buckner, R. L., Koutstaal, W., Dale, A. M., & Rosen, B. R. (1997). Late onset of anterior prefrontal activity during true and false recognition: An event-related fMRI study. NeuroImage, 6, 259–269.
van Snellenberg, J. X., & Wager, T. D. (in press). Cognitive and Motivational Functions of the Human Prefrontal Cortex. In E. Goldberg & D. Bougakov (Eds.), Luria’s Legacy in the 21st Century. Oxford: Oxford University Press.
Shulman, R. G., & Rothman, D. L. (1998). Interpreting functional imaging studies in terms of neurotransmitter cycling. Proceedings of the National Academy of Sciences, USA, 95, 11993–11998.
Vazquez, A. L., Cohen, E. R., Gulani, V., Hernandez-Garcia, L., Zheng, Y., Lee, G. R., et al. (2006). Vascular dynamics and BOLD fMRI: CBF level effects and analysis considerations. NeuroImage, 32, 1642–1655.
Shulman, R. G., Rothman, D. L., Behar, K. L., & Hyder, F. (2004). Energetic basis of brain activity: Implications for neuroimaging. Trends in Neuroscience, 27, 489–495.
Vazquez, A. L., & Noll, D. C. (1998). Nonlinear aspects of the BOLD response in functional MRI. NeuroImage, 7, 108–118.
Sibson, N. R., Dhankhar, A., Mason, G. F., Behar, K. L., Rothman, D. L., & Shulman, R. G. (1997). In vivo 13C NMR measurements of cerebral glutamine synthesis as evidence for glutamate-glutamine cycling. Proceedings of the National Academy of Sciences, USA, 94, 2699–2704.
c09.indd Sec5:196
Villringer, A., & Chance, B. (1997). Non-invasive optical spectroscopy and imaging of human brain function. Trends in Neurosciences, 20, 435–442. Visscher, K. M., Miezin, F. M., Kelly, J. E., Buckner, R. L., Donaldson, D. I., McAvoy, M. P., et al. (2003). Mixed blocked/event-related designs separate transient and sustained activity in fMRI. NeuroImage, 19, 1694–1708.
8/18/09 5:25:43 PM
References 197 Vogt, B. A., Nimchinsky, E. A., Vogt, L. J., & Hof, P. R. (1995). Human cingulate cortex: Surface features, flat maps, and cytoarchitecture. Journal of Comparative Neurology, 359, 490–506.
Wang, G., Tanaka, K., & Tanifuji, M. (1996, June 14). Optical imaging of functional organization in the monkey inferotemporal cortex. Science, 272, 1665–1668.
Wager, T. D., Hernandez, L., Jonides, J., & Lindquist, M. (2007). Elements of functional neuroimaging. In J. T. Cacioppo, L. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (4th ed., pp. 19–55). Cambridge: Cambridge University Press.
Williams, D. S., Detre, J. A., Leigh, J. S., & Koretsky, A. P. (1992). Magnetic resonance imaging of perfusion using spin inversion of arterial water. Proceedings of the National Academy of Sciences, USA, 89, 212–216.
Wager, T. D., Jonides, J., & Reading, S. (2004). Neuroimaging studies of shifting attention: A meta-analysis. NeuroImage, 22, 1679–1693.
Wilson, J. L., & Jezzard, P. (2003). Utilization of an intra-oral diamagnetic passive shim in functional MRI of the inferior frontal cortex. Magnetic Resonance in Medicine, 50, 1089–1094.
Wager, T. D., Jonides, J., & Smith, E. E. (2006). Individual differences in multiple types of shifting attention. Memory and Cognition, 34, 1730–1743. Wager, T. D., Jonides, J., Smith, E. E., & Nichols, T. E. (2005). Towards a taxonomy of attention-shifting: Individual differences in fMRI during multiple shift types. Cognitive, Affective and Behavioral Neuroscience, 5, 127–143. Wager, T. D., Keller, M. C., Lacey, S. C., & Jonides, J. (2005). Increased sensitivity in neuroimaging analyses using robust regression. NeuroImage, 26, 99–113. Wager, T. D., Lindquist, M., & Kaplan, L. (2007). Meta-analysis of functional neuroimaging data: Current and future directions. Social, Cognitive, and Affective Neuroscience, 2, 150–158. Wager, T. D., & Nichols, T. E. (2003). Optimization of experimental design in fMRI: A general framework using a genetic algorithm. NeuroImage, 18, 293–309.
Woolrich, M. W., Behrens, T. E., & Smith, S. M. (2004). Constrained linear basis sets for HRF modelling using Variational Bayes. NeuroImage, 21, 1748–1761. Worsley, K. J., & Friston, K. J. (1995). Analysis of fMRI time-series revisited: Again. NeuroImage, 2, 173–181. Worsley, K. J., Liao, C. H., Aston, J., Petre, V., Duncan, G. H., Morales, F., et al. (2002). A general statistical analysis for fMRI data. NeuroImage, 15, 1–15. Worsley, K. J., Taylor, J. E., Tomaiuolo, F., & Lerch, J. (2004). Unified univariate and multivariate random field theory. NeuroImage, 23, S189–S195.
Wager, T. D., Reading, S., & Jonides, J. (2004). Neuroimaging studies of shifting attention: A meta-analysis. NeuroImage, 22, 1679–1693.
Young, F. W., Takane, Y., & Lewyckyj, R. (1978). ALSCAL: A nonmetric multidimensional scaling program with several difference options. Behavioral Research Methods and Instrumentation, 10, 451–453.
Wager, T. D. & Smith, E. E. (2003). Neuroimaging studies of working memory: a meta-analysis. Cognitive, Affective and Behavioral Neuroscience, 3, 255–274.
Zarahn, E. (2002). Using larger dimensional signal subspaces to increase sensitivity in fMRI time series analyses. Human Brain Mapping, 17, 13–16.
Wager, T. D., Scott, D. J., & Zubieta, J. K. (2007). Placebo effects on human mu-opioid activity during pain. Proceedings of the National Academy of Sciences, USA, 104, 11056–11061.
Zarahn, E., Aguirre, G., & D’Esposito, M. (1997). A trial-based experimental design for fMRI. NeuroImage, 6, 122–138.
Wager, T. D., Vazquez, A., Hernandez, L., & Noll, D. C. (2005). Accounting for nonlinear BOLD effects in fMRI: Parameter estimates and a model for prediction in rapid event-related studies. NeuroImage, 25, 206–218.
c09.indd Sec5:197
Woolrich, M. W., Behrens, T. E., Beckmann, C. F., Jenkinson, M., & Smith, S. M. (2004). Multilevel linear modelling for FMRI group analysis using Bayesian inference. NeuroImage, 21, 1732–1747.
Zarahn, E., & Slifstein, M. (2001). A reference effect approach for power analysis in fMRI. NeuroImage, 14, 768–779. Zeineh, M. M., Engel, S. A., Thompson, P. M., & Bookheimer, S. Y. (2003). Dynamics of the hippocampus during encoding and retrieval of face-name pairs. Science, 299, 577–580.
8/18/09 5:25:43 PM
c09.indd Sec5:198
8/18/09 5:25:43 PM
Chapter 10
Thalamocortical Relations
S. MURRAY SHERMAN
The thalamus is a paired structure joined at the midline and located at the center of the brain (Figure 10.1). Each half is roughly the size of a walnut. The main part of the thalamus is divided into a number of discrete regions, known as relay nuclei. These contain the relay cells that project to the cerebral cortex. (In this chapter, cortex refers to neocortex, which does not include the hippocampal formation or olfactory cortex.) Lateral to this main body of the thalamus is the thalamic reticular nucleus (TRN in Figure 10.1), which fits like a shield alongside the main relay nuclei. The thalamic reticular nucleus is composed entirely of GABAergic neurons that do not innervate cortex but instead innervate thalamic relay cells. In Figure 10.1, all but the front part of the thalamic reticular nucleus has been cut away to reveal the relay nuclei. Strictly speaking, the relay nuclei are the dorsal thalamus, while the thalamic reticular nucleus is part of the ventral thalamus; here, dorsal and ventral reflect embryonic origin rather than relative location in the adult, meaning that the relay nuclei and thalamic reticular nucleus have different developmental origins. Unless otherwise specified, thalamus refers to the relay nuclei of the dorsal thalamus.

Virtually all information reaching the cortex must pass through and be relayed by the thalamus. Thus anything we are consciously aware of and all of our perceptions of the outside world depend on thalamic relays. This relay is dynamically controlled by behavioral states and processes, including attentional demands. Each of the main relay nuclei shown in Figure 10.1 innervates one or a small number of cortical areas and, as far as we know, every area of cortex receives a thalamic input.

The thalamus does more than deliver peripheral information to the cortex: it continues to play a vital role in the further processing of this information by the cortex. Where thalamocortical relationships are understood (e.g., the projection of the lateral geniculate nucleus to the primary visual cortex), the thalamic input plays a major role in determining the functional properties of the cortical target area. It has been shown that if retinal inputs in the ferret are diverted into the medial geniculate nucleus instead of the normal auditory inputs, the auditory cortex acquires visual responsiveness and organizes orientation-specific domains normally seen only in the visual cortex (Sharma, Angelucci, & Sur, 2000). Since all thalamic nuclei innervate cortex and all cortical areas are thus innervated, this might suggest that the functional properties of any cortical area follow its thalamic input rather than inputs from other cortical areas. This is a rather subversive idea that runs counter to the traditional dogma that cortical processing depends solely on direct cortico-cortical pathways. Cortico-thalamo-cortical pathways play a heretofore neglected and perhaps dominant role in cortical functioning.
Figure 10.1 Schematic view of the right thalamus of the human. Shown are the main relay nuclei plus the thalamic reticular nucleus (TRN), of which only the anterior portion is visible; the remainder has been removed to reveal the thalamic relay nuclei. Normally, the thalamic reticular nucleus extends the length of thalamus as a thin shield closely apposed to the lateral surface of the relay nuclei. Abbreviations: A, Anterior Nuclei; CM, Central Medial Nucleus; IL, Intralaminar Nuclei; LD, Lateral Dorsal Nucleus; LGN, Lateral Geniculate Nucleus; LP or PO, Lateral Posterior or Posterior Nucleus; MD, Medial Dorsal Nucleus; MGN, Medial Geniculate Nucleus; MI, Midline Nuclei; P, Pulvinar; TRN, Thalamic Reticular Nucleus; VA, Ventral Anterior Nucleus; VL, Ventral Lateral Nucleus; VPL, Ventral Posterolateral Nucleus; VPM, Ventral Posteromedial Nucleus.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
To understand the functional relevance of the thalamus, it is necessary to understand some details about cell and circuit properties. Fortunately, these are mostly conserved throughout thalamus, so once we appreciate these for a model nucleus, we can extrapolate these properties for the entire thalamus. This is not to say that there are not important differences found among thalamic relay nuclei, but we concentrate in this chapter on those major properties that are common to the thalamus. The best-known and most thoroughly studied of the thalamic nuclei is the lateral geniculate nucleus, which relays retinal information to the visual cortex. We use this nucleus as our model template for cell and circuit properties. For details of the thalamus beyond the scope of this chapter, see recent books on this topic by Jones (2007) and Sherman and Guillery (2006).
BASIC CELL TYPES

There are three major cell types involved in thalamic circuitry: relay cells, local interneurons, and thalamic reticular nucleus cells. The relay cells use glutamate as a neurotransmitter, whereas the other cell types are GABAergic.

Relay Cells

Although the evidence for different classes of relay cells is scattered and incomplete outside of the lateral geniculate nucleus (e.g., Li, Bickford, & Guido, 2003; Yen & Jones, 1983), the evidence is firm for several distinct types of geniculate relay cell. For example, the main geniculate relay cell classes in the cat are called X and Y (see Figure 10.2), and the equivalent types in the monkey are called parvocellular and magnocellular based on the geniculate laminae in which they are located (Casagrande & Xu, 2004; Hendry & Reid, 2000; Sherman, 1985). Another cell type, called W in the cat and K in the monkey, has also been described, but it is unclear if this is one or several distinct classes, and these cells are relatively poorly understood; further details can be found in Casagrande and Xu (2004), Hendry and Reid (2000), and Sherman (1985). The X and Y cell types in the cat's lateral geniculate nucleus are innervated by equivalent, distinctive retinal cell types and thus represent independent, parallel retino-geniculo-cortical streams of information. X cells have smaller cell bodies and dendritic arbors that are largely bipolar and oriented perpendicular to the laminar borders of the lateral geniculate nucleus (Friedlander, Lin, Stanford, & Sherman, 1981; Stanford, Friedlander, & Sherman, 1983). Y cells have larger cell bodies and thicker dendrites arranged more or less in a spherical volume (Friedlander et al., 1981; Stanford et al., 1983). For both cell types, retinal inputs innervate relatively proximal dendrites, within about 100 µm of the cell body (Wilson, Friedlander, & Sherman, 1984). On Y cells, these retinal synapses are formed fairly simply onto dendritic shafts, but in X cells, these tend to contact curious grape-like appendages found near proximal dendritic branch points.

Figure 10.2 Examples of thalamic cell types from lateral geniculate nucleus of the cat. These tracings were made from intracellular labeling during in vivo recording (Friedlander et al., 1981; Hamos et al., 1985). A: X relay cell. Note the grape-like appendages near primary branch points. Three examples are shown at higher magnification. B: Y relay cell. C: Interneuron. The dendrites have the appearance of an axonal terminal arbor, and the many boutons seen among the dendrites are indeed synaptic boutons known as F2. Examples are shown at higher magnification. Scale: The scale bar is 50 µm for the cell drawings and 10 µm for the insets of A and C.

Interneurons

Interneurons are particularly interesting cells because, in addition to conventional axonal outputs, they also produce presynaptic terminals from their dendrites, and these dendritic outputs are more numerous than the axonal ones (Friedlander et al., 1981; Hamos, Van Horn, Raczkowski, Uhlrich, & Sherman, 1985; Wilson et al., 1984). Figure 10.2C shows an example of an interneuron from the lateral geniculate nucleus of the cat. The dendrites look so much like an axonal terminal arbor that they have been called axoniform (Guillery, 1966). The axonal arbor distributes within the dendritic arbor, and they look so much alike that, with light microscopy, it is often impossible to distinguish the axon. However, because the axon is myelinated and the dendrites are not, they can readily be distinguished with an electron microscope. Also, much work at the electron microscopic level (Famiglietti & Peters, 1972; Guillery, 1969; Hamos et al., 1985; Ralston, 1971) has made it possible to distinguish the axonal terminals (called F1) from the dendritic terminals (called F2; see Figure 10.2C). One important distinction is that the axonal (F1) terminals are strictly presynaptic
(to relay cells and other interneurons), whereas the dendritic (F2) terminals are both presynaptic (mostly to the grape-like clustered appendages of X cells; see Figure 10.2A and Wilson et al., 1984) and postsynaptic (mostly to retinal terminals). The F2 terminals are the only postsynaptic terminals so far described in the thalamus. The circuits entered into by these F2 terminals and the functional properties of interneurons are discussed further later in the chapter.

Thalamic Reticular Nucleus Cells

The final major cell type in the thalamus is the reticular cell, found in the thalamic reticular nucleus. These tend to have elongated dendritic arbors oriented parallel to the borders of the thalamic reticular nucleus (Figure 10.3; Uhlrich, Cucchiaro, Humphrey, & Sherman, 1991). Their axons project into the main relay nuclei of the thalamus and selectively target relay cells (Cucchiaro, Uhlrich, & Sherman, 1991), and local collaterals provide for contacts between reticular cells (Lam, Nelson, & Sherman, 2006; Sanchez-Vives, Bal, & McCormick, 1997). These cells are also functionally connected via gap junctions (Lam et al., 2006; Landisman et al., 2002).
Figure 10.3 Example of cell in the thalamic reticular nucleus of Galago filled with neurobiotin. The star in the inset shows the location of the cell body. Redrawn from Figure II-12 of Sherman and Guillery (2006), from data supplied by P. Smith, K. Manning, and D. Uhlrich. Abbreviations: As in Figure 10.1, plus IC, internal capsule.

CELL PROPERTIES OF THALAMIC RELAY NEURONS

Thalamic relay cells, like cells throughout the central nervous system, have numerous voltage- and time-gated ionic channels in their membranes. The best known of these are the Na+ and K+ channels underlying the conventional action potential (see Figure 10.4). There are many others, including channels for other cations. One that is especially important to thalamic relay cells involves T-type Ca2+ channels. (For details of T channel properties, see Huguenard & McCormick, 1994; Jahnsen & Llinás, 1984a, 1984b; and for other voltage-gated channels in thalamic neurons, see Huguenard & McCormick, 1994; Sherman & Guillery, 2006.) The properties of these T channels are qualitatively the same as those of Na+ channels involved with the action potential. Figure 10.4 summarizes these properties, emphasizing the similarities with the T channels shown in Figure 10.5.

Basic Properties of the T Channel

Figure 10.4 is a review of the main properties of the Na+ (and K+) channels underlying the action potential. When the Na+ channel is open, Na+ flows into the cell, producing a depolarizing current known as INa. However, the Na+ channel has two voltage-sensitive gates, an activation gate and an inactivation gate, and both must be open for Na+ to flow into the cell. At a normal resting membrane potential (e.g., -65 mV), the inactivation gate is open, but the activation gate is closed, and thus there is no inward flow of Na+ (Figure 10.4A). Here INa is deactivated because the activation gate is closed, but it is also relieved of inactivation (or is de-inactivated) because the inactivation gate is open. When the membrane is depolarized to a certain level, the activation threshold for INa (Figure 10.4B), the activation gate pops open and so INa is both activated and de-inactivated; the result is that Na+ flows into the cell, producing the depolarizing upswing of the action potential. This depolarization, after a suitable period of 1 msec or so, leads to closing of the inactivation gate, and so while the Na+ channel remains activated, it is also inactivated (Figure 10.4C). This, plus the opening of various slower K+ channels (channels that do not inactivate because they have only an activation gate), which produces a hyperpolarizing outward flow of K+, repolarizes the membrane to near its starting position (Figure 10.4D). However, despite being repolarized, INa
Figure 10.4 Schematic representation of voltage dependent Na+ and K+ channels underlying the conventional action potential. A–D show the channel events and E shows the effects on membrane potential. The Na+ channel has two voltage dependent gates: an activation gate that opens at depolarized levels and closes at hyperpolarized levels, and an inactivation gate with the opposite voltage dependency. Both must be open for the inward, depolarizing Na+ current (INa) to flow. The K+ channel (actually an imaginary combination of several different K+ channels) has a single activation gate, and when it opens at depolarized levels, an outward, hyperpolarizing K+ current is activated. A: At a resting membrane potential (roughly -60 to -65 mV), the activation gate of the Na+ channel is closed, and so it is deactivated, but the inactivation gate is open, and so it is de-inactivated. The single gate for the K+ channel is closed, and so the K+ channel is also deactivated. B: With sufficient depolarization to reach its threshold, the activation gate of the Na+ channel opens, allowing Na+ to flow into the cell. This depolarizes the cell, leading to the upswing of the action potential. C: The inactivation gate of the Na+ channel closes after the depolarization is sustained for roughly 1 msec (“roughly,” because inactivation is a complex function of time and voltage), and the slower K+ channel also opens. These combined channel actions lead to the repolarization of the cell. While the inactivation gate of the Na+ channel is closed, the channel is said to be inactivated. D: Even though the initial resting potential is reached, the Na+ channel remains inactivated, because it takes roughly 1 msec (“roughly” having the same meaning as above) of hyperpolarization for de-inactivation. E: Membrane voltage changes showing action potential corresponding to the events in A to D. Redrawn from Figure IV-4 of Sherman and Guillery (2006).
remains inactivated because it takes roughly 1 msec of this hyperpolarization to open the inactivation gate, restoring the initial conditions of Figure 10.4A. Thus the two gates of the Na+ channel have opposite voltage dependencies and both respond relatively quickly to voltage changes. Finally, note that the roughly 1 msec of hyperpolarization needed to de-inactivate the Na+ channel provides a refractory period limiting firing rates for the action potential to roughly 1000 Hz.

While Figure 10.4 shows the basic voltage-gated properties of the Na+ channel, one other feature is essential to propagating an all-or-none action potential. That is, the density of Na+ channels must be sufficiently high that, once threshold is reached, the further depolarization caused by the first channels to open causes a self-regenerating, explosive opening of other channels, and this propagates as an action potential. If the Na+ channel density were too low, the first channels to open would lead only to a local depolarization that would decay exponentially.

As shown in Figure 10.5, the voltage behavior of the T channel is qualitatively the same as that of the Na+ channel, with the same two types of voltage gate. At the starting position of Figure 10.5A, the activation gate is closed, but sufficient depolarization opens it (Figure 10.5B), allowing the inward IT that further depolarizes the cell. This depolarization eventually inactivates IT (Figure 10.5C) which, along with the activation of K+ channels, repolarizes the cell (Figure 10.5D). This repolarization eventually leads to de-inactivation of IT (Figure 10.5A).
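
The gate logic described above can be restated as a small truth table. The following Python sketch is an illustration only, not a biophysical model: the class name ChannelState, the tuple CYCLE, and the helper max_rate_hz are hypothetical names introduced here, and the gate positions are simply those read from panels A to D of Figures 10.4 and 10.5. It shows that current flows only in panel B, when both gates are open, and it reproduces the arithmetic behind the roughly 1000 Hz ceiling implied by a recovery period of roughly 1 msec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelState:
    """One panel (A-D) of the two-gate scheme; names are illustrative only."""

    panel: str
    activation_gate_open: bool
    inactivation_gate_open: bool

    @property
    def conducts(self) -> bool:
        # Current flows only when BOTH gates are open.
        return self.activation_gate_open and self.inactivation_gate_open

    @property
    def description(self) -> str:
        # Terminology from the text: an open activation gate means
        # "activated"; an open inactivation gate means "de-inactivated".
        act = "activated" if self.activation_gate_open else "deactivated"
        inact = "de-inactivated" if self.inactivation_gate_open else "inactivated"
        return f"{act}, {inact}"


# Gate positions for panels A-D, as described in the text.
CYCLE = (
    ChannelState("A (rest)", False, True),
    ChannelState("B (threshold reached)", True, True),
    ChannelState("C (sustained depolarization)", True, False),
    ChannelState("D (just repolarized)", False, False),
)


def max_rate_hz(recovery_ms: float) -> float:
    """Upper bound on event rate if each event needs recovery_ms to recover."""
    return 1000.0 / recovery_ms


if __name__ == "__main__":
    for state in CYCLE:
        flow = "current flows" if state.conducts else "no current"
        print(f"{state.panel}: {state.description} -> {flow}")
    print(f"Ceiling with 1 msec recovery: {max_rate_hz(1.0):.0f} Hz")
```

Because the T channel has the same two types of gate with the same relative voltage dependencies, the same table can be read for either channel.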
Figure 10.5 Schematic representation of actions of voltage dependent T (Ca2+) and K+ channels underlying low threshold Ca2+ spike; conventions as in Figure 10.4. Note the strong qualitative similarity between the behavior of the T channel here and the Na+ channel shown in Figure 10.4, including the presence of both activation and inactivation gates with similar relative voltage dependencies. A–D show the channel events and E shows the effects on membrane potential. A: At a relatively hyperpolarized resting membrane potential (roughly -70 mV), the activation gate of the T channel is closed, but the inactivation gate is open, and so the T channel is deactivated and de-inactivated. The K+ channel is also deactivated. B: With sufficient depolarization to reach its threshold, the activation gate of the T channel opens, allowing Ca2+ to flow into the cell. This depolarizes the cell, providing the upswing of the low threshold spike. C: The inactivation gate of the T channel closes after roughly 100 msec (“roughly,” because, as for the Na+ channel in Figure 10.4, closing of the channel is a complex function of time and voltage), inactivating the T channel, and the K+ channel also opens. These combined actions repolarize the cell. D: Even though the initial resting potential is reached, the T channel remains inactivated, because it takes roughly 100 msec of hyperpolarization for de-inactivation. E: Membrane voltage changes showing low threshold spike corresponding to the events in A to D. Redrawn from Figure IV-5 of Sherman and Guillery (2006).
As in the case of Na channels, if a sufficiently high density of T channels exists, the threshold opening of the initial T channels leads to an explosive all-or-none spike. This is the case for thalamic relay cells, and the result is a spike-like depolarization of roughly 25 to 50 mV that propagates throughout the dendrites and soma. T channels are quite common in neurons throughout the central nervous system, but only in rare cells is the density high enough to support all-or-none Ca2+ spikes. Thus, this property of all-or-none Ca2+ spiking based on T channels is nearly unique to the thalamus. Every relay cell of every nucleus in every mammalian species so far tested shows this property (Sherman & Guillery, 2006). However, a further inspection of Figures 10.4 and 10.5 reveals certain important quantitative differences between the behavior of the Na and Ca2+ channels. Perhaps most important are the temporal properties of the inactivation gates. While the activation gates for both channels respond quickly to voltage changes, as does the inactivation gate of the Na channel, the inactivation gate of the T channel is much slower, requiring roughly 100 msec of a sustained polarization change to open or close. Actually, as is the case for the Na channel, the inactivation gate of the T channel has a complex voltage- and time-dependency, so that the greater the sustained polarization change, the more rapidly the gate opens or closes (Zhan, Cox, Rinzel, & Sherman, 1999). This temporal property for the T channel is important and will be considered further. Another quantitative difference is the functional voltage range: the T channel operates in a more hyperpolarized regime. In fact, because the T channel activates at a more hyperpolarized level, the resulting depolarization, which in thalamic relay cells is an all-or-none Ca2+ spike, is also known as the “low threshold spike.”
One other important difference not shown in Figures 10.4 and 10.5 is the distribution of these channels: T channels are effectively limited to the soma and dendrites, whereas Na channels, often found there as well, are notable for their distribution along the axon. This allows action potentials to travel from the soma to a target far away, and in the case of thalamic relay cells, this Na channel distribution permits action potentials to be delivered to cortical targets. While T channels underlie Ca2+ spikes propagated in the dendrites and soma, these spikes do not propagate to the cortex. Thus the significance of these Ca2+ spikes ultimately rests with their effect on conventional action potentials as described in the following section. This effect is dramatic and important.
Burst and Tonic Firing Modes
The primary functional significance of T channels for thalamic relay cells is that they determine which of two very different response modes, called burst and tonic, characterizes these cells’ responses (Jahnsen & Llinás, 1984a; Zhan et al., 1999). Figure 10.6 summarizes some of the features of these response modes. If the cell has been initially depolarized by just a few mV (5 mV) from rest, the T channels are inactivated and play no role in the response. This leads to tonic firing (Figure 10.6A) where a depolarizing current injection elicits a stream of unitary action potentials that lasts as long as the stimulus is suprathreshold. If, however, the cell has been hyperpolarized initially by 5 mV or so from rest, the T channels are de-inactivated and primed to respond to the next suitable depolarization, and the result is burst firing. This is shown in Figure 10.6B where the same depolarizing stimulus as in Figure 10.6A now evokes an all-or-none low threshold Ca2+ spike with a burst of high frequency action potentials riding its crest.
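As a rough illustration of how recent voltage history sets the response mode, the inactivation gate can be caricatured as a single variable that relaxes toward a voltage-dependent steady state with a time constant of about 100 msec. This is a minimal sketch and not the model used in the studies cited above: the fixed time constant ignores the voltage dependence of the gating kinetics noted earlier, and the half-inactivation voltage, the slope factor, and the 0.5 criterion for bursting are hypothetical values chosen only for illustration.

```python
import math

TAU_H_MS = 100.0    # assumed fixed time constant of the inactivation gate (text: roughly 100 msec)
V_HALF_MV = -68.0   # hypothetical half-inactivation voltage
SLOPE_MV = 2.0      # hypothetical slope factor of the steady-state curve


def h_steady_state(v_mv):
    """Fraction of T channels de-inactivated after a long time at v_mv."""
    return 1.0 / (1.0 + math.exp((v_mv - V_HALF_MV) / SLOPE_MV))


def h_after_hold(h_start, v_hold_mv, duration_ms):
    """Exponential relaxation of the gate toward its steady state at v_hold_mv."""
    h_inf = h_steady_state(v_hold_mv)
    return h_inf + (h_start - h_inf) * math.exp(-duration_ms / TAU_H_MS)


def response_mode(h):
    """Burst if most T channels are available, otherwise tonic (hypothetical criterion)."""
    return "burst" if h > 0.5 else "tonic"


if __name__ == "__main__":
    cases = [(0.0, -77.0, 300.0), (1.0, -59.0, 300.0), (0.0, -77.0, 10.0)]
    for h0, v_hold, duration in cases:
        h = h_after_hold(h0, v_hold, duration)
        print(f"start h={h0:.1f}, hold {v_hold:.0f} mV for {duration:.0f} msec "
              f"-> h={h:.2f} ({response_mode(h)})")
```

Under these assumptions, a 300 msec hyperpolarization primes the cell to burst, whereas a 10 msec hyperpolarization to the same potential leaves it in tonic mode, which is the qualitative point made in the text.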
The exact same stimulus (think of this as the same excitatory postsynaptic potential or EPSP evoked from the same retinal input to a geniculate relay cell) creates a very different pattern of action potentials depending on the recent voltage history of the relay cell, and this pattern of firing is the only signal that reaches cortex. To summarize: The recent voltage history of a relay cell determines the inactivation state of its T channels, and this, in turn, determines whether the relay cell responds to its next suprathreshold excitatory input in tonic or burst mode, a determination that dramatically affects the message relayed to the cortex.
Significance of Response Mode for Thalamocortical Relays
A major question for which we have only partial and largely hypothetical answers is: What is the functional significance
Figure 10.6 Properties of IT and the low threshold Ca2+ spike. All examples are from relay cells of the cat’s lateral geniculate nucleus recorded intracellularly in an in vitro slice preparation. A, B: Voltage dependency of the low threshold spike. Responses are shown to the same depolarizing current pulse delivered intracellularly but from two different initial holding potentials. When the cell is relatively depolarized (A), IT is inactivated, and the cell responds in tonic mode, which is a stream of unitary action potentials to a suprathreshold stimulus. When the cell is relatively hyperpolarized (B), IT is de-inactivated, and the cell responds in burst mode, which involves activation of a low threshold Ca2+ spike (LTS) with multiple action potentials (8 in this example) riding its crest. C: Input-output relationship for another cell. The abscissa is the amplitude of the depolarizing current pulse, and the ordinate is the firing frequency of the cell for the first 6 action potentials of the response, since this cell usually exhibited 6 action potentials per burst in this experiment. The initial holding potentials are shown: –47 mV and –59 mV reflect tonic mode, whereas –77 mV and –83 mV reflect burst mode. Redrawn from Figure IV-6 of Sherman and Guillery (2006).
of the burst and tonic response modes for thalamic relay performance? One answer comes from a consideration of the fact that the only message reaching the cortex is in the form of action potentials and they are evoked differently in the two response modes. During tonic firing, action potentials are
directly evoked by an appropriate, suprathreshold depolarizing stimulus (e.g., an EPSP), and so a larger EPSP will evoke more firing. In other words, there is a relatively linear relationship between input (or EPSP) amplitude and firing rate. During burst firing, however, action potentials are not directly activated by the depolarizing input; instead they are activated by the large, depolarizing low threshold Ca2+ spike. Because this Ca2+ spike is all-or-none, a larger depolarizing input or EPSP will not evoke a larger Ca2+ spike, and thus the input-output relationship during burst firing is highly nonlinear, approximating a step function. These differences are illustrated in Figure 10.6C (Zhan et al., 1999).
Figure 10.7 shows related and additional effects of response mode. In this example, a geniculate relay cell is recorded intracellularly in an anesthetized cat while its responses to visual stimuli are monitored. These responses indicate how retinal input is relayed to the cortex. Because of the intracellular recording, it is possible to pass current into the cell either to depolarize its baseline level sufficiently to inactivate IT (e.g., baseline depolarized to –65 mV in Figure 10.7A) and promote tonic firing or to hyperpolarize it (e.g., baseline hyperpolarized to –75 mV in Figure 10.7B) so as to de-inactivate IT and promote burst firing. The visual stimulus in this case is a drifting sinusoidal grating, in which contrast varies sinusoidally with time at 2 Hz. Figure 10.7A, lower, shows that the tonic response profile is sinusoidal and accurately reflects the contrast changes in the stimulus. The response in burst mode (Figure 10.7B, lower) does not accurately reflect the contrast changes, showing the sort of nonlinear distortion that can largely be predicted by the cellular properties shown in Figure 10.6C.
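The contrast between the two input-output relationships described for Figure 10.6C can be sketched with two toy functions. This is only an illustration under stated assumptions: the threshold, the gain, and the fixed burst rate are hypothetical numbers in arbitrary units, not values taken from the recordings.

```python
def tonic_rate_hz(input_amplitude, threshold=1.0, gain_hz_per_unit=50.0):
    """Tonic mode: firing rate grows roughly linearly with suprathreshold input.

    threshold and gain_hz_per_unit are hypothetical, in arbitrary input units.
    """
    return max(0.0, gain_hz_per_unit * (input_amplitude - threshold))


def burst_rate_hz(input_amplitude, lts_threshold=1.0, burst_rate=300.0):
    """Burst mode: the all-or-none low threshold spike sets the rate.

    Any input at or above lts_threshold yields the same (hypothetical) burst_rate.
    """
    return burst_rate if input_amplitude >= lts_threshold else 0.0


if __name__ == "__main__":
    for amplitude in (0.5, 1.5, 2.0, 3.0):
        print(amplitude, tonic_rate_hz(amplitude), burst_rate_hz(amplitude))
```

In the tonic function a larger input always yields a higher rate, whereas in the burst function every suprathreshold input yields the same rate, which is the step-like behavior that distorts the relay of a sinusoidally varying stimulus.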
This provides an obvious advantage for tonic firing because the nonlinear distortion caused by burst firing will limit the fidelity of the message relayed to the cortex. In other words, to faithfully reconstruct the visual world, the cortex is better served by tonic firing. What, then, is the purpose of burst firing? Two possible advantages have been suggested. One advantage is related to spontaneous firing, which is much lower during burst firing (Figure 10.7A, B, upper histograms; Guido, Lu, & Sherman, 1992; Guido, Lu, Vaughan, Godwin, & Sherman, 1995). Actually, the higher spontaneous activity helps preserve linearity during tonic firing because the raised level of activity allows inhibitory components of the visual stimulus to be represented; without this, the response would “bottom out,” reflecting rectification, which is itself a nonlinearity. There is another, perhaps more important, consequence of the difference in spontaneous activity levels. Spontaneous activity can be considered noise from the perspective of the cortex because, by definition, it represents firing in the geniculate relay cell that bears no relation to visual stimulation. The lower histograms in Figure 10.7A and B suggest that
Figure 10.7 Responses of a representative relay cell in the lateral geniculate nucleus of a lightly anesthetized cat to a sinusoidal grating drifted through the cell’s receptive field. The trace at the bottom reflects the sinusoidal changes in luminous contrast with time. Current was injected into the cell through the recording electrode to alter the membrane potential. Thus in A, the current injection was adjusted so that the membrane potential without visual stimulation averaged –65 mV, promoting tonic firing, because IT is mostly inactivated at this membrane potential; in B, the current injection was adjusted to the more hyperpolarized level of –75 mV, permitting de-inactivation of IT and promoting burst firing. Shown are average response histograms to the visual stimulus (bottom histograms in A and B) and during spontaneous activity with no visual stimulus (top histograms), plotting the mean firing rate as a function of time averaged over many epochs of that time. The sinusoidal changes in contrast as the grating moves across the receptive field are also shown as a dashed, gray curve superimposed on the responses in the lower histograms. Note that the response profile during the visual response in tonic mode looks like a sine wave, but the companion response during burst mode does not. Note also that the spontaneous activity is higher during tonic than during burst firing. Redrawn from Figure VI-2 of Sherman and Guillery (2006).
the signal relayed during both response modes to visual stimulation is roughly equal in extent, and so the lower noise during burst firing suggests that the signal-to-noise ratio is higher during burst firing. A higher signal-to-noise ratio, in turn, suggests better detectability of the stimulus in the response of the geniculate relay cell, and this has been demonstrated (Guido, Lu, et al., 1995). Another advantage of burst firing is that it more powerfully affects the cortex (Swadlow & Gusev, 2001; Swadlow, Gusev, & Bezdudnaya, 2002). This is because the thalamocortical synapse shows strong paired-pulse depression (Abbott, Varela, Sen, & Nelson, 1997; Castro-Alamancos & Connors, 1997; Chung, Li, & Nelson, 2002; Gil, Connors, & Amitai, 1999; Lee & Sherman, 2008, 2009). This is shown in Figure 10.8A, where a facilitating layer 6 corticothalamic synapse (Reichova & Sherman, 2004) is also shown for comparison. For a depressing synapse, action potentials
Figure 10.8 Examples of paired-pulse depression and paired-pulse facilitation. These recordings were made from in vitro slices of the mouse brain in which thalamocortical and corticothalamic projections are retained in the somatosensory system. When electrical stimulation is applied as a train of impulses at fixed frequency to afferents of a recorded cortical or thalamic cell, the resultant EPSPs decrease with stimulus number (paired-pulse depression) or increase (paired-pulse facilitation). A: Example of paired-pulse depression (upper trace; recording from a layer 4 cell and activating inputs from thalamus) and paired-pulse facilitation (lower trace; recording from thalamic relay cell and activating inputs from layer 6 of cortex). B: Time course of paired-pulse effects for the examples in A. The abscissa shows the interstimulus interval, and the ordinate, the measure of depression (left) or facilitation (right) expressed as the ratio of the amplitude of the second EPSP to the amplitude of the first. Unpublished data from laboratory of S.M. Sherman.
arriving with interspike intervals of less than 50 to 150 msec (see Figure 10.8B) or so will depress the postsynaptic responses, resulting in a smaller EPSP. During tonic firing, interspike intervals are sufficiently short to keep the thalamocortical synapses in more or less a constant state of depression. However, the dynamics of burst firing result in a synapse with no depression. This is because, to burst, a cell must be in a sustained state of hyperpolarization for 100 msec or so (to de-inactivate T channels) before responding to a depolarizing EPSP, and so there can be no action potentials during this period; this imposes a requisite silent period on a cell before each burst, meaning that, when the burst is evoked, the thalamocortical synapse is free of depression. Elegant experiments by Swadlow and colleagues (Swadlow & Gusev, 2001; Swadlow et al., 2002) have directly confirmed this (Figure 10.9).
Hypothesis for Burst and Tonic Firing
To summarize the known functional consequences of firing mode (and there may be many other, undiscovered ones), tonic mode is associated with a more linear relay, while burst mode is associated both with superior signal detection and stronger cortical activation. This has led to the theory (Sherman, 1996, 2001; Sherman & Guillery, 2002, 2006) that burst mode may be involved in providing a strong “wake-up call” to the cortex that something has changed in the environment (e.g., the sudden appearance of a new visual stimulus), particularly in circumstances during which attention is not devoted to the relay under question (e.g., general drowsiness, or inattention, or for auditory thalamic relays while attention is diverted to visual stimuli). There are several very indirect lines of evidence in support of this. One is that bursting of thalamic relay cells increases with drowsiness (Massaux & Edeline, 2003; Ramcharan, Gnadt, & Sherman, 2000; Swadlow & Gusev, 2001).
Another is that the initial cycle of a repeating visual stimulus tends more frequently to evoke bursting of geniculate relay cells (Guido & Weyand, 1995). Finally, studies of visual responses, including the use of natural visual scenes as stimuli, indicate that the best stimulus to evoke a burst is the replacement in the visual field of an inhibitory stimulus with an excitatory one, for instance, the replacement of a dark spot over the center of an on-center cell with a bright spot (Alitto, Weyand, & Usrey, 2005; Denning & Reinagel, 2005; Lesica & Stanley, 2004; Wang et al., 2007). For this hypothesis to make sense, thalamic circuitry must be arranged in a manner that can efficiently control response mode, promoting the transition between burst and tonic firing under appropriate conditions. That is, there must be inputs to relay cells that effectively control membrane potential for a sufficiently long period (i.e., for at least 100 msec or so) to control the inactivation state of IT. As the next section shows, this is indeed the case.
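The earlier argument that the silent period before a burst frees the thalamocortical synapse from depression can be illustrated with a generic resource-depletion model of a depressing synapse. This is a sketch under stated assumptions and not the analysis used in the cited experiments: each spike is assumed to consume a fixed fraction of an available synaptic resource, which then recovers exponentially, and the use fraction, the recovery time constant, and the spike times are all hypothetical.

```python
import math


def epsp_amplitudes(spike_times_ms, use_fraction=0.5, tau_recovery_ms=50.0):
    """Relative EPSP amplitude for each presynaptic spike at a depressing synapse.

    use_fraction and tau_recovery_ms are hypothetical illustration values.
    """
    resources = 1.0          # fully recovered synapse
    last_time = None
    amplitudes = []
    for t in spike_times_ms:
        if last_time is not None:
            # Resources recover exponentially toward 1 between spikes.
            elapsed = t - last_time
            resources = 1.0 - (1.0 - resources) * math.exp(-elapsed / tau_recovery_ms)
        released = use_fraction * resources
        amplitudes.append(released)   # EPSP taken as proportional to resource released
        resources -= released
        last_time = t
    return amplitudes


if __name__ == "__main__":
    tonic_train = [20.0 * i for i in range(10)]    # tonic spikes every 20 msec
    burst_onset = tonic_train[-1] + 120.0          # first burst spike after a 120 msec silence
    amps = epsp_amplitudes(tonic_train + [burst_onset])
    print("last tonic EPSP :", round(amps[-2], 3))
    print("first burst EPSP:", round(amps[-1], 3))
```

With these assumed parameters, the EPSP evoked by the first spike after the silent period is substantially larger than the depressed EPSPs late in the tonic train, in line with the stronger cortical activation described for bursts.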
Figure 10.9 (Figure C.3 in color section) Current source density profiles in cortex generated from single spike in afferent thalamic neuron. Recordings were made simultaneously in an awake rabbit from a single neuron in the ventral posterior medial nucleus of the thalamus and from 16 probes at different depths along a column in the cortical target field of the recorded thalamic neuron. Spike triggered averaging was used to generate the synaptic sinks and sources as shown. A, B: Colorized current source density profile generated by tonic spike (A, ~120,000 thalamic action potentials) or first spike in burst (B, 2427 thalamic action potentials) in the thalamic afferent. The vertical orange line in each indicates the time of the action potential in the thalamic afferent. The red arrow in each shows the current source
evoked by the terminals of the thalamic afferent, and note that this is the same for the tonic and burst spike. The depths of layers 4 and 6 are also indicated. The vertical dashed white lines show the initial 1 msec of the postsynaptic responses, with large sinks in layers 4 and 6. Note the denser sinks for the burst spike (B) compared to the tonic spike (A). C, D: Amplitude (peak-to-peak) of the axon terminal response (C, indicated by the red arrows in A, B) and the magnitude of the initial 1 msec of the postsynaptic current sink (D) plotted at different recording sites for both the tonic spike and first spike in a burst. Note that there is no difference in the thalamocortical terminal responses for these two spikes but that the peaks in layers 4 and 6 are greater for the burst spike. Redrawn from Figure 3 of Swadlow et al. (2002).
CIRCUIT PROPERTIES OF THALAMIC RELAY NEURONS
Fortunately, the detailed circuit properties of the thalamus are largely conserved among thalamic nuclei. To be sure, there are some differences in circuitry among thalamic nuclei, and certain of these will be discussed next. Because we know most about the lateral geniculate nucleus, this serves as a convenient model for all of the thalamus. Figure 10.10 schematically shows the main circuitry involving geniculate neurons, including the main transmitters and classes of postsynaptic receptor involved (these circuit details are reviewed in Sherman & Guillery, 1996, 2004, 2006).
Basic Anatomical Circuits
Relay cells receive input from retinal axons and, in turn, project to visual cortex, mostly to layer 4 but also to layer 6. Also intimately associated with relay cells are two types of local, GABAergic neurons that provide inhibitory input to relay cells: these are local interneurons and cells of the nearby thalamic reticular nucleus. Interneurons lie among relay cells throughout the relay nuclei of the thalamus, and the ratio is roughly three relay cells to every interneuron across nuclei and species, with one curious exception (Arcelli, Frassoni, Regondi, De Biasi, & Spreafico, 1997). That is, while the lateral geniculate nucleus of the rat and mouse contains roughly 25% interneurons, the rest of the thalamus in these species contains almost no interneurons. This is not a general rodent property because the thalamus of other rodent species, like squirrels and guinea pigs, contains normal numbers of interneurons. There are two major sources of extrinsic input to geniculate circuitry (see Figure 10.10). One is a feedback glutamatergic projection from layer 6 of the visual cortex, and the other is a mostly cholinergic input from
Figure 10.10 Schematic view of details of the main connections of the lateral geniculate nucleus. Indicated are the inhibitory or excitatory nature of the synapses, the postsynaptic receptors activated by each input on relay cells, and the neurotransmitters involved. Abbreviations: ACh, acetylcholine; GABA, γ-aminobutyric acid; Glu, glutamate; LGN, lateral geniculate nucleus; PBR, parabrachial region; TRN, thalamic reticular nucleus.
the parabrachial region of the midbrain. In both cases, individual axons branch to innervate all three thalamic cell classes: relay cells, interneurons, and reticular cells. Not shown are various serotonergic, noradrenergic, GABAergic, and dopaminergic inputs from the brain stem and histaminergic inputs from the tuberomamillary nucleus of the hypothalamus. They are omitted partly to avoid unnecessary complication and also because the functional significance of these other inputs is just beginning to be understood (see Sherman & Guillery, 1996, 2004, 2006). Thus, relay cells receive inputs not only from the retina, which represents the main input relayed to the cortex, but also from other sources.
Postsynaptic Receptors on Relay Cells
It is clear from Figure 10.10 that nonretinal inputs can influence retinogeniculate transmission. All of these inputs to relay cells operate via conventional chemical synapses, and thus their postsynaptic effects are largely controlled by postsynaptic receptors. These, too, are illustrated in Figure 10.10, and they are divided into two main groups: ionotropic and metabotropic. Examples of ionotropic receptors for the transmitter systems shown are AMPA receptors for glutamate, the GABAA receptor, and nicotinic receptors for acetylcholine; the equivalent metabotropic receptor examples are various metabotropic glutamate receptors, the GABAB receptor, and various muscarinic receptors for acetylcholine. Details of differences between these receptor classes are many (Brown et al., 1997; Conn & Pin, 1997; Mott & Lewis, 1991; Nicoll, Malenka, & Kauer, 1990; Pin &
Duvoisin, 1995; Recasens & Vignes, 1995), but two major differences are particularly relevant here. First, excitatory and inhibitory postsynaptic potentials (EPSPs and IPSPs) generated via ionotropic receptors tend to be very brief, on the order of 10 msec or a few 10s of msec, whereas those generated via metabotropic receptors are much more sustained, lasting 100s of msec to several seconds. Second, metabotropic receptors tend to be less sensitive in the sense that afferent firing rates usually need to be higher before they are activated; this is because these receptors tend to be located somewhat eccentrically in the synapse relative to ionotropic receptors (Lujan, Nusser, Roberts, Shigemoto, & Somogyi, 1996; Somogyi, Tamas, Lujan, & Buhl, 1998), and so more transmitter must be released to reach them. With these differences in mind, it is interesting that retinal input activates only ionotropic receptors (mostly AMPA), whereas all of the nonretinal inputs activate metabotropic receptors, often in addition to activation of ionotropic receptors. One input for which the postsynaptic receptor is not as clear is the input from interneurons to relay cells: clearly GABAA receptors are involved, but there have as yet been no definitive tests for the presence or absence of GABAB receptors for this input.
Consequences of Type of Postsynaptic Receptor
The fact that only ionotropic receptors are activated by retinal input is good for transfer of temporal information. That is, because the evoked EPSPs are brief, temporal summation does not occur until relatively high rates of firing in the retinal afferents, and thus it is possible to evoke a single EPSP for every retinal action potential for reasonably high rates of firing, thereby representing each input action potential as an EPSP in a one-to-one manner.
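The dependence of temporal summation on EPSP duration can be made concrete with a small calculation. This is a minimal sketch under simplifying assumptions: each EPSP is treated as an instantaneous unit rise followed by a single exponential decay, the EPSPs add linearly, and the decay constants (10 msec for an ionotropic response and 500 msec for a metabotropic one) and the 20 Hz input rate are hypothetical values within the ranges given in the text.

```python
import math


def residual_fraction(firing_rate_hz, epsp_decay_ms):
    """Fraction of one EPSP still present when the next input spike arrives."""
    interval_ms = 1000.0 / firing_rate_hz
    return math.exp(-interval_ms / epsp_decay_ms)


def summed_peaks(n_spikes, firing_rate_hz, epsp_decay_ms):
    """Peak depolarization after each spike of a regular train (unit EPSPs, linear sum)."""
    carry_over = residual_fraction(firing_rate_hz, epsp_decay_ms)
    peaks = []
    potential = 0.0
    for _ in range(n_spikes):
        potential = potential * carry_over + 1.0
        peaks.append(potential)
    return peaks


if __name__ == "__main__":
    for label, decay_ms in (("brief (ionotropic-like)", 10.0),
                            ("sustained (metabotropic-like)", 500.0)):
        peaks = summed_peaks(10, 20.0, decay_ms)
        print(f"{label}: residual={residual_fraction(20.0, decay_ms):.3f}, "
              f"10th peak={peaks[-1]:.2f}")
```

In this sketch the brief EPSPs have almost fully decayed before the next spike, so each input spike remains a distinct event, whereas the sustained EPSPs pile up into a slowly varying depolarization in which individual input spikes are no longer resolved.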
Put another way, if retinal inputs activated metabotropic glutamate receptors, the sustained EPSPs would summate at lower firing rates, and the postsynaptic responses ultimately relayed to the cortex would no longer be a precise copy of the retinal input. The representation of EPSPs evoked via metabotropic glutamate receptors would act like a low pass temporal filter, and temporal information would be lost. In this regard, the activation of metabotropic receptors, because of their long time course, would seem to provide a poor substrate for effective information transfer but an excellent one for modulation. In contrast, the sustained PSPs evoked by nonretinal inputs to relay cells mean that sufficient activation of these inputs will provide rather lengthy effects on membrane potential, and thus excitability, of the relay cell. In this way, these nonretinal inputs will serve to modulate the gain or effectiveness of retinogeniculate transmission. Other consequences of these nonretinal inputs can be seen in their control of voltage gated ion channels, and a good
example of this is their ability to control the response mode (burst or tonic) of the relay cell.
Control of Response Mode
Recall from the previous description of T channel behavior that inactivation or de-inactivation requires a change in membrane potential to be sustained for at least roughly 100 msec. PSPs activated via ionotropic receptors are poorly suited to this, because they are too brief. Thus, for instance, an AMPA- or nicotinic-mediated EPSP is too brief to inactivate many T channels for a cell in burst mode, and a GABAA-mediated IPSP is too brief to relieve many T channels of their inactivation for a cell in tonic mode. However, the sustained PSPs of metabotropic receptors, lasting 100s of msec, are ideally suited to control response mode. Thus activation of metabotropic glutamate receptors via layer 6 corticogeniculate input or muscarinic receptors via parabrachial input produces an EPSP sustained enough to inactivate T channels and switch relay cell firing mode from burst to tonic; likewise, activation of GABAB receptors produces an IPSP sustained enough to de-inactivate T channels and switch relay cell firing mode from tonic to burst.
Further Details of Effects of Corticogeniculate or Parabrachial Inputs
Another consequence of the postsynaptic receptor is that it often determines whether a given neurotransmitter acts in an excitatory or inhibitory manner. In the case of the circuitry shown in Figure 10.10, cholinergic inputs excite relay cells while they inhibit interneurons and reticular cells. This is achieved by two types of muscarinic receptors (McCormick, 1992). Those on relay cells are mostly of the M1 type, and activation of M1 receptors leads to closing of K+ channels, reducing the outward leakage of K+ ions and thereby resulting in an EPSP. Those on the GABAergic cells are mostly of the M2 type, and activation of M2 receptors leads to opening of K+ channels, increasing the outward leakage of K+ ions and thereby resulting in an IPSP.
This allows cholinergic inputs to the thalamus to perform a neat trick: they directly excite relay cells while they indirectly disinhibit them. As a result, increasing activity of parabrachial neurons leads to more depolarized relay cells, making them more responsive to retinal input and biasing them toward the tonic firing mode. Indeed, parabrachial cells become more active with increasing vigilance (Datta & Siwek, 2002; Steriade & Contreras, 1995), and more vigilance is associated with increased retinogeniculate transmission and a shift toward tonic firing (Massaux & Edeline, 2003; Ramcharan et al., 2000; Swadlow & Gusev, 2001).
The situation with corticogeniculate inputs is more complex. These inputs to relay cells and local GABAergic cells are all excitatory, and thus the circuitry shown in Figure 10.10 suggests that corticogeniculate input directly excites relay cells but indirectly inhibits them, and it is not clear from this perspective what purpose this serves or what effect corticogeniculate input has on the firing mode of relay cells. However, as Figure 10.11 indicates, Figure 10.10 may be misleading in terms of the specifics of corticogeniculate circuitry because it does not reveal important details. Figure 10.11 shows two distinct variants of this circuitry involving the thalamic reticular nucleus; one can imagine similar variants involving interneurons and, of course, other variants are possible. The variant shown in Figure 10.11A is an example of classical feedforward inhibition. It might seem puzzling because increased activity leads to both depolarization and (indirect) hyperpolarization of the relay cell, with perhaps minimal effect on the relay cell’s membrane potential. This would have very little effect on T channel inactivation and thus little effect on response mode. However, the resultant increase in synaptic conductance would reduce input resistance of the relay cell, and this and other subtle effects pointed out by Chance, Abbott, and Reyes (2002) would result in a reduced retinogeniculate EPSP amplitude. In other words, this form of feedforward inhibition acts as an effective means of gain control for retinogeniculate transmission. The variant shown in Figure 10.11B has quite a different functional significance. This is no longer an example of feedforward inhibition, but instead, an active corticogeniculate axon will directly excite some relay cells and indirectly
Figure 10.11 Schematic view of different possible corticothalamic circuits involving the thalamic reticular nucleus that have quite different effects on relay cells. A: Feedback inhibitory arrangement. B: Arrangement in which activation of layer 6 cell monosynaptically excites some relay cells (e.g., cell 2) and disynaptically inhibits others (e.g., cells 1 and 3). See text for details.
inhibit others. In this specific example, increased activity in the corticogeniculate axon will depolarize geniculate cell 2, biasing it toward tonic firing, and hyperpolarize cells 1 and 3, biasing them toward burst firing. Evidence exists that activation of layer 6 corticogeniculate input can have dramatic effects on response mode, switching some relay cells from burst to tonic firing and others in the opposite direction (Wang, Jones, Andolina, Salt, & Sillito, 2006).
Role of Interneurons
Interneurons are particularly interesting cells because, among other properties, they have both axonal (F1) and dendritic (F2) output terminals. The axonal outputs seem to innervate both X and Y relay cells and other interneurons on proximal dendritic shafts with conventional, simple synapses. The dendritic outputs target relay X cells in complex synaptic arrangements known as triads (see Figures 10.12 and 10.13).
Figure 10.13 Schematic view of a synaptic triad. Arrows indicate direction of synaptic function, pointing from presynaptic to postsynaptic elements. The question marks indicate that the presence of the receptor indicated is unclear. Abbreviations: GABA, γ-aminobutyric acid; Glu, glutamate.
Triadic Circuits
The F2 dendritic outputs of interneurons enter into complex synaptic arrangements known as triads (see Figures 10.12 and 10.13). In the most common form, a retinal terminal
Figure 10.12 Electron micrographs showing some properties of F2 terminals based on intracellular labeling with horseradish peroxidase of an interneuron in the cat’s lateral geniculate nucleus. A: F2 terminal appended to interneuron dendrite via long, thin process (arrow). B: Section through triad. A retinal terminal (R) synapses onto an F2 terminal and a relay cell dendrite (d), and the F2 terminal synapses onto the same dendrite. The arrows show the direction of the synapses, pointing from presynaptic to postsynaptic elements. Figure reassembled from Hamos et al. (1985).
contacts an F2 terminal, and both of these terminals contact the same relay X cell, usually on a grape-like appendage (Hamos et al., 1985; Wilson et al., 1984). This would appear to be a form of simple feedforward inhibition, but a consideration of the postsynaptic receptors involved suggests a more interesting possibility. Release of GABA from the F2 terminal inhibits the relay cell, and the rate of GABA release is strongly determined by the retinal input to the F2 terminal. The retinal input is glutamatergic. As noted, the retinal input to the relay cell acts via ionotropic receptors, but recent evidence (Cox & Sherman, 2000; Govindaiah & Cox, 2004) suggests that the retinal input to the F2 terminal operates mainly via metabotropic glutamate receptors (Figure 10.13). Also as noted, metabotropic receptor activation requires higher firing rates in the afferent. The implication here is that, at low firing rates, relay cells will be depolarized via the retinal input, but the feedforward circuit via the F2 terminal will not be activated, and so there will be no feedforward hyperpolarization. As the firing rate in the retinal afferent increases, more and more of the feedforward inhibition will be brought into play to offset the increasing, direct depolarization. There are two possible and related implications to this (Sherman, 2004). First, one function of this circuit is to extend the operating range of the retinogeniculate circuit.
That is, if the retinal input fires at a high enough frequency to cause the relay cell to fire at its maximum frequency, thereby saturating its response, further increases in retinal firing cannot be represented in the relay. With the triadic circuit, higher retinal firing rates are needed to saturate the relay cell’s response than would be needed without it. Second, as the firing rate in the retinal afferent increases, the gain of retinogeniculate transmission is reduced. Furthermore, because the metabotropic response lasts so long, estimated to be several seconds in this example (Govindaiah & Cox, 2004), this reduced gain will continue for a period even if the retinal input reduces its firing level. Since retinal firing level generally increases monotonically with contrast in the visual stimulus, periods of higher stimulus contrast will produce a short, several-second period of reduced visual sensitivity. This phenomenon, known as contrast gain control, is a central feature of the visual system (Geisler & Albrecht, 1995; Määttänen & Koenderink, 1991; Ohzawa, Sclar, & Freeman, 1982). While there is evidence for contrast gain control having neuronal substrates in the retina and the cortex (Beaudoin, Borghuis, & Demb, 2007; Ohzawa et al., 1982; Bernardete, Kaplan, & Knight, 1992; Truchard, Ohzawa, & Freeman, 2000), this may also occur via thalamic processing (Sherman, 2004). Functioning of the Interneuron The F2 terminals are connected to each other and to the stem dendrite via long, thin processes (typically 10 µm in length and 1 µm in diameter; see Figure 10.12A).
Modeling (Bloomfield & Sherman, 1989) suggests that, if there are no significant active processes in the membranes involved (an important proviso), then any membrane potential changes generated in the F2 terminal (e.g., from activation of metabotropic glutamate receptors) would effectively decay before reaching the stem dendrite and thus have no discernible effects on other F2 terminals or on the cell body. This modeling further suggests that synaptic inputs that effectively control the axonal output are essentially limited to the soma itself and proximal dendrites. The hypothesis, then, is that the interneuron massively multiplexes, with an axonal output controlled by conventional means via proximal inputs and dendritic outputs controlled locally and independently via direct inputs onto these F2 terminals (Sherman, 2004). Generality of Circuit Properties While Figures 10.10 through 10.13 refer specifically to the lateral geniculate nucleus, with minor exceptions, the principles they represent seem to be found throughout the thalamus. An important proviso is that these properties have
been documented for thalamic nuclei for which sufficient information is available, but there are some that have not been much studied to date. Most of our knowledge is based on studies of thalamic nuclei that project mainly to layer 4 of the cortex, but some nuclei, such as the midline and intralaminar nuclei (see Figure 10.1), project largely to layer 1, and very little is known of their detailed cell and circuit properties.
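The operating-range argument made earlier for the triadic circuit can be illustrated with a toy calculation. The sketch below is not a biophysical model and is not taken from the studies cited; it simply assumes a linear direct (ionotropic) drive, a feedforward inhibition that is recruited only above an assumed metabotropic activation threshold, and a fixed maximum relay firing rate. The function names and all numerical values (`mglu_threshold_hz`, `inhibitory_gain`, `max_output_hz`) are hypothetical, and the slow, several-second persistence of the metabotropic response is deliberately left out.

```python
# Toy model of the retinogeniculate triad.
# All parameter values are illustrative assumptions, not measured quantities.

def relay_output(retinal_rate_hz, with_triad=True,
                 mglu_threshold_hz=20.0, inhibitory_gain=0.5,
                 max_output_hz=50.0):
    """Return a relay-cell firing rate (Hz) for a given retinal firing rate."""
    # Direct retinal drive via ionotropic receptors: present at all rates.
    excitation = retinal_rate_hz
    # Feedforward inhibition via the F2 terminal: recruited only once the
    # retinal rate exceeds the assumed metabotropic activation threshold.
    inhibition = 0.0
    if with_triad:
        inhibition = inhibitory_gain * max(0.0, retinal_rate_hz - mglu_threshold_hz)
    # The relay cell cannot fire above its maximum rate (saturation).
    return min(max_output_hz, max(0.0, excitation - inhibition))


def saturation_rate(with_triad, max_output_hz=50.0):
    """Lowest integer retinal rate (Hz) at which the relay output saturates."""
    for rate in range(0, 1001):
        out = relay_output(float(rate), with_triad, max_output_hz=max_output_hz)
        if out >= max_output_hz:
            return rate
    return None


if __name__ == "__main__":
    for rate in (10, 30, 50, 80):
        print(rate,
              relay_output(rate, with_triad=False),
              relay_output(rate, with_triad=True))
    print("saturation without triad:", saturation_rate(False), "Hz")
    print("saturation with triad:", saturation_rate(True), "Hz")
```

With these assumed numbers the relay saturates at a retinal rate of 50 Hz without the triad and 80 Hz with it, and the slope of the input/output relation (the gain) halves above the threshold, which is the qualitative behavior the text describes.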
DRIVERS AND MODULATORS A glance at Figure 10.10 reveals a common situation in brain circuitry that is often ignored or overlooked. That is, relay cells receive inputs from many different sources, but these do not act as some sort of anatomical democracy to equally affect relay cell responses. In fact, only one of these inputs, the retinal for the lateral geniculate nucleus and equivalent for other nuclei (e.g., lemniscal input for the ventral posterior nucleus and inferior collicular input for the medial geniculate nucleus), represents the actual input to be relayed to the cortex. In the case of the lateral geniculate nucleus, for example, the receptive fields of the relay cells represent the information relayed to the cortex, and these receptive fields have the same center/surround configuration as their retinal inputs but are very different from the orientation and direction selective receptive fields of layer 6 cells, not to mention the lack of clear visual receptive fields for parabrachial inputs (reviewed in Sherman & Guillery, 1996, 2006). The retinal input stands alone in terms of being the main information source to be relayed, but it also differs from nonretinal input along a number of anatomical, physiological, and pharmacological properties, and these differences extend to other thalamic nuclei. This has led to the conclusion that these form two different types of input exemplified by retinal and nonretinal input, and termed drivers (for the retinal equivalent because these provide a uniquely powerful drive of relay cells) and modulators (for the nonretinal equivalent because these chiefly modulate thalamic transmission of driver input; Sherman & Guillery, 1998). Table 10.1 summarizes these differences (reviewed in Sherman & Guillery, 1998, 2004, 2006); the 13 criteria in Table 10.1, in a roughly decreasing order of importance, are: 1. 
As already suggested for the lateral geniculate nucleus, drivers determine the main receptive field properties of the relay cell; modulator input does not. 2. Also as already noted, drivers activate only ionotropic receptors; modulators activate metabotropic as well as ionotropic receptors.
TABLE 10.1 Drivers and modulators in LGN plus layer 5 drivers

| Criteria | Retinal (Driver) | Layer 5 to HO (Driver) | Modulator: Layer 6 | Modulator: PBR | Modulator: TRN and Int |
| 1 | Determines relay cell receptive field | Determines relay cell receptive field* | Does not determine relay cell receptive field | Does not determine relay cell receptive field | Does not determine relay cell receptive field |
| 2 | Activates only ionotropic receptors | Activates only ionotropic receptors | Activates metabotropic receptors | Activates metabotropic receptors | TRN: Activates metabotropic receptors; Int: † |
| 3 | Large EPSPs | Large EPSPs | Small EPSPs | † | TRN: small IPSPs; Int: † |
| 4 | Large terminals on proximal dendrites | Large terminals on proximal dendrites | Small terminals on distal dendrites | Small terminals on proximal dendrites | Small terminals; TRN: distal; Int: proximal |
| 5 | Each terminal forms multiple contacts | Each terminal forms multiple contacts | Each terminal forms single contact | Each terminal forms single contact | Each terminal forms single contact |
| 6 | Little convergence on to target | Little convergence on to target* | Much convergence on to target | † | † |
| 7 | Very few synapses on to relay cells (5%) | Very few synapses on to relay cells (5%) | Many synapses on to relay cells (30%) | Many synapses on to relay cells (30%) | Many synapses on to relay cells (30%) |
| 8 | Often thick axons | Often thick axons | Thin axons | Thin axons | Thin axons |
| 9 | Glutamatergic | Glutamatergic | Glutamatergic | Cholinergic | GABAergic |
| 10 | Synapses show paired-pulse depression (high p) | Synapses show paired-pulse depression (high p)* | Synapses show paired-pulse facilitation (low p) | † | † |
| 11 | Well localized, dense terminal arbors | Well localized, dense terminal arbors | Well localized, dense terminal arbors | Sparse terminal arbors | Well localized, dense terminal arbors |
| 12 | Branches innervate subtelencephalic targets | Branches innervate subtelencephalic targets | Subcortically known to innervate thalamus only | † | Subcortically known to innervate thalamus only |
| 13 | Innervates dorsal thalamus but not TRN | Innervates dorsal thalamus but not TRN | Innervates dorsal thalamus and TRN | Innervates dorsal thalamus and TRN | TRN: both; Int: dorsal thalamus only |

* Very limited data to date.
† No relevant data available.
3. Drivers evoke very large excitatory postsynaptic potentials; modulators generally evoke much smaller excitatory or inhibitory postsynaptic potentials. 4. Drivers form very large terminals on proximal dendrites; modulators usually form small terminals, and these can be on proximal or distal dendrites. 5. Each driver terminal forms multiple large synapses; each modulator terminal usually forms a single, small synapse. 6. Driver inputs show little convergence, meaning, for example, that one or a small number of retinal axons converge to innervate each geniculate relay cell; where evidence is available, modulator inputs show considerable convergence. 7. Driver inputs produce a small minority (5%) of the synapses onto relay cells; many modulator inputs produce larger synaptic numbers (e.g., the local GABAergic, cortical, and parabrachial modulator inputs in Figure 10.10 each produce about 30% to 40% of the synapses). 8. Drivers have thick axons; modulators have thin axons. 9. Drivers are glutamatergic; modulators can use a variety of neurotransmitters. 10. Driver synapses show high release probability and paired-pulse depression; modulator synapses that have been tested so far show the opposite properties of low release probability and paired-pulse facilitation (see Figure 10.8). 11. Driver terminal arbors are well localized with a dense array of terminals; modulator terminal arbors can be either well localized and dense or relatively poorly localized and sparse.
12. Branches of driver axons tend to innervate extrathalamic targets as well as the thalamus (e.g., many or all retinogeniculate axons branch and also innervate midbrain targets); those modulator inputs so far tested innervate the thalamus only. 13. Driver inputs innervate relay cells and interneurons in the dorsal thalamus but do not innervate the thalamic reticular nucleus; modulator inputs innervate relay cells, interneurons, and reticular cells. This driver/modulator distinction is clear not just in the lateral geniculate nucleus, but also in other thalamic relays for which sufficient information is available, such as the ventral portion of the medial geniculate nucleus (the primary auditory thalamic relay) and the ventral posterior nucleus (the primary somatosensory thalamic relay). The main point, again, is that not all anatomical pathways are functionally equivalent, and if we are to understand the functional organization of the thalamus and what it is that is being relayed, we must identify and characterize the driver input. This may also apply outside of the thalamus. This point is considered further in the next section.
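The link in criterion 10 between release probability (p) and the direction of short-term plasticity can be made concrete with a small calculation. The following is a minimal sketch under stated assumptions, not a model from the cited work: it assumes that the first pulse releases a fraction p of the available transmitter, that nothing recovers before the second pulse, and that release probability on the second pulse is raised by a fixed fraction of its remaining headroom. The function name and the `facilitation` parameter are hypothetical.

```python
# Toy two-pulse model relating release probability to paired-pulse behavior.
# The depletion and facilitation rules are simplifying assumptions.

def paired_pulse_ratio(release_probability, facilitation=0.3):
    """Ratio of the second to the first response for two closely spaced pulses."""
    if not 0.0 < release_probability <= 1.0:
        raise ValueError("release_probability must be in (0, 1]")
    p1 = release_probability
    first = p1                      # fraction of resources released by pulse 1
    remaining = 1.0 - p1            # resources left; no recovery is assumed
    # Residual facilitation raises release probability on the second pulse.
    p2 = p1 + facilitation * (1.0 - p1)
    second = p2 * remaining
    return second / first


if __name__ == "__main__":
    print("high p (driver-like):", paired_pulse_ratio(0.8))
    print("low p (modulator-like):", paired_pulse_ratio(0.1))
```

Under these assumptions a high-p synapse gives a ratio below 1 (depression, because most of the resources are used up by the first pulse), whereas a low-p synapse gives a ratio above 1 (facilitation), matching the pairing of properties listed for drivers and modulators.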
FIRST AND HIGHER ORDER THALAMIC RELAYS There are two aspects of functional organization of thalamic nuclei that should be considered. One is the actual relay mechanisms, which are related to the cell and circuit properties defined earlier. The other is a determination of what, exactly, is being relayed by a given nucleus. This second functional property seems clearly defined for some nuclei, such as lateral geniculate nucleus, which relays retinal input, but until recently has not been so clear for many other nuclei, such as the pulvinar or medial dorsal nucleus. From the previous section, it should be clear that understanding this second property boils down to identifying the driver input to any particular nucleus. The recent ability to identify driver inputs to many heretofore rather mysterious nuclei, like the pulvinar or medial dorsal nucleus, has led to the further suggestion that, based on the origin of driver inputs, subcortical or cortical, thalamic nuclei can be identified as first order or higher order. Division of Thalamic Relays into First Order and Higher Order This distinction is well characterized by comparing the two main visual thalamic relays, the lateral geniculate nucleus and pulvinar (see Figure 10.14A). These two nuclei have the same general pattern of modulator inputs from local GABAergic neurons, the brain stem, and layer 6 of cortex.
Data that have accumulated over the past few decades make it clear that the pulvinar receives its driver input from layer 5 of one cortical area and projects it to another (reviewed in Guillery, 1995; Guillery & Sherman, 2002a; Sherman & Guillery, 2006). This means that all thalamic nuclei receive a modulator projection from layer 6 that is mostly feedback but that some in addition receive a driver projection from layer 5 (instead of a subcortical driver, such as from the retina) that is feedforward (Van Horn & Sherman, 2004). As indicated in Figure 10.14A, this feedforward layer 5 input places these higher order thalamic nuclei in the middle of a cortico-thalamo-cortical route of information flow. The main sensory thalamic relays can be divided into first order and higher order. In addition to the lateral geniculate nucleus (first order) and the pulvinar (higher order) for vision, there is the ventral posterior nucleus (first order) and the posterior nucleus (higher order) for somesthesis, and the ventral division of the medial geniculate nucleus (first order) and its dorsal division (higher order) for hearing (reviewed in Guillery, 1995; Guillery & Sherman, 2002a; Sherman & Guillery, 2006). Other thalamic relays have also been so identified: The medial dorsal nucleus is mostly or wholly a higher order relay innervating prefrontal cortex; the ventral anterior and lateral nuclear complex, which innervates motor cortex, includes first order circuits based on cerebellar inputs and higher order circuits based on inputs from layer 5 of the motor cortex; and so on. While not all of the thalamus has been so identified yet as regards this division, it seems clear that most of the thalamic volume is involved in higher order relays. There is an important proviso to this, namely, that while first order nuclei seem fairly purely first order, those designated as higher order may have first order components as well. 
For instance, while most of the pulvinar receives layer 5 input from various regions of the visual cortex and thus appears to participate as a higher order relay in a cortico-thalamo-cortical circuit, parts of the pulvinar are innervated by the superior colliculus. It is not entirely clear whether this colliculo-pulvinar pathway is a driver or modulator (or something else heretofore not described), but there is some anatomical evidence that at least some of the colliculo-pulvinar terminals are quite large, suggesting that they are drivers (Kelly, Li, Carden, & Bickford, 2003). If so, then the pulvinar would represent a mixture of mostly higher order relays with some first order relays. Likewise, the posterior medial nucleus, which receives input from layer 5 of somatosensory cortex, also receives some direct spinothalamic input, but it is not known whether this latter input is a driver or modulator. A similar proviso exists for the dorsal portion of the medial geniculate nucleus, which we defined previously as a higher order nucleus: this receives input from the “belt” region of the inferior colliculus, but again,
Figure 10.14 Schematic diagrams showing organizational features of first and higher order thalamic nuclei. A, B: Distinction between first order and higher order thalamic nuclei. A first order nucleus (A) represents the first relay of a particular type of subcortical information to a first order or primary cortical area. A higher order nucleus (B) relays information from layer 5 of one cortical area to another. This relay can be between first and higher order cortical areas as shown or between two higher order cortical areas. C: Role of higher order thalamic nuclei in cortico-cortical communication via cortico-thalamo-cortical circuits involving a projection from layer 5 of cortex to a higher order thalamic relay to another cortical area. As indicated, the role of the direct corticocortical projections, driver or modulator or other, is unclear. Note in A–C that the driver inputs, both subcortical and from layer 5, are typically from branching axons, the significance of which is elaborated in the text. Abbreviations: FO, first order; HO, higher order; LGN, lateral geniculate nucleus; MGNv or MGNd, ventral or dorsal division of medial geniculate nucleus; PO, posterior nucleus; TRN, thalamic reticular nucleus; VP, ventral posterior nucleus.
it is not known if this is a driver or modulator. Finally, the medial dorsal nucleus, which has much layer 5 input from the prefrontal cortex, also has input from the superior colliculus, and although this latter input is described as if it were a driver (Sommer & Wurtz, 2004a, 2004b), insufficient evidence exists as to its identity. Given the possibility that some thalamic nuclei defined here as higher order may also have first order components operating in parallel, we refer below to first order and higher order thalamic “relays” rather than “nuclei.” Implications for Cortical Functioning The concept of higher order relays offering a cortico-thalamo-cortical route for information processing should be seen in the context of the traditional view best expressed by Van Essen and colleagues (Felleman & Van Essen, 1991; Van Essen, Anderson, & Felleman, 1992), namely, that cortical areas communicate with one another via a plethora of direct cortico-cortical connections. In the visual cortex of rhesus monkeys, for instance, this view states that information is brought to the primary visual cortex by the lateral geniculate nucleus, and once it reaches the cortex, it stays there, being processed by the 30-odd visual areas of the cortex through a series of several parallel feedforward routes involving 4 or 5 hierarchical levels. This scheme also has feedback and lateral connections, and the direction of all of these pathways is defined by criteria dependent mostly on the laminar pattern of the cortico-cortical terminations. The cortico-thalamo-cortical pathways may be seen as a complementary or even alternate route for information processing by the cortex. In this context, the thalamus is not there just to bring information from the periphery to the cortex but also serves a central role in ongoing cortical processing.
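The division into first order and higher order relays described above reduces to a rule about where the driver input originates. As a minimal sketch of that rule (the names `DriverOrigin`, `relay_order`, and `EXAMPLES` are hypothetical, and the example entries restate only what the text says about each nucleus), it can be written as:

```python
# Sketch of the first order / higher order rule based on driver origin.
# Names are illustrative; entries restate the classifications given in the text.
from enum import Enum


class DriverOrigin(Enum):
    SUBCORTICAL = "subcortical"
    CORTICAL_LAYER_5 = "cortical layer 5"


def relay_order(driver_origins):
    """Classify a thalamic relay from the origins of its identified drivers."""
    origins = set(driver_origins)
    if not origins:
        return "unclassified"      # no driver input identified yet
    if origins == {DriverOrigin.SUBCORTICAL}:
        return "first order"
    if origins == {DriverOrigin.CORTICAL_LAYER_5}:
        return "higher order"
    return "mixed"                 # both kinds of driver operate in parallel


EXAMPLES = {
    "lateral geniculate nucleus": [DriverOrigin.SUBCORTICAL],    # retinal driver
    "ventral posterior nucleus": [DriverOrigin.SUBCORTICAL],     # lemniscal driver
    "pulvinar": [DriverOrigin.CORTICAL_LAYER_5],
    "medial dorsal nucleus": [DriverOrigin.CORTICAL_LAYER_5],
}

if __name__ == "__main__":
    for name, origins in EXAMPLES.items():
        print(name, "->", relay_order(origins))
```

If an additional subcortical input, such as the collicular input to parts of the pulvinar, were shown to be a driver, adding `DriverOrigin.SUBCORTICAL` to that entry would yield "mixed", which is why the text speaks of relays rather than nuclei.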
One way to try to gain insight into the functional significance of these various pathways is to recall the example of the lateral geniculate nucleus: not all inputs to relay cells are information bearing (i.e., drivers). It is interesting to speculate that the driver/modulator distinction that is so valuable in elucidating functional pathways through the thalamus might also apply beyond the thalamus, especially in the cortex. If so, then it would be appropriate to consider which of the direct cortico-cortical and indirect cortico-thalamo-cortical pathways, which are all glutamatergic pathways, are drivers or modulators.
Drivers and Modulators in Various Thalamic and Cortical Circuits The retinogeniculate synapse can serve as the prototypical glutamatergic driver, and the layer 6 corticothalamic synapse, the prototypical glutamatergic modulator. By these criteria, evidence exists that thalamocortical synapses, both from first order and higher order relay cells, have driver properties (Lee & Sherman, 2008, 2009). Likewise, the layer 5 corticothalamic synapses have driver properties (Guillery, 1995; Reichova & Sherman, 2004). Thus the cortico-thalamo-cortical pathways involving higher order thalamic relays appear to be functional information routes. In other words, as shown in Figure 10.14C, first order relays bring information of a certain type (e.g., visual) from a subcortical site (e.g., the retina) to the cortex for the first time, and higher order relays are used to pass this information up the cortical hierarchy as it is processed. Less is known about the direct cortico-cortical synapses. These pathways have been defined almost strictly by anatomical criteria, and the assertion that all, or at least all of the feedforward cortico-cortical projections, are drivers and not modulators (or perhaps something entirely different) is not founded on empirical data. Evidence is now available that the driver/modulator classification works for at least one specific cortical circuit. Figure 10.15 shows that layer 4 cells in the visual cortex receive geniculate inputs with driver properties: these inputs provide the basic receptive field properties of their target cortical cells, and their synaptic properties, including paired-pulse depression of large EPSPs and lack of metabotropic receptor activation, are also driver characteristics (Lee & Sherman, 2008, 2009). These same layer 4 cells receive another glutamatergic input from branches of layer 6 corticogeniculate axons, and this synaptic input has modulator characteristics, including paired-pulse facilitation of small EPSPs and the presence of metabotropic receptor activation (Lee & Sherman, 2008, 2009). The numbers are also interesting because in both pathways the driver inputs to geniculate relay cells and layer
Figure 10.15 Schematic view of selected driver and modulator pathways, the percentages reflecting the relative number of synapses associated with each input.
4 cortical cells operate over very few (but powerful) synapses, representing only ~5% of the total (Ahmed, Anderson, Douglas, Martin, & Nelson, 1994; Latawiec, Martin, & Meskenaite, 2000; Van Horn, Erişir, & Sherman, 2000), whereas the glutamatergic modulator inputs operate over many more (but weak) synapses, accounting for about 35% of the input to relay cells and about 45% of the input to layer 4 cells (Ahmed et al., 1994; Ahmed, Anderson, Martin, & Nelson, 1997; Erişir, Van Horn, & Sherman, 1997; Van Horn et al., 2000). Thus, while the cortico-thalamo-cortical circuits involving higher order thalamic relays appear to be functioning circuits that transmit information between cortical areas, it remains to be determined just what functional properties characterize the direct cortico-cortical projections. Nature of Information Relayed by the Thalamus As shown in Figure 10.14, a curious but potentially important fact is that many and perhaps all driver inputs to thalamic relay cells involve branching axons, with one branch innervating relay cells, and the other, extrathalamic subcortical targets (reviewed in Guillery, 2003, 2005; Guillery & Sherman, 2002b; Sherman & Guillery, 2006). Thus, many or all retinogeniculate axons branch to innervate the pretectum and superior colliculus (Sur, Esguerra, Garraghty, Kritzer, & Sherman, 1987; Tamamaki, Uhlrich, & Sherman, 1994), and many or all layer 5 corticothalamic axons likewise branch to innervate other brain stem targets, sometimes reaching into the spinal cord (reviewed in Guillery, 2003, 2005; Guillery & Sherman, 2002b; Sherman & Guillery, 2006). Note that, unlike the layer 5 corticothalamic axons, which do not innervate the thalamic reticular nucleus but do branch to innervate extrathalamic targets, layer 6 corticothalamic axons innervate the thalamic reticular nucleus but do not extend beyond the thalamus.
Guillery (2003, 2005) reviewed these data and pointed out that the major extrathalamic targets of driver afferents to the thalamus appear to be motor targets, as if the messages actually sent to the cortex via the thalamus represent a sort of efference copy of motor commands, starting perhaps as very crude, preliminary commands that are updated and improved on as the message ascends the cortical hierarchy via the ascending cortico-thalamo-cortical circuits. The idea of efference copy is that a command sent to a motor center to initiate movement is copied to other brain areas, such as the cortex, so that these motor commands can be accounted for in the animal’s experience (for details, see Andersen, Snyder, Bradley, & Xing, 1997; Nelson, 1996; Thier & Ilg, 2005; Webb, 2004). Further details of these ideas of efference copy as regards thalamic circuitry can be found in Guillery (2003, 2005).
Direct Cortico-Cortical versus Cortico-Thalamo-Cortical Circuits Figure 10.16 summarizes the main conclusions to be derived from an understanding of the existence of higher order thalamic relays. Figure 10.16A shows the conventional view. Here, information is relayed from the periphery by appropriate thalamic nuclei (e.g., the lateral geniculate or ventral posterior nuclei) to primary sensory cortex. From there, the information is processed by direct cortico-cortical connections through several hierarchical levels, including sensorimotor areas, and finally reaches motor cortex, from which a motor command is sent out of the cortex. This view has definite entry and exit points for information processing—the primary sensory cortex and motor cortex, respectively. It also has no definite role for most of the thalamus that we have identified as higher order (labeled by question marks).
Figure 10.16 Comparison of conventional view (A) with the alternative view proposed here (B). The question marks in A indicate higher order thalamic relays, for which no specific function is suggested. The question marks in B indicate uncertainty about the role of the direct corticocortical connections (see text for details). Abbreviations: FO, first order; HO, higher order. Further details in text.
Figure 10.16B shows the alternative view offered here. By this view, from the beginning, the information from the periphery brought to first order thalamic relays is carried via branching axons that also innervate motor structures, suggesting the possibility that these primary messages relayed to the cortex are also some form of crude motor command. The further processing of information at the cortical level involves cortico-thalamo-cortical pathways using higher order thalamic relays. Here, too, the corticothalamic limb involves branching axons that also innervate motor structures as if the motor commands are being updated and refined by this cortical processing. There are two other points to notice about Figure 10.16B. First, there is no single entry to or exit from the cortex for information processing. Even the cortex regarded as solely sensory (e.g., primary visual cortex) has a layer 5 output to motor structures: indeed, as far as we know, all cortical areas have such a layer 5 output. Thus, electrical activation of the primary visual cortex in the monkey generates eye movements (Tehovnik, Slocum, & Schiller, 2003). In this regard, the very concept of a cortical area being either sensory or motor needs to be reconsidered. Second, Figure 10.16B raises the question of the direct cortico-cortical projections. Do they function as drivers, modulators, a combination, or something entirely different? One possibility is that a partial combination of panels A and B of Figure 10.16 is closer to the truth. That is, the cortico-cortical and cortico-thalamo-cortical circuits may represent two relatively independent, parallel streams of information processing.
Perhaps the larger (anatomically) direct cortico-cortical route reflects the major bulk of the basic information processing, while the cortico-thalamo-cortical route may be a means of each cortical area informing its upstream partner about motor commands it initiated so that this will not lead to confusion in how the outside world is represented. An example of this is the problem presented by eye movements: such movements create a visual stimulus on the retina of the visual environment moving in the opposite direction, and the visual cortex must be able to distinguish between such self-generated stimuli and those actually initiated in the visual environment. The cortico-thalamo-cortical pathways may provide just this sort of information. However, the actual role of the various pathways, direct cortico-cortical and cortico-thalamo-cortical, remains unknown, and while there is some experimental evidence that the synapses in the cortico-thalamo-cortical circuit are all drivers, the actual synaptic function of direct cortico-cortical pathways remains to be determined.
SUMMARY There are two main points to be made here. First, the thalamus is not a simple, machine-like relay, but instead its cell and circuit properties control the flow of information to the cortex in dynamic and state-dependent ways. Second, in addition to getting information to the cortex in the first place, the thalamus continues to play a role in processing that information via cortico-thalamo-cortical circuits involving higher order thalamic relays. One of the challenges to understanding how the cortex processes information is to understand the relative function of the direct corticocortical and indirect cortico-thalamo-cortical circuits. Thalamic Relay Functions The fact that relay cells receive 95% of their input from modulatory sources clearly indicates that many thalamic relay functions are under strong dynamic control. We are just beginning to understand this, and much of the control seems to be effected through control of membrane voltage. As is indicated in Figures 10.10 and 10.11, external modulatory inputs (e.g., feedback cortical and brain stem inputs) operate directly and indirectly via local GABAergic neurons to provide push-pull control of the membrane voltage. The example of how this interacts with the voltage- and time-gated Ca2+ T channel has been detailed and, in addition to the ubiquitous Na+ channel underlying the classic action potential, this may be the best understood example of effects of membrane potential on relay cell functions. However, relay cells exhibit other voltage- and time-gated ion channels, including various K+ channels, other Ca2+ and Na+ channels, and mixed cation channels, and these are understood much less well (for further details, see Huguenard & McCormick, 1994; Sherman & Guillery, 2006). This, plus the fact that all of these channels likely have complex interactions with one another, indicates that there is still much to learn about the effects of membrane voltage on thalamic relay cell functions.
The synaptic triad involving dendritic outputs of interneurons provides another interesting but not well-understood relay function. A hypothesis has been advanced that this circuit helps to maintain a larger dynamic range of input/output relationships for the relay cell by controlling the gain of the retinogeniculate synapse, a process that could also support the mechanism of contrast gain control. This is yet another idea that requires more data. Significance of Drivers and Modulators and Higher Order Thalamic Relays The importance of the driver/modulator distinction in the thalamus seems fairly clear and straightforward. One can
partly define the function of a thalamic relay by defining its driver input, and thus we can now argue that much of the function of heretofore rather mysterious nuclei like the pulvinar or medial dorsal nucleus is to relay information originating in layer 5 of the cortex. This, in turn, defines higher order relays. Another more subtle implication of this distinction is related to the concept of labeled lines: Whatever the cause of a particular neuron firing, the result is always interpreted based on the most likely natural cause. For example, pressure applied to the side of the eyeball creates the perception of light and dark spots in the visual field because of the resultant effect on photoreceptors; it is not perceived as increased intraocular pressure. The cortex must always interpret the firing of relay cells as being due to driver input. Thus, for the lateral geniculate nucleus, every relay cell response must be interpreted as being due to retinal input and not to cortical or brain stem input. There is some evidence in anesthetized cats that practically every action potential seen in a geniculate relay cell can be attributed to a retinal spike (Cleland, Dubin, & Levick, 1971), so this concept is not so difficult to accept. A final and perhaps most profound implication of the driver/modulator concept is that not only are all inputs to a neuron not equal functionally, but in terms of information transfer versus modulation, only a very small subset of inputs to the thalamus are drivers. This distinction seems quite robust in the thalamus and offers a very different way of looking at information transfer. One important issue is the extent to which this distinction, so clear in the thalamus, can be extrapolated elsewhere, such as the cortex. Most cortico-cortical pathways, especially between areas, are glutamatergic, and it may be significant that metabotropic glutamate receptors are common in the cortex (Caleo et al., 2007).
This means that some as yet undetermined subset of these pathways activates metabotropic glutamate receptors (Lee & Sherman, 2008, 2009), and as noted, this seems an important property of modulators. Thus, it seems plausible that many cortical pathways are modulatory. Nonetheless, such is our general ignorance of the functional properties of cortical circuitry, and particularly of cortico-cortical projections between areas, that these pathways may require a classification scheme completely different from or in addition to the driver/modulator categories.

First Order and Higher Order Thalamic Relays

The major implication of the division of the thalamus into first and higher order relays is that, via the latter, cortico-cortical communication may depend heavily on the thalamus, a thalamic function previously unknown. It is possible that all cortico-cortical communication is via cortico-thalamo-cortical circuits and that all direct cortico-cortical
pathways are modulatory. If so, this would mean that all information entering a cortical area, whether from the periphery (e.g., retina) or another cortical area, must pass through the thalamus. In other words, just as retinal information does not reach the cortex directly but passes through a thalamic relay (i.e., the lateral geniculate nucleus), the same would apply to cortico-cortical communication. A more plausible implication has been suggested earlier. That is, while some undetermined fraction of cortico-cortical pathways are not information bearing, many are, and the direct cortico-cortical and indirect cortico-thalamo-cortical circuits represent two parallel paths of information processing. More data are needed to sort this out.

Nature of Driver Inputs to Thalamic Relay Cells

A curious fact about many, and perhaps all, of the subcortical and layer 5 driver inputs to thalamic relay cells is that they consist of branching axons, with the extrathalamic branch innervating motor centers (see Figure 10.14; Guillery, 2003). The significance of this has been discussed in some detail by Guillery (2003, 2005) and will not be repeated here. Nonetheless, this anatomical fact does suggest that much of the evolution of the thalamus and the cortex has involved getting information to the cortex about motor commands and their updating.

The thalamus has come a long way from when it was seen as an uninteresting structure whose only role was to relay information simply and consistently from the periphery to the cortex. We now understand that these relay functions are quite complicated and that the thalamus continues to play a role beyond simply getting information to the cortex from the periphery. Nonetheless, we are just beginning to understand these broader and more interesting functions of the thalamus. The challenge is to continue along these lines with more research focused on these subjects.
REFERENCES

Abbott, L. F., Varela, J. A., Sen, K., & Nelson, S. B. (1997). Synaptic depression and cortical gain control. Science, 275, 220–224. Ahmed, B., Anderson, J. C., Douglas, R. J., Martin, K. A. C., & Nelson, J. C. (1994). Polyneuronal innervation of spiny stellate neurons in cat visual cortex. Journal of Comparative Neurology, 341, 39–49. Ahmed, B., Anderson, J. C., Martin, K. A. C., & Nelson, J. C. (1997). Map of the synapses onto layer 4 basket cells of the primary visual cortex of the cat. Journal of Comparative Neurology, 380, 230–242. Alitto, H. J., Weyand, T. G., & Usrey, W. M. (2005). Distinct properties of stimulus-evoked bursts in the lateral geniculate nucleus. Journal of Neuroscience, 25, 514–523. Andersen, R. A., Snyder, L. H., Bradley, D. C., & Xing, J. (1997). Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annual Review of Neuroscience, 20, 303–330.
Arcelli, P., Frassoni, C., Regondi, M. C., De Biasi, S., & Spreafico, R. (1997). GABAergic neurons in mammalian thalamus: A marker of thalamic complexity? Brain Research Bulletin, 42, 27–37.
Gil, Z., Connors, B. W., & Amitai, Y. (1999). Efficacy of thalamocortical and intracortical synaptic connections: Quanta, innervation, and reliability. Neuron, 23, 385–397.
Beaudoin, D. L., Borghuis, B. G., & Demb, J. B. (2007). Cellular basis for contrast gain control over the receptive field center of mammalian retinal ganglion cells. Journal of Neuroscience, 27, 2636–2645.
Govindaiah, & Cox, C. L. (2004). Synaptic activation of metabotropic glutamate receptors regulates dendritic outputs of thalamic interneurons. Neuron, 41, 611–623.
Bernardete, E. A., Kaplan, E., & Knight, B. W. (1992). Contrast gain control in the primate retina: P cells are not X-like, some M cells are. Visual Neuroscience, 8, 483–486.
Guido, W., Lu, S.-M., & Sherman, S. M. (1992). Relative contributions of burst and tonic responses to the receptive field properties of lateral geniculate neurons in the cat. Journal of Neurophysiology, 68, 2199–2211.
Bloomfield, S. A., & Sherman, S. M. (1989). Dendritic current flow in relay cells and interneurons of the cat’s lateral geniculate nucleus. Proceedings of the National Academy of Sciences, USA, 86, 3911–3914. Brown, D. A., Abogadie, F. C., Allen, T. G., Buckley, N. J., Caulfield, M. P., Delmas, P., et al. (1997). Muscarinic mechanisms in nerve cells. Life Sciences, 60, 1137–1144. Caleo, M., Restani, L., Gianfranceschi, L., Costantin, L., Rossi, C., Rossetto, O., et al. (2007). Transient synaptic silencing of developing striate cortex has persistent effects on visual function and plasticity. Journal of Neuroscience, 27, 4530–4540. Casagrande, V. A., & Xu, X. (1994). Parallel visual pathways: A comparative perspective. In L. M. Chalupa & J. S. Werner (Eds.), Visual neurosciences (pp. 494–506). Cambridge, MA: MIT Press. Castro-Alamancos, M. A., & Connors, B. W. (1997). Thalamocortical synapses. Progress in Neurobiology, 51, 581–606. Chance, F. S., Abbott, L. F., & Reyes, A. (2002). Gain modulation from background synaptic input. Neuron, 35, 773–782. Chung, S., Li, X., & Nelson, S. B. (2002). Short-term depression at thalamocortical synapses contributes to rapid adaptation of cortical sensory responses in vivo. Neuron, 34, 437–446. Cleland, B. G., Dubin, M. W., & Levick, W. R. (1971). Sustained and transient neurones in the cat’s retina and lateral geniculate nucleus. Journal of Physiology, 217, 473–496. Conn, P. J., & Pin, J. P. (1997). Pharmacology and functions of metabotropic glutamate receptors. Annual Review of Pharmacology and Toxicology, 37, 205–237. Cox, C. L., & Sherman, S. M. (2000). Control of dendritic outputs of inhibitory interneurons in the lateral geniculate nucleus. Neuron, 27, 597–610. Cucchiaro, J. B., Uhlrich, D. J., & Sherman, S. M. (1991). Electronmicroscopic analysis of synaptic input from the perigeniculate nucleus to the A-laminae of the lateral geniculate nucleus in cats. Journal of Comparative Neurology, 310, 316–336. Datta, S., & Siwek, D. F.
(2002). Single cell activity patterns of pedunculopontine tegmentum neurons across the sleep-wake cycle in the freely moving rats. Journal of Neuroscience Research, 70, 611–621. Denning, K. S., & Reinagel, P. (2005). Visual control of burst priming in the anesthetized lateral geniculate nucleus. Journal of Neuroscience, 25, 3531–3538. Erişir, A., Van Horn, S. C., & Sherman, S. M. (1997). Relative numbers of cortical and brainstem inputs to the lateral geniculate nucleus. Proceedings of the National Academy of Sciences, USA, 94, 1517–1520. Famiglietti, E. V. J., & Peters, A. (1972). The synaptic glomerulus and the intrinsic neuron in the dorsal lateral geniculate nucleus of the cat. Journal of Comparative Neurology, 144, 285–334. Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1–47. Friedlander, M. J., Lin, C.-S., Stanford, L. R., & Sherman, S. M. (1981). Morphology of functionally identified neurons in lateral geniculate nucleus of the cat. Journal of Neurophysiology, 46, 80–129. Geisler, W. S., & Albrecht, D. G. (1995). Bayesian analysis of identification performance in monkey visual cortex: Nonlinear mechanisms and stimulus certainty. Vision Research, 35, 2723–2730.
Guido, W., Lu, S.-M., Vaughan, J. W., Godwin, D. W., & Sherman, S. M. (1995). Receiver operating characteristic (ROC) analysis of neurons in the cat’s lateral geniculate nucleus during tonic and burst response mode. Visual Neuroscience, 12, 723–741. Guido, W., & Weyand, T. (1995). Burst responses in thalamic relay cells of the awake behaving cat. Journal of Neurophysiology, 74, 1782–1786. Guillery, R. W. (1966). A study of Golgi preparations from the dorsal lateral geniculate nucleus of the adult cat. Journal of Comparative Neurology, 128, 21–50. Guillery, R. W. (1969). The organization of synaptic interconnections in the laminae of the dorsal lateral geniculate nucleus of the cat. Z Zellforsch, 96, 1–38. Guillery, R. W. (1995). Anatomical evidence concerning the role of the thalamus in corticocortical communication: A brief review. American Journal of Anatomy, 187, 583–592. Guillery, R. W. (2003). Branching thalamic afferents link action and perception. Journal of Neurophysiology, 90, 539–548. Guillery, R. W. (2005). Anatomical pathways that link action to perception. Progress in Brain Research, 149, 235–256. Guillery, R. W., & Sherman, S. M. (2002a). Thalamic relay functions and their role in corticocortical communication: Generalizations from the visual system. Neuron, 33, 163–175. Guillery, R. W., & Sherman, S. M. (2002b). The thalamus as a monitor of motor outputs. Philosophical Transactions of the Royal Society of London. Series B, 357, 1809–1821. Hamos, J. E., Van Horn, S. C., Raczkowski, D., Uhlrich, D. J., & Sherman, S. M. (1985, October 17). Synaptic connectivity of a local circuit neurone in lateral geniculate nucleus of the cat. Nature, 317, 618–621. Hendry, S. H. C., & Reid, R. C. (2000). The koniocellular pathway in primate vision. Annual Review of Neuroscience, 23, 127–153. Huguenard, J. R., & McCormick, D. A. (1994). Electrophysiology of the neuron. New York: Oxford University Press. Jahnsen, H., & Llinás, R. (1984a).
Electrophysiological properties of guinea-pig thalamic neurones: An in vitro study. Journal of Physiology, 349, 205–226. Jahnsen, H., & Llinás, R. (1984b). Ionic basis for the electroresponsiveness and oscillatory properties of guinea-pig thalamic neurones in vitro. Journal of Physiology, 349, 227–247. Jones, E. G. (2007). The thalamus (2nd ed.). Cambridge: Cambridge University Press. Kelly, L. R., Li, J., Carden, W. B., & Bickford, M. E. (2003). Ultrastructure and synaptic targets of tectothalamic terminals in the cat lateral posterior nucleus. Journal of Comparative Neurology, 464, 472–486. Lam, Y.-W., Nelson, C. S., & Sherman, S. M. (2006). Mapping of the functional interconnections between reticular neurons using photostimulation. Journal of Neurophysiology, 96, 2593–2600. Landisman, C. E., Long, M. A., Beierlein, M., Deans, M. R., Paul, D. L., & Connors, B. W. (2002). Electrical synapses in the thalamic reticular nucleus. Journal of Neuroscience, 22, 1002–1009.
Latawiec, D., Martin, K. A. C., & Meskenaite, V. (2000). Termination of the geniculocortical projection in the striate cortex of macaque monkey: A quantitative immunoelectron microscopic study. Journal of Comparative Neurology, 419, 306–319.
Sherman, S. M. (1996). Dual response modes in lateral geniculate neurons: Mechanisms and functions. Visual Neuroscience, 13, 205–213. Sherman, S. M. (2001). Tonic and burst firing: Dual modes of thalamocortical relay. Trends in Neuroscience, 24, 122–126.
Lee, C. C., & Sherman, S. M. (2008). Synaptic properties of thalamic and intracortical inputs to layer 4 of the first- and higher-order cortical areas in the auditory and somatosensory systems. Journal of Neurophysiology, 100, 317–326.
Sherman, S. M. (2004). Interneurons and triadic circuitry of the thalamus. Trends in Neuroscience, 27, 670–675.
Lee, C. C., & Sherman, S. M. (2009). Modulator property of the intrinsic cortical projection from layer 6 to layer 4. Frontiers in Systems Neuroscience, 3, 3. doi: 10.3389/neuro.06.003.2009.
Sherman, S. M., & Guillery, R. W. (1996). The functional organization of thalamocortical relays. Journal of Neurophysiology, 76, 1367–1395.
Lesica, N. A., & Stanley, G. B. (2004). Encoding of natural scene movies by tonic and burst spikes in the lateral geniculate nucleus. Journal of Neuroscience, 24, 10731–10740.
Sherman, S. M., & Guillery, R. W. (1998). On the actions that one nerve cell can have on another: Distinguishing “drivers” from “modulators.” Proceedings of the National Academy of Sciences, USA, 95, 7121–7126.
Li, J. L., Bickford, M. E., & Guido, W. (2003). Distinct firing properties of higher order thalamic relay neurons. Journal of Neurophysiology, 90, 291–299.
Sherman, S. M., & Guillery, R. W. (2002). The role of thalamus in the flow of information to cortex. Philosophical Transactions of the Royal Society of London. Series B, 357, 1695–1708.
Lujan, R., Nusser, Z., Roberts, J. D., Shigemoto, R., & Somogyi, P. (1996). Perisynaptic location of metabotropic glutamate receptors mGluR1 and mGluR5 on dendrites and dendritic spines in the rat hippocampus. European Journal of Neuroscience, 8, 1488–1500.
Sherman, S. M., & Guillery, R. W. (2004). Thalamus. In G. M. Shepherd (Ed.), Synaptic organization of the brain (pp. 311–359). Oxford University Press.
Määttänen, L. M., & Koenderink, J. J. (1991). Contrast adaptation and contrast gain control. Experimental Brain Research, 87, 205–212. Massaux, A., & Edeline, J. M. (2003). Bursts in the medial geniculate body: A comparison between anesthetized and unanesthetized states in guinea pig. Experimental Brain Research, 153, 573–578. McCormick, D. A. (1992). Neurotransmitter actions in the thalamus and cerebral cortex and their role in neuromodulation of thalamocortical activity. Progress in Neurobiology, 39, 337–388. Mott, D. D., & Lewis, D. V. (1991, June 21). Facilitation of the induction of long-term potentiation by GABAB receptors. Science, 252, 1718–1720. Nelson, R. J. (1996). Interactions between motor commands and somatic perception in sensorimotor cortex. Current Opinion in Neurobiology, 6, 801–810. Nicoll, R. A., Malenka, R. C., & Kauer, J. A. (1990). Functional comparison of neurotransmitter receptor subtypes in mammalian central nervous system. Physiological Reviews, 70, 513–565. Ohzawa, I., Sclar, G., & Freeman, R. D. (1982, July 15). Contrast gain control in the cat visual cortex. Nature, 298, 266–268. Pin, J. P., & Duvoisin, R. (1995). The metabotropic glutamate receptors: Structure and functions. Neuropharmacology, 34, 1–26. Ralston, H. J. (1971, April 30). Evidence for presynaptic dendrites and a proposal for their mechanism of action. Nature, 230, 585–587. Ramcharan, E. J., Gnadt, J. W., & Sherman, S. M. (2000). Burst and tonic firing in thalamic cells of unanesthetized, behaving monkeys. Visual Neuroscience, 17, 55–62. Recasens, M., & Vignes, M. (1995). Excitatory amino acid metabotropic receptor subtypes and calcium regulation. Annals of the New York Academy of Sciences, 757, 418–429. Reichova, I., & Sherman, S. M. (2004). Somatosensory corticothalamic projections: Distinguishing drivers from modulators. Journal of Neurophysiology, 92, 2185–2197. Sanchez-Vives, M. V., Bal, T., & McCormick, D. A. (1997).
Inhibitory interactions between perigeniculate GABAergic neurons. Journal of Neuroscience, 17, 8894–8908. Sharma, J., Angelucci, A., & Sur, M. (2000, April 20). Induction of visual orientation modules in auditory cortex. Nature, 404, 841–847. Sherman, S. M. (1985). Functional organization of the W-, X-, and Y-cell pathways in the cat: A review and hypothesis. In J. M. Sprague & A. N. Epstein (Eds.), Progress in psychobiology and physiological psychology (Vol. 11, pp. 233–314). Orlando, FL: Academic Press.
Sherman, S. M., & Guillery, R. W. (2006). Exploring the thalamus and its role in cortical function. Cambridge, MA: MIT Press. Sommer, M. A., & Wurtz, R. H. (2004a). What the brain stem tells the frontal cortex: Pt. I. Oculomotor signals sent from superior colliculus to frontal eye field via mediodorsal thalamus. Journal of Neurophysiology, 91, 1381–1402. Sommer, M. A., & Wurtz, R. H. (2004b). What the brain stem tells the frontal cortex: Pt. II. Role of the SC-MD-FEF pathway in corollary discharge. Journal of Neurophysiology, 91, 1403–1423. Somogyi, P., Tamas, G., Lujan, R., & Buhl, E. H. (1998). Salient features of synaptic organisation in the cerebral cortex. Brain Research. Brain Research Reviews, 26, 113–135. Stanford, L. R., Friedlander, M. J., & Sherman, S. M. (1983). Morphological and physiological properties of geniculate W-cells of the cat: A comparison with X- and Y-cells. Journal of Neurophysiology, 50, 582–608. Steriade, M., & Contreras, D. (1995). Relations between cortical and thalamic cellular events during transition from sleep patterns to paroxysmal activity. Journal of Neuroscience, 15, 623–642. Sur, M., Esguerra, M., Garraghty, P. E., Kritzer, M. F., & Sherman, S. M. (1987). Morphology of physiologically identified retinogeniculate X- and Y-axons in the cat. Journal of Neurophysiology, 58, 1–32. Swadlow, H. A., & Gusev, A. G. (2001). The impact of ‘bursting’ thalamic impulses at a neocortical synapse. Nature Neuroscience, 4, 402–408. Swadlow, H. A., Gusev, A. G., & Bezdudnaya, T. (2002). Activation of a cortical column by a thalamocortical impulse. Journal of Neuroscience, 22, 7766–7773. Tamamaki, N., Uhlrich, D. J., & Sherman, S. M. (1994). Morphology of physiologically identified retinal X and Y axons in the cat’s thalamus and midbrain as revealed by intra-axonal injection of biocytin. Journal of Comparative Neurology, 354, 583–607. Tehovnik, E. J., Slocum, W. M., & Schiller, P. H. (2003).
Saccadic eye movements evoked by microstimulation of striate cortex. European Journal of Neuroscience, 17, 870–878. Thier, P., & Ilg, U. J. (2005). The neural basis of smooth-pursuit eye movements. Current Opinion in Neurobiology, 15, 645–652. Truchard, A. M., Ohzawa, I., & Freeman, R. D. (2000). Contrast gain control in the visual cortex: Monocular versus binocular mechanisms. Journal of Neuroscience, 20, 3017–3032.
Uhlrich, D. J., Cucchiaro, J. B., Humphrey, A. L., & Sherman, S. M. (1991). Morphology and axonal projection patterns of individual neurons in the cat perigeniculate nucleus. Journal of Neurophysiology, 65, 1528–1541.
Wang, X., Wei, Y., Vaingankar, V., Wang, Q., Koepsell, K., Sommer, F. T., et al. (2007). Feedforward excitation and inhibition evoke dual modes of firing in the cat’s visual thalamus during naturalistic viewing. Neuron, 55, 465–478.
Van Essen, D. C., Anderson, C. H., & Felleman, D. J. (1992, January 24). Information processing in the primate visual system: An integrated systems perspective. Science, 255, 419–423.
Webb, B. (2004). Neural mechanisms for prediction: Do insects have forward models? Trends in Neuroscience, 27, 278–282.
Van Horn, S. C., Erişir, A., & Sherman, S. M. (2000). The relative distribution of synapses in the A-laminae of the lateral geniculate nucleus of the cat. Journal of Comparative Neurology, 416, 509–520. Van Horn, S. C., & Sherman, S. M. (2004). Differences in projection patterns between large and small corticothalamic terminals. Journal of Comparative Neurology, 475, 406–415. Wang, W., Jones, H. E., Andolina, I. M., Salt, T. E., & Sillito, A. M. (2006). Functional alignment of feedback effects from visual cortex to thalamus. Nature Neuroscience, 9, 1330–1336.
Wilson, J. R., Friedlander, M. J., & Sherman, S. M. (1984). Fine structural morphology of identified X- and Y-cells in the cat’s lateral geniculate nucleus. Proceedings of the Royal Society of London, Series B, 221, 411–436. Yen, C.-T., & Jones, E. G. (1983). Intracellular staining of physiologically identified neurons and axons in the somatosensory thalamus of the cat. Brain Research, 280, 148–154. Zhan, X. J., Cox, C. L., Rinzel, J., & Sherman, S. M. (1999). Current clamp and modeling studies of low threshold calcium spikes in cells of the cat’s lateral geniculate nucleus. Journal of Neurophysiology, 81, 2360–2373.
Chapter 11
Vision

DALE PURVES
The purpose of visual percepts is to generate successful behavior based on the information in retinal stimuli. When photoreceptors capture a sufficient number of photons, a series of processing steps is initiated in retinal circuitry; the outcome is then carried centrally by action potentials in the optic nerve to further processing stations in the thalamus and primary visual cortex, eventually reaching the visual association cortices. Perception—defined as what we actually see—is the result of this processing. Despite enormous progress in understanding the organization of visual circuitry over the past 50 years, how this circuitry generates percepts is not yet understood. The focus of this chapter is on perception as such, with the expectation that what we see can tell us much about what the underlying circuitry is seeking to accomplish. To help readers not familiar with the human visual system follow the relevant evidence, the chapter begins with a brief account of visual stimuli, how transduction by photoreceptors initiates the neural activity that leads to perception, and an overview of the subsequent visual pathways. The bulk of the chapter, however, discusses how and why we see the basic perceptual qualities that characterize visual experience: brightness, color, form, depth, and motion. Since perception is, by definition, a conscious phenomenon that humans report far more readily than experimental animals, most of the evidence derives from human studies.
ORGANIZATION OF THE VISUAL SYSTEM

The primary visual pathway refers to the major route from retina to cortex that conveys the information in light stimuli that ultimately leads to perception (Figure 11.1; as indicated, centrally directed retinal pathways serve other functions as well). The primary visual pathway begins with the transduction of light energy by two types of retinal receptors, rods and cones, that define two overlapping but largely distinct light-level processing systems. The visual processing that rods initiate is primarily concerned
with perception at very low light levels, whereas cones only respond to greater light intensities and are responsible for the detail and color qualities that we normally think of as defining visual percepts. Subsequent to the retinal processing, the information arising from both rods and cones converges onto the retinal ganglion cells whose axons leave the retina in the optic nerve (see Figure 11.1). The major target of the retinal ganglion cells is the dorsal lateral geniculate nucleus in the thalamus, which comprises two magnocellular layers (so named because of the relatively large neurons in these layers) and four parvocellular layers containing smaller neurons. The distinct populations of neurons in these layers reflect substantially different functions that have perceptual consequences (Livingstone & Hubel, 1988). The magnocellular layers and the larger retinal ganglion cells that innervate them tend to process information about changes in the stimulus, and thus perceptions of motion. In contrast, the parvocellular layers tend to process information about spatial detail and color. Neurons in both the magnocellular and parvocellular layers of the thalamus are also extensively innervated by axons descending from the cortex and other brain regions. Although the function of this descending innervation is not known, the geniculate nucleus is clearly a station for processing visual information and not simply for passing it along to the cortex. The lateral geniculate neurons project in turn to the primary visual cortex, which is usually referred to as V1. Finally, the output neurons in the primary cortex project to additional visual cortical areas in the occipital, parietal, and temporal lobes. Because of the increasing integration of information from other brain regions in the visual cortical regions adjacent to V1, most investigators consider the primary visual cortex to be the terminus of the primary visual pathway.
Handbook of Adolescent Psychology, edited by Richard M. Lerner and Laurence Steinberg. Copyright # 2009 John Wiley & Sons, Inc. 224

Figure 11.1 (Figure C.4 in color section) The primary visual pathway (solid lines). The route that carries information centrally from the retina to those regions of the brain especially pertinent to what we see comprises the optic nerves, the optic tracts, the dorsal lateral geniculate nuclei in the thalamus, the optic radiations and the primary (or striate) and secondary (or extrastriate) visual cortices in the occipital lobes. The partial crossing of the optic nerve axons at the optic chiasm means that the left occipital lobe processes information arising from the right visual field, and conversely (note the view of the brain in the diagram is from the ventral aspect). Other central pathways to targets in the brainstem (dotted lines) determine pupil diameter as a function of retinal light levels, help to organize and effect eye movements, and influence circadian rhythms. (From Purves & Lotto, 2003) Figure labels: optic nerve; optic chiasm; optic tract; dorsal lateral geniculate nucleus; optic radiation; primary visual cortex; hypothalamus (circadian rhythm); Edinger-Westphal nucleus (pupillary light reflex); superior colliculus (orienting the movements of head and eyes).

The higher-order cortical processing areas adjacent to V1 (Figure 11.2) are called cortical association areas; with respect to vision, these regions are called the extrastriate visual areas (V1 is also called the striate cortex because of an anatomically distinct layer that effectively creates a striped appearance of cortical layer 4). Both clinical and experimental evidence has shown that these extrastriate regions of association cortex tend to process one or more of the qualities that define visual perception more extensively than others. Thus in humans and nonhuman primates, the area called V4 is especially important in processing information pertinent to color vision, and areas MT (for middle temporal) and MST (for medial superior temporal) are especially important for the generation of motion percepts (this
nomenclature derives from studies in nonhuman primates; the less distinct area in humans is often referred to as MT+). A further general rule is that the flow of information in the extrastriate cortical areas is organized into two relatively separate information streams that eventually feed into areas of association cortex in the temporal and parietal lobes, respectively (Figure 11.3; Ungerleider & Haxby, 1994; Ungerleider & Mishkin, 1982; see also Milner & Goodale, 1995). One of these loosely defined pathways, called the ventral stream, leads from the striate cortex to the temporal lobe. The information carried in this pathway appears to be concerned with high-resolution vision and object recognition, a conclusion that accords with other evidence about functions of the temporal lobe. The dorsal stream leads from striate cortex and visually relevant areas into the parietal lobe. This pathway appears to be primarily responsible for spatial aspects of vision, such as the analysis of motion, orienting attention, and positional relationships between objects in the visual scene.

Figure 11.2 (Figure C.5 in color section) Higher order visual areas in the human brain determined by functional magnetic resonance imaging in normal human subjects. A, B: Lateral and medial surface views of the brain. The primary visual cortex is indicated in green; the additional colored areas are the extrastriate areas. C: To better see the relation of these areas, the cortex has been computationally “inflated” to flatten its highly convoluted surface. V1-V4 and MT+ (V5) are indicated; VP is a ventral posterior area whose function is not well understood. (After Sereno et al., 1995)

Figure 11.3 (Figure C.6 in color section) The dorsal and ventral visual streams. The differential flow of information along these pathways has been documented in humans with functional magnetic resonance imaging (fMRI) and other methods. The ventral pathway conveys information to regions of the lateral and inferior temporal lobes and is especially important in the recognition of objects. The dorsal path runs to the adjacent areas of the parietal lobe and is more concerned with perception of the location of objects and orienting attention to them. These pathways are therefore referred to as the “what” and “where” streams, respectively. (From Blumenfeld, 2004) Figure labels: Where? (analysis of motion and spatial relations); What? (analysis of form and color); areas 17, 18, and 19.
VISUAL SYSTEM FUNCTION

To understand visual perception, some basic facts about visual function are necessary, beginning with the nature of visual stimuli. The photons to which the human eye is sensitive comprise only a minuscule fraction of the electromagnetic spectrum, namely photons with wavelengths of ~400 to 700 nm. Light is therefore defined by the fact that the human visual system has evolved to process this particular spectral range, presumably driven by the spectrum of sunlight at the surface of the earth, which has a strong peak at about 550 nm. Visual percepts can be elicited by amounts of light ranging from a few tens of photons/mm² at the retinal surface to values a billion or more times greater. This enormous range of responsiveness allows the visual system to generate percepts in widely varying circumstances, in starlight as well as daylight. To function over such a broad range, the visual system (like other sensory systems) continually resets its sensitivity according to ambient conditions. The primary purpose of this adaptation is to ensure that, despite the limitations of nerve cells, useful signaling can occur with maximum efficiency over the full range of pertinent environmental conditions. The rate of neuronal firing conveys information about stimulus intensity (the more action potentials per unit time, the more intense the stimulus), and the maximum rate of signaling is only a few hundred action potentials per second. This biological range is obviously inadequate to
Handbook of Adolescent Psychology, edited by Richard M. Lerner and Laurence Steinberg. Copyright # 2009 John Wiley & Sons, Inc. c11.indd Sec1:226
generate finely graded percepts that convey brightness values in response to a range of light intensities that spans 10 or more orders of magnitude. Thus, the sensitivity of the system is continually adjusted to match different levels of light intensity in the environment.
Visual acuity is a second aspect of visual function pertinent to many aspects of perception. Acuity refers to the fineness of discrimination, as in distinguishing two points from one another in a visual scene (as in the standard test of acuity used by optometrists). Although the visual world seems to be seen quite clearly, visual acuity in humans actually falls off rapidly as a function of eccentricity (angular distance from the line of sight). Consequently, vision outside the central few degrees of the visual field is extremely poor, and without a normally functioning central retina, vision operates at levels that qualify as legal blindness. Frequently moving the direction of gaze to different positions in visual space is essential for seeing objects in detail, which is what humans do during the normal inspection of a scene. On average, such eye movements, called saccades, occur 3 to 4 times a second. This difference in acuity according to where an image falls on the retina is largely due to the distribution of photoreceptors and the organization of the output to other retinal neurons (Figure 11.4). Cones, which as noted are responsible for detailed vision in daylight, greatly predominate in the central region of the retina, being most dense in a specialized region called the fovea. The fovea corresponds to the line of sight and the couple of degrees around it. The prevalence of cones falls off sharply in all directions as a function of distance from this locus; as a result, high-acuity vision is limited to the fovea and its immediate surroundings. In contrast, rods are sparse in the fovea and absent altogether in the middle of it. The rod system has little acuity
because many rods converge on the next level of neurons entailed in retinal processing; this arrangement enhances sensitivity, but at the expense of acuity. In consequence, sensitivity to a dim stimulus such as a spot of light is greater off the line of sight because of the paucity of rods in the fovea and their preponderance a few degrees away, even though acuity is less at this eccentricity. The reason people are generally unaware of the poor acuity of eccentric vision is simply that whenever it is important to see something clearly, the object of interest is brought onto the fovea by movements of the eyes, head, or body.
A third basic aspect of visual physiology pertinent to perception is the receptive field characteristics of visual neurons. The receptive field of any sensory neuron is defined as the region of the receptor surface that, when stimulated, elicits a response in the neuron being examined. The receptive fields of visual neurons, whether in the retina, thalamus, or cortex, are typically defined by the region of visual space that corresponds to the stimulated region of the retinal surface (Figure 11.5). In addition to responsiveness measured in spatial terms, visual neurons are also sensitive to many other characteristics of a stimulus. In Figure 11.5, for example, the neuron illustrated responds vigorously to a moving bar oriented at some angles but not others. By testing a neuron’s sensitivity to a range of differently oriented stimuli, a tuning curve can be defined that indicates the maximal responsiveness of the cell to a given feature, orientation in this example (see Hubel & Wiesel, 2005, for a detailed account of this seminal work). The receptive fields of cortical neurons serving central vision in the primary visual cortex generally measure less than a degree of visual angle across, as do the receptive fields of the corresponding retinal ganglion cells and lateral
Figure 11.4 (Figure C.7 in color section) The acuity of the visual system is determined primarily by the distribution of retinal receptors. Graph showing the density of rods and cones as a function of distance from the fovea. The poor resolution of vision a few degrees off the line of sight is a result of the relative paucity of cones at eccentricities greater than a few degrees, the line of sight corresponding to the center of the cone-rich fovea. (From Purves & Lotto, 2003)
[Figure 11.5 panels. A: stimulus orientation presented. B: time axis, 0 to 3 s.]
Figure 11.5 Example of the receptive field of a neuron recorded in the primary visual cortex of an experimental animal. As different stimuli are presented on a screen in front of the anesthetized animal, the neuron being recorded from fires in a variable way that defines both sensitivity to the stimulus location and the specific features to which it responds. In this instance, the neuron is especially sensitive to lines with a particular orientation. (After Howe, Purve, et al., 2005)
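The notion of a tuning curve described in the text can be illustrated with a short Python sketch. The firing rates below are invented for illustration and are not taken from Figure 11.5; the function names `preferred_orientation` and `orientations_above_half_peak` are likewise hypothetical. The sketch simply tabulates a response for each tested orientation and reads off the peak and the range of orientations that drive the cell at least half as strongly.

```python
# Hypothetical mean firing rates (spikes/s) of one cortical neuron for bars
# presented at eight orientations (degrees). Values are invented.
tuning_curve = {
    0: 4.0, 22.5: 6.0, 45: 14.0, 67.5: 31.0,
    90: 52.0, 112.5: 29.0, 135: 12.0, 157.5: 5.0,
}


def preferred_orientation(curve):
    """Return the orientation that elicits the highest firing rate."""
    return max(curve, key=curve.get)


def orientations_above_half_peak(curve):
    """Return the orientations whose rate is at least half the peak rate."""
    peak = max(curve.values())
    return sorted(o for o, rate in curve.items() if rate >= peak / 2)


best = preferred_orientation(tuning_curve)
broad = orientations_above_half_peak(tuning_curve)
print("preferred orientation:", best)
print("orientations at or above half the peak rate:", broad)
```

In this made-up example the cell responds best to one orientation and only weakly to orientations far from it, which is the sense in which such a neuron is said to be tuned.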
geniculate neurons. Even for cells serving peripheral vision, the receptive fields in primary visual cortex measure only a few degrees. In extrastriate cortical areas, however, receptive fields often cover a substantial fraction of the entire visual field (which extends about 180° horizontally and 130° vertically). The location of retinal activity (and the corresponding topographical relationships in the primary visual pathway assumed to underlie the sense of where an object is in the visual field) cannot be conveyed by neurons that respond to stimuli anywhere in such a large region of space. This relative mooting of retinal topography in higher-order visual cortical areas presents a problem for any rationalization of vision in terms of images and percepts based on image representation in the brain, a problem discussed later in the chapter.
Finally, the organization of the visual system is hierarchical in the sense that lower-order stations lead anatomically and functionally to higher-order ones, albeit with much modulation and feedback at each stage. At each of the “lower” stations in the primary visual pathway, the receptive field characteristics of the relevant neurons can be understood reasonably well in terms of the cells that provide their input. Thus, the responses of retinal ganglion cells can be rationalized on the basis of the rods and cones that supply information to them via the bipolar and other retinal cells, geniculate cell responses can be understood in terms of the ganglion cells that innervate them, and the responses of at least some classes of “lower-order” visual
cortical neurons make sense in terms of their geniculate inputs (reviewed in Hubel, 1988; Hubel & Wiesel, 2005). Beyond these initial processing levels, however, rationalizing the organization of the visual system in hierarchical terms (that is, in terms of lower-order neurons shaping the response properties of higher-order neurons) becomes difficult, since the cells in question are increasingly driven by a variety of other higher-order neurons, including many conveying information that is not primarily visual. The problem with the idea of a hierarchy of visual processing, however, is not simply that it is hard to explain higher-order receptive field properties in terms of lower-order ones. A more worrisome aspect of a visual hierarchy is the implication that there is some place in the brain where the various qualities processed in the primary and more specialized areas of cortex are brought together for purposes of perceiving them. This way of thinking has given rise to the idea that the activity of defined populations of visual neurons generates percepts by representing the various features of the world we actually see. This scenario, however, is not supported by present evidence about visual perception, and testing possible alternatives is the subject of much current research, as the following sections indicate.
BRIGHTNESS
A good place to begin any consideration of visual percepts and how they are related to the structure and function of the visual system is the perception of light and dark elicited by different stimulus intensities. Light intensity is measured physically as luminance, whereas the ensuing sensory quality is called brightness. (Technically, brightness refers to the appearance of a light source, such as a light bulb, and lightness to the appearance of a surface such as a piece of paper; for present purposes, brightness is used in its more general inclusive sense.) Not the least of the reasons for starting with brightness is that such percepts are arguably the most fundamental visual qualities that humans see. Human vision cannot occur without this perceptual quality, whereas other qualities (color, for example) are expendable (some highly visual animals have well-developed color vision while others don’t).
Psychophysical Measurements of Brightness
Like all percepts, brightness is subjective and can only be evaluated indirectly. A conceptually straightforward but technically difficult perceptual determination is the least energetic retinal stimulation that can be perceived at all in dark-adapted subjects. By varying the amount of energy delivered, a psychophysical function can be obtained that
defines the stimulus threshold value. At the threshold, subjects have difficulty saying whether they saw something or not; therefore, such tests are usually carried out using a paradigm in which the observer must respond on each trial (called forced choice). Typically, a series of trials is presented in which stimuli of different energetic levels are randomly interspersed with trials that do not present a stimulus. Since 50% “correct” (i.e., saying “Yes, I saw something” or “No, I saw nothing” when a stimulus was or was not present, respectively) would be the average result obtained if subjects merely guessed on each trial, 75% correct responses is conventionally taken to be the criterion for establishing the threshold level of stimulus energy. The relative as opposed to the absolute sensitivity of the visual system can be measured somewhat more easily by asking how much physical change is needed to generate a perceptual change at any level of luminance and at different stimulus wavelengths (Figure 11.6A). Such psychophysical functions have led to a number of important generalizations, one being the Weber-Fechner law. This law states that the ability to notice a difference in a stimulus (called the just noticeable or equally noticeable difference) is determined by a fixed proportion of the stimulus intensity, the proportion referred to as the Weber fraction. The proportional relationship between just noticeable differences and stimulus magnitude expressed by the Weber-Fechner law makes good sense: Because of the limited number of action potentials that neurons can generate per second, the visual system must continually adjust its overall range of operation to provide subjects with information about energy levels of light that for
humans span many orders of magnitude (see above); the Weber fraction presumably reflects this fact. Another psychophysical approach to evaluating brightness is called magnitude scaling and entails ordering percepts along a scale of subjective magnitude that covers the full range of the perceptual quality (Figure 11.6B). The most extensive studies of this sort were carried out by the Harvard psychologist Stanley Stevens, who worked on this issue from about 1950 to 1975 (Stevens, 1975). Stevens asked whether a light stimulus that is made progressively more intense elicits perceptions of brightness that linearly track physical intensity. In making such determinations, Stevens simply asked subjects to rate brightness on a number scale along which 0 represented the least sense of relative brightness in a test series, and 100 the greatest. In this manner, he determined that brightness scales as a power function with an exponent of ~0.5 under the standard conditions he used. The power functions found in such scaling experiments are sometimes referred to as reflecting Stevens’ Law. These observations show that the perception of light intensity is oddly nonlinear, a puzzling fact whose mystery is only deepened by the phenomena described in the next section.
Further Discrepancies between Luminance and Brightness
Given that brightness is the subjective experience of light intensity, a logical assumption would be that two objects in a scene that return the same amount of light to the eye should appear equally bright. Observations dating to the nineteenth century and earlier, however, showed that perceptions of
[Figure 11.6 graphs. Panel A: relative sensitivity (0 to 1.0) plotted against wavelength (400 to 700 nm). Panel B vertical axis: brightness (arbitrary units, 0 to 100); the curve is labeled “Stevens’ Power Law”.]
[Figure 11.6, panel B horizontal axis: luminance (arbitrary units), 0 to 100.]
Figure 11.6 Psychophysical assessments of the relationship between luminance and brightness. A: The human luminosity function, determined by measuring the sensitivity of normal subjects to light as a function of stimulus wavelength. This determination is usually made by evaluating just noticeable differences at suprathreshold levels. The results show that humans are far more sensitive to stimuli in the middle of the light spectrum. B: Illustration of magnitude scaling as a means of evaluating how human subjects perceive brightness. The results of such testing show that the relationship between the perception of brightness and the intensity of a light stimulus is a nonlinear power function (the exponent is ~0.5 in this instance). (After Purves & Lotto, 2003)
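The two psychophysical generalizations discussed in this section can be put in numerical form with a short Python sketch. It is only an illustration under stated assumptions: the Weber fraction of 0.08 is a placeholder rather than a value given in the text, the scale factor of 10 is chosen only so that 100 luminance units map onto 100 brightness units as on the axes of Figure 11.6B, and the function names are hypothetical. The exponent of 0.5 is the one quoted for Stevens' standard conditions.

```python
def just_noticeable_difference(intensity, weber_fraction=0.08):
    """Weber-Fechner relation: the detectable change is a fixed
    proportion (the Weber fraction) of the current stimulus intensity.
    The default fraction is an assumed, illustrative value."""
    return weber_fraction * intensity


def stevens_brightness(luminance, exponent=0.5, scale=10.0):
    """Power-law scaling: brightness = scale * luminance ** exponent.
    The scale factor is an assumption made to match a 0-100 rating scale."""
    return scale * luminance ** exponent


for level in (1.0, 100.0, 10000.0):
    # The absolute increment grows with intensity; the proportion does not.
    print("intensity", level, "JND", just_noticeable_difference(level))

for lum in (25.0, 50.0, 100.0):
    # With an exponent of 0.5, quadrupling luminance only doubles brightness.
    print("luminance", lum, "brightness", round(stevens_brightness(lum), 1))
```

The second loop makes the nonlinearity concrete: under these assumptions a stimulus of 25 units is rated half as bright as one of 100 units, not a quarter as bright.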
brightness fail to meet this seemingly simple expectation. For example, two patches returning equal amounts of light to the eye are perceived as looking differently bright when placed on backgrounds that have different luminances (Figure 11.7A). Thus, a patch on a background of relatively low luminance appears somewhat brighter than the same patch on a background of higher luminance, a phenomenon called simultaneous brightness contrast. This effect can be made much more dramatic when the stimulus includes more information about the possible real-world conditions underlying the stimulus (Figure 11.7B). Until relatively recently, the explanation of this sort of effect most often given was based on the properties of neurons at the input level of the visual system (the retina) and the lateral interactions that demonstrably occur in retinal processing. Presumably as a means of enhancing the detection of luminance contrast boundaries (edges), the central region of the receptive fields of lower-order visual neurons has a surround of opposite functional polarity. The number of action potentials generated per unit of time by neurons whose receptive fields intersect a contrast boundary will therefore differ from the activity of neurons whose receptive fields fall entirely on one side of the boundary or the other (Figure 11.8). For example, the neurons whose receptive field centers lie just within the target on the dark background in Figure 11.7A will fire at a higher rate than the neurons whose receptive field centers lie just within the target on the light background (because the former are less inhibited by their oppositely disposed receptive field surrounds than the latter). As a result, many investigators supposed that the patch on the dark background looks
brighter than the patch on the light background because of this difference in the retinal output. The percepts elicited by other stimulus patterns, however, undermine the idea that simultaneous brightness contrast effects are an incidental consequence of an anomalous retinal output in response to edges (or indeed any other aspect of retinal processing). In Figure 11.9, for example, the target patches on the left are surrounded by a greater area of higher luminance (lighter) territory than lower luminance, and yet appear brighter than the targets on the right, which are surrounded by more lower luminance (darker) territory than higher. Although the average luminance values of the surrounds in this stimulus are effectively opposite those in the standard simultaneous brightness contrast stimulus in Figure 11.7A, the brightness differences elicited are about the same in both direction and magnitude as in the standard presentation. The perceptions of brightness elicited by the stimulus in Figure 11.7B are also difficult to explain in terms of a rule such as that illustrated in Figure 11.8. If the output of retinal neurons can’t account for the relative brightness values seen in response to stimuli such as the simple patterns in Figures 11.7 and 11.9, what then is the explanation? An alternative framework for thinking about this problem is based on the fact that the significance of the luminance in any part of a retinal image is unknowable by any direct operation on the stimulus as such. The reason for this counterintuitive statement is not hard to see. Three fundamental aspects of the physical world determine luminance: the illumination of objects, the reflectance of object surfaces, and the transmittance of the space between the objects and the observer. As indicated in Figure 11.10,
Figure 11.7 Simultaneous brightness contrast. A: Standard presentation of this effect; the two diamond-shaped patches have the same luminance (see key), but the one in the dark surround looks somewhat brighter. B: Simultaneous brightness contrast effects can be much greater when the scene contains more detailed contextual information; as shown in the key, the relevant patches look very different in brightness although they again have the same luminance. (After Purves & Lotto, 2003)
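[Figure 11.8 diagram. Top: five “on”-center ganglion cells (A to E) positioned along a dark-to-light edge. Bottom: response rate of each cell plotted against position, relative to the spontaneous level of activity.]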
Figure 11.8 Lateral interactions in the retina and their effect on retinal output. Diagram of the different firing rates of retinal ganglion cells as a function of their position with respect to a light-dark contrast boundary. See text for further explanation. (After Purves et al., 2008)
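The lateral-interaction account illustrated in Figure 11.8 can be sketched as a toy one-dimensional model in Python. Everything numerical here is an assumption made for illustration: the luminance values, the surround weight of 0.4 per neighbor, the spontaneous rate of 40, and the function name `ganglion_response` are hypothetical and are not drawn from recordings. The model gives each on-center cell an excitatory center and an inhibitory surround made of its two immediate neighbors.

```python
def ganglion_response(luminance, i, surround_weight=0.4, spontaneous=40.0):
    """Firing rate of a toy on-center cell centered on sample i.

    The center is excited by luminance[i]; each immediate neighbor
    inhibits with the given weight. Firing rates cannot be negative,
    so the result is clipped at zero. All parameters are illustrative.
    """
    center = luminance[i]
    surround = surround_weight * (luminance[i - 1] + luminance[i + 1])
    return max(0.0, spontaneous + center - surround)


# A dark region followed by a light region (a single contrast boundary).
edge = [10.0] * 5 + [100.0] * 5
rates = [ganglion_response(edge, i) for i in range(1, len(edge) - 1)]
print("rates across the edge:", rates)

# The same mid-gray target (50) next to a dark or a light background.
target_on_dark = [10.0, 10.0, 50.0, 50.0]
target_on_light = [100.0, 100.0, 50.0, 50.0]
r_dark = ganglion_response(target_on_dark, 2)    # center just inside the target
r_light = ganglion_response(target_on_light, 2)  # center just inside the target
print("target edge cell on dark background:", r_dark)
print("target edge cell on light background:", r_light)
```

Under these assumptions the cells flanking the boundary fire less (dark side) or more (light side) than cells far from it, and the cell just inside the target fires more on the dark background than on the light one. This reproduces the retinal explanation of simultaneous brightness contrast; it does not, of course, address the stimuli that this explanation fails to account for.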
Figure 11.9 Stimulus pattern that elicits perceptual effects that cannot be explained in terms of lateral interactions arising from local contrast effects and any simple rule such as “a target surrounded by less luminant surfaces should appear brighter than the same target surrounded by more luminant surfaces”. The pattern is called “White’s illusion” after the psychologist who first described this effect. (After White, 1979)
the relative contributions of these factors are inevitably conflated in the retinal image. Thus, many different combinations of illumination, reflectance, and transmittance can give rise to the same value of luminance; as a result, there is no direct way that the visual system can know how these factors have actually been combined to generate a particular retinal luminance value. Because appropriate behavior requires responses that accord with the physical source of a stimulus, this uncertainty presents an enormous challenge in the evolution of vision; by the same token, this fact makes clear that if brightness percepts were simply
proportional to the luminance values in a stimulus, the percepts would not be a guide to successful behavior. A biologically useful approach would be possible, however, if brightness percepts were generated empirically according to the past success or failure observers had experienced interacting with different combinations of illumination, reflectance, and transmittance in natural scenes. In this framework, brightness would correspond to the relative frequency with which different possible combinations proved to be the source of the same or similar stimuli in the enormous number of visual scenes witnessed during the course of evolution (as well as by individual observers during their lifetimes). This general idea resonates to some degree with Hermann Helmholtz’s suggestion in the nineteenth century that empirical information might be needed to augment what he took to be the more or less veridical information supplied by peripheral sensory mechanisms (Helmholtz, 1866). As indicated in Figure 11.10, however, retinal receptors can’t provide any unambiguous information about the state of the world. Thus, the more radical idea now being examined is that visual processing is empirical from start to finish, the brightness seen by an observer being determined by the empirical significance of the stimulus for behavior, rather than the intensity of light falling on the retina (reviewed in Purves & Lotto, 2003; Purves, Williams, Nundy, & Lotto, 2004). The biological rationale for this way of seeing brightness is that by using the outcome of experience accumulated by trial and error during phylogenetic and ontogenetic experience (i.e., what worked as a percept in response to a given stimulus), percepts and the ensuing behaviors come to have an increasingly better chance of responding successfully to retinal images whose meaning is otherwise unknowable.
Central Processing of Luminance
How and where perceptions of brightness are generated centrally is not understood.
There is no cortical region that is especially concerned with processing brightness in the way V4 cells are especially concerned with color or MT neurons with motion. Moreover, the close relationship between light intensity and the firing rate of retinal ganglion cells diminishes as neurons are tested in increasingly central stations of the visual system. Thus, cells in the lateral geniculate nucleus respond in more or less the same way to intensity as retinal ganglion cells, whereas most neurons in the visual cortex respond only weakly to changes in stimulus intensity as such. The key observations made by neurophysiologists Stephen Kuffler, David Hubel, and Torsten Wiesel working first at Johns Hopkins and then at Harvard Medical School in the 1950s and 1960s showed that what neurons at the level of the primary visual cortex respond
[Figure 11.10 diagram labels: Illumination, Reflectance, Transmittance, Stimulus.]
Figure 11.10 (Figure C.8 in color section) The inevitable conflation in light stimuli of illumination, reflectance and transmittance, the factors that an observer must parse in order to respond appropriately to the pattern of luminance values in any visual stimulus. The difficulty for vision presented by this conflation of information in the retinal image is an example of the “inverse optics problem.” (From Lotto & Purves, 2003)
to with respect to luminance is not the intensity of a light stimulus, but the contrast between light and dark regions in the stimulus (see Figure 11.5; Hubel, 1988; Hubel & Wiesel, 2005; Kuffler, 1953). Increasingly central neurons are more concerned with the configuration of the complex stimuli and less concerned with luminance levels per se. This key fact must be important in the way luminance is related to brightness, but it is not yet clear how.
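The conflation of illumination, reflectance, and transmittance described in this section (Figure 11.10) can be made concrete with a small Python sketch. It rests on a simplifying assumption that is not spelled out in the text, namely that the luminance reaching the eye can be treated as the product of the three factors; the scene descriptions, the numerical values, and the function name `luminance` are hypothetical.

```python
def luminance(illumination, reflectance, transmittance):
    """Light returned to the eye, modeled (as an assumption) as a product."""
    return illumination * reflectance * transmittance


# Hypothetical scenes: (description, illumination, reflectance, transmittance).
scenes = [
    ("bright light, dark surface, clear air", 150.0, 0.3, 1.0),
    ("dim light, light surface, clear air", 50.0, 0.9, 1.0),
    ("bright light, light surface, haze", 100.0, 0.9, 0.5),
]

values = [round(luminance(i, r, t), 6) for _, i, r, t in scenes]
for scene, value in zip(scenes, values):
    print(scene[0], "->", value)
```

Three physically different scenes yield the same luminance value, so the value alone cannot reveal which combination produced it. This is the sense in which the sources of a retinal stimulus are unknowable from the stimulus as such.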
COLOR
A second basic quality of human visual perception is color. Recall that brightness is defined as the perceptual category elicited by the overall amount of light in a visual stimulus. Color is the perceptual category generated by the distribution of that amount across the visible spectrum, that is, the relative amount of energy at short, middle, and longer wavelengths in a stimulus (Figure 11.11A). The experience of color actually comprises three perceptual qualities: (1) hue, which is the perception of the relative redness, blueness, greenness, or yellowness of a stimulus; (2) saturation, which is the degree to which the perception approaches a neutral gray (e.g., a highly unsaturated red is a percept that appears largely gray but nonetheless has an appreciable reddish tinge); and (3) color brightness, which is the same perceptual category described previously, but applied to a stimulus that elicits a discernible hue. Together these qualities describe a perceptual color space (Figure 11.11B). The ability to see colors has presumably evolved in humans and many other mammals because perceiving spectral differences allows observers to distinguish surfaces in the natural world more effectively than distinctions made solely on the basis of luminance. The fact that color vision has not evolved to any great extent in many visual animals indicates that the ecological value of color perception is far less than the value of brightness perception, which is presumably essential to form vision.
Initiation of Color Percepts
Rods play little part in human color vision, as is apparent from the fact that we don’t see color well, or eventually at all, in dim light where rod vision predominates. The reason is that all rods contain the same photopigment (rhodopsin), whereas three different cone photopigments (called cone opsins) characterize three different cone types. As a result, each cone type has a different absorption spectrum, and therefore responds best to a different portion of the visible spectrum (roughly speaking, to long, middle, and short wavelengths, respectively; see Figure 11.11A; Rodieck, 1998). The different responsiveness of the three cone types allows the cones to report information about the distribution of energy in a light stimulus, and thus to generate information leading eventually to percepts of hue and saturation as well as brightness. In contrast, rods can only report the amount of light they capture, and can thus generate only information pertinent to brightness.
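Why a single photopigment cannot signal wavelength, whereas three can, is illustrated by the following Python sketch. It is a caricature under stated assumptions: the absorption curves are modeled as Gaussians with a common 40 nm width, which real photopigment spectra are not, and the peak wavelengths are approximate, commonly quoted figures rather than values given in this chapter. The names `absorption` and `cone_signal` are hypothetical.

```python
import math

# Approximate peak sensitivities in nm (assumed, rounded values). The
# Gaussian shape and the shared 40 nm width are simplifying assumptions.
PEAKS = {"S": 420.0, "M": 534.0, "L": 564.0, "rod": 498.0}
WIDTH = 40.0


def absorption(receptor, wavelength, intensity=1.0):
    """Light captured by one receptor type from a monochromatic stimulus."""
    z = (wavelength - PEAKS[receptor]) / WIDTH
    return intensity * math.exp(-0.5 * z * z)


def cone_signal(wavelength, intensity=1.0):
    """Triplet of (S, M, L) cone captures for a monochromatic stimulus."""
    return tuple(absorption(c, wavelength, intensity) for c in ("S", "M", "L"))


# Two lights placed symmetrically about the assumed rod peak.
shorter, longer = 460.0, 536.0
rod_shorter = absorption("rod", shorter)
rod_longer = absorption("rod", longer)
print("rod capture:", rod_shorter, rod_longer)
print("cone captures at", shorter, "nm:", cone_signal(shorter))
print("cone captures at", longer, "nm:", cone_signal(longer))
```

In this sketch the two lights are indistinguishable to the rods, which capture the same amount from each, while the pattern of activity across the three cone types differs markedly. Comparing captures across receptor types is what makes information about spectral distribution available.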
Figure 11.11 The perception of color. A: (Figure C.9 in color section) The solid curves indicate the absorption properties of the three cone types in the human retina, showing their differential sensitivity to short, medium and long wavelength light (dotted curve shows rod absorption). B: (Figure C.10 in color section) Diagram of perceptual “color space” for humans. At any particular level of light intensity, movements around the perimeter of the relevant plane or “color circle” correspond to changes in hue (i.e., changes in the apparent contribution of red, green, blue or yellow to the percept), whereas movements along the radial axis correspond to changes in saturation (i.e., changes in the approximation of the color to the perception of a neutral gray). Each of the four primary color categories (red, green, blue and yellow) is characterized by a unique hue (indicated by dots) that has no apparent admixture of the other three (i.e., a color experience that cannot be seen or imagined as a mixture of any other colors). These four colors are considered primary because of their perceptual uniqueness. (After Lotto & Purves, 2003)
The fact that human color vision is based on the different sensitivity of three cone types is called trichromacy; color vision in most other mammals that have significant color vision is based on only two cone types, and they are thus referred to as dichromats (Mollon, 1995; Rodieck, 1998). A common disorder of color perception is anomalous color vision based on a genetic defect in one (or sometimes more) of the three cone types, effectively creating human dichromats. The most common form of this sort of color blindness is deficiency of a single cone type, which affects about 5% of U.S. males (this inherited genetic defect is located on the X chromosome, which explains its overwhelming predominance in males; Nathans, 1987). Although such individuals cannot distinguish red and green hues, or less commonly blue and yellow ones, this inability presents little practical difficulty in daily life.
Central Processing of Spectral Differences
While successfully accounting for many aspects of color perception in the laboratory, explanations of color vision based on retinal output from the three human cone types
have long been recognized to be inadequate, in much the same way that retinal output determined by luminance does not explain the brightness values that people see. This realization dates back to the nineteenth century when Helmholtz’s contemporary, Ewald Hering, pointed out that some aspects of color perception cannot be fully understood simply on the basis of three cone types (Turner, 1994). For example, humans with normal color vision perceive red to be an opponent color to green, and blue to be an opponent color to yellow. Whereas observers can see and/or imagine a gradual transition from red to yellow through a series of intermediate colors (orange colors), there is no parallel perception—or conception—of how to get from red to green, or from blue to yellow except through gray or through one of the other primaries. Furthermore, in contrast to the way observers see orange as a mixture of red and yellow, or purple as a mixture of blue and red, humans perceive a particular hue of red, green, blue, and yellow to be unique in the sense of not being a mixture of any other colors (see Figure 11.11B). Because simply having three cone types offers no explanation of these perceptual phenomena, Hering argued correctly that the comparisons made by the
three cone types provide only a partial account of the colors we end up seeing and how color sensations are generated. Central visual processing modulates the information generated by the retina to determine color percepts, just as central processing modulates luminance information to generate brightness percepts. Understanding of the neural basis of opponent color percepts has been advanced by modern electrophysiological studies of wavelength-sensitive neurons at different stations of the visual system of nonhuman primates and other species with color vision. The majority of color-sensitive neurons in the retina and lateral geniculate nucleus of the thalamus have receptive fields that are organized in a color opponent fashion. Such cells are excited by light of one wavelength (e.g., long wavelength or “red” light) illuminating the center of their receptive field, and inhibited (or “opposed”) by another wavelength (e.g., middle wavelength or “green” light) falling in the region surrounding the center of the receptive field. In macaque monkeys (which have color vision nearly identical to that of humans), most (but not all) color opponent cells are antagonistic with respect to wavelengths that appear red and green, or blue and yellow (Hubel, 1988; DeValois & DeValois, 1993; Gegenfurtner, 2003). In addition to red/green and blue/yellow classes of opponent cells, other neurons are insensitive to differences in wavelength. These cells are often considered “white/black” opponent neurons. The explanation usually given for the perceptual phenomena that Hering first noted is that perceptions of color are elicited by neurons comprising three color “channels” that operate in a push/pull fashion. For instance, when the neurons responsible for seeing red are excited, those responsible for green are inhibited, and vice versa.
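The push/pull arrangement of the three channels can be sketched in Python as a simple recoding of cone signals. The particular weights used here are a common textbook simplification and are assumptions, not values given in this chapter; the function name `opponent_channels` and the input values are hypothetical.

```python
def opponent_channels(l, m, s):
    """Recode cone signals (L, M, S) into three channels.

    red_green:   positive when 'red' wins, negative when 'green' wins.
    blue_yellow: positive when 'blue' wins, negative when 'yellow' wins.
    white_black: a channel that ignores wavelength differences.
    The weights are illustrative assumptions.
    """
    red_green = l - m
    blue_yellow = s - (l + m) / 2.0
    white_black = (l + m) / 2.0
    return red_green, blue_yellow, white_black


reddish = opponent_channels(0.9, 0.3, 0.1)   # long-wavelength cones dominate
greenish = opponent_channels(0.3, 0.9, 0.1)  # middle-wavelength cones dominate
neutral = opponent_channels(0.5, 0.5, 0.5)   # all cone types equally active

print("reddish stimulus:", reddish)
print("greenish stimulus:", greenish)
print("neutral stimulus:", neutral)
```

Because a single signed quantity carries the red/green comparison, the channel cannot signal red and green at the same time, and a stimulus that drives the cones equally leaves both chromatic channels at zero. This is one way of picturing why a reddish green is not seen and why the path from red to green passes through gray.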
Color percepts clearly arise from processing at higher levels in the brain and not just from the presence of three cone types, even though the details and consequences of opponent color processing are not yet understood. Additional information about central color processing has come from studies in nonhuman primates carried out over the past 25 years (Zeki, 1983a, 1983b, 1989, 1993). This work and related clinical studies leave little doubt that extrastriate area V4 is important in color processing (see Figure 11.2). Neuropsychological and imaging studies of patients suffering from a condition called cerebral achromatopsia have confirmed the relative specialization of this cortical region. In effect, such individuals lose the ability to see the world in color, whereas other aspects of vision such as brightness and form vision remain intact. As indicated in Figure 11.12, these patients typically have damage to extrastriate visual areas, with V4 being the most commonly affected site. Further evidence that this general area of cortex is concerned with color processing comes from functional imaging studies of normal subjects, which
[Figure 11.12 labels: Achromatopsia; panels (A) and (B); inset section level Z = 36; color scale: percent overlap, 20–80]
Figure 11.12 (Figure C.11 in color section) Damage to the ventral extrastriate occipital cortex (which includes area V4) can lead to an inability to see color (achromatopsia), despite being able to see brightness and form more or less normally. A: Degree of overlap in the location of lesions in a series of patients with achromatopsia as well as other neurological deficits. Given the anatomy of the primary visual pathway, such patients are often blind to stimuli of any sort in the contralateral visual field (see Figure 11.1). B: Degree of overlap in 11 patients with achromatopsia as the primary symptom. The narrower overlap in these patients is consistent with the conclusion that the integrity of V4 is important for color vision. Inset shows level of the horizontal sections shown. (From Bouvier & Engel, 2006)
show that much the same regional activation is elicited by color processing tasks. The neurologist and essayist Oliver Sacks (1995, 1996) had a patient who described objects in visual scenes as all being “dirty” shades of gray. When asked to draw objects from memory, he had no difficulty with the relevant shapes or with shading, but was unable to appropriately color the things he represented (e.g., he could draw a banana, but couldn’t color it yellow). As in any brain lesion study, however, some caution is warranted because of the great variability of cortical damage and uncertainty about the extent of neurological damage. Whereas V4 seems a key component of central color processing, a number of related extrastriate areas probably participate in generating color percepts as well.
Color Contrast and Constancy
Like perceptions of brightness, the colors we see are strongly influenced by the overall pattern of light in any
particular stimulus. For example, a stimulus patch generating exactly the same distribution of power at various wavelengths can appear quite different in color depending on its surroundings, a phenomenon called color contrast (Figure 11.13). Conversely, patches in a scene returning different spectra to the eye can appear to be much the same color, an effect called color constancy. Although these phenomena were well known to psychologists and vision scientists more than 100 years ago, they were emphasized by Edwin Land’s work in the late 1950s (Land, 1986). Land, an independent photochemist who, among other achievements, invented polarizing filters and instant photography and founded the Polaroid Corporation, used three adjustable lights generating short, middle, and long wavelength light respectively to illuminate a collage of colored papers. He used spectrophotometry to show that two patches that appeared to be different colors in white light (e.g., green and brown) continued to elicit much the same color percepts when the three illuminators were adjusted so that the light being returned from the “green” surface produced exactly the same readings on a spectrophotometer that had previously come from the “brown” surface, a striking demonstration of color constancy. Color contrast and constancy effects raise much the same problem for understanding color processing as the contextual
[Figure 11.13 labels: Contrast (“Blue”, “Yellow”); Constancy (“Red”, “Red”)]
brightness effects do for understanding how achromatic percepts are generated. Together, these phenomena have led to a debate about brightness and color percepts that now spans more than a century. The issue is how global information about the spectral context in scenes is integrated with local spectral information to produce color percepts. Land tried to explain such effects by a series of algorithms that integrated the spectral returns of different regions over the entire scene (the implication being that color processing in the visual system implemented these algorithms). It was recognized even before Land’s death in 1991, however, that this so-called “retinex theory” did not work in all circumstances and was a description, not a physiological explanation. Other vision scientists have emphasized the opponent receptive field properties of central neurons, double opponent cells in particular, as possible neural substrates for such effects (these are neurons in which the activity of the surround is inhibited by activation of the receptive field center and vice versa; see Hubel, 1988, for a general description). Another approach has focused more specifically on the interaction of the three human cone types in responding to the foreground and background components of retinal images (e.g., Shevell & Monnier, 2006). Still others have provided evidence that, like achromatic brightness, perceptions of hue, saturation, and color brightness are generated
Figure 11.13 (Figure C.12 in color section) Demonstrations of color contrast and constancy. Color contrast. The four blue patches on the top surface of the cube in the left panel and the seven yellow patches on the cube in the right panel are actually identical gray patches (see upper key below). Thus patches that are physically the same can be made to appear either blue or yellow by changing the spectral context in which they occur. Color constancy. Patches that have very different spectra (see the five different colored patches on the left and right in the lower key) can be made to look more or less the same color (red in this case) by contextual information. (From Purves & Lotto, 2003)
according to the empirical significance of spectral stimuli in past experience (Long, Yang, & Purves, 2006; Purves & Lotto, 2003; Purves et al., 2004). There is as yet no consensus about how central visual processing integrates local and global spectral information to produce the remarkable phenomenology of color perception.
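The general idea of combining local spectral returns with information integrated over the whole scene can be illustrated with a deliberately simple calculation. The Python sketch below divides each patch's return in every wavelength band by the scene-wide mean for that band, a scene-average normalization in the spirit of "gray-world" scaling. It is a stand-in chosen for brevity and is not Land's retinex algorithm or a claim about visual physiology; the function names, reflectances, and illuminants are all hypothetical.

```python
def light_returned(reflectances, illuminant):
    """Light reaching the eye per band: surface reflectance times illuminant power."""
    return [tuple(r * e for r, e in zip(surface, illuminant)) for surface in reflectances]


def normalize_by_scene(returns):
    """Divide each patch's return in every band by the scene-wide mean for that band."""
    n_bands = len(returns[0])
    means = [sum(patch[b] for patch in returns) / len(returns) for b in range(n_bands)]
    return [tuple(patch[b] / means[b] for b in range(n_bands)) for patch in returns]


# Hypothetical (short, middle, long) reflectances for three surfaces in a collage.
surfaces = [(0.2, 0.6, 0.3), (0.3, 0.4, 0.5), (0.5, 0.5, 0.5)]
white_light = (1.0, 1.0, 1.0)
reddish_light = (0.5, 1.0, 2.0)

under_white = normalize_by_scene(light_returned(surfaces, white_light))
under_reddish = normalize_by_scene(light_returned(surfaces, reddish_light))

# Constancy: the raw returns differ between illuminants, the normalized values do not.
for a, b in zip(under_white, under_reddish):
    print([round(x, 3) for x in a], [round(x, 3) for x in b])
```

The same normalization also produces a contrast-like effect: an identical patch placed among different neighbors receives different normalized values because the scene means change. As the text notes for retinex, a scheme of this kind is a description that fails in some circumstances, not a physiological explanation.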
PERCEPTION OF FORM
A third basic quality of vision is the perception of form. In the simplest case, perceptions of form entail geometrical characteristics such as the length of lines, their apparent orientation, and the angles they make with other lines. Understanding the perception of such simple stimuli is a first step toward understanding how more complex objects are perceived.
Seeing Simple Geometries
A starting point in exploring how the visual system generates perceptions of form is examining how observers perceive the distance between two points in a visual stimulus, as in the perceived length of a line or the dimensions (size) of a simple geometrical shape. It is logical to suppose that the perception of a given line (e.g., a line drawn on a piece of paper or presented on a computer screen) should correspond more or less directly to its projected length in the retinal image. But as in the case of brightness and color, what we see does not correspond to physical reality. A well-studied example is the variation in the perceived length of a line as a function of its orientation (Figure 11.14A). As investigators have repeatedly shown over the past 150 years, a line oriented more or less vertically in the retinal image appears to be significantly longer than a horizontal
line of the same length, the maximum length being seen, oddly enough, when the stimulus is oriented about 30˚ from vertical (Figure 11.14B; Howe & Purves, 2005). This effect is a particular manifestation of a general tendency to perceive the extent of any spatial interval differently as a function of its orientation in the retinal image. For instance, as the psychologist Wilhelm Wundt (1862) first showed, the apparent distance between a pair of dots varies systematically with the orientation of an imaginary line between them, and a perfect square or circle appears to be slightly elongated along its vertical axis. There is a rich literature on the perceptions elicited by simple geometrical stimuli, showing that measurements made with rulers or protractors are typically at odds with the corresponding percepts (Robinson, 1972/1998). Some of the most familiar of these “geometrical illusions”—and the ones whose etiology has been most hotly debated—are illustrated in Figure 11.15. The first example is attributed to Hering, who showed that two parallel lines (indicated in red) appear bowed away from each other when presented on a background of converging lines (Figure 11.15A). In the Poggendorff illusion, the continuation of a line interrupted by a bar appears to be displaced vertically, even though the two line segments are actually collinear (Figure 11.15B). The inverted T illusion is effectively the phenomenon of vertical lines looking longer than horizontal ones described above (Figure 11.15C). In the more complex Müller-Lyer illusion, the line terminated by arrow tails looks longer than the same line terminated by arrowheads (Figure 11.15D). In the Ponzo illusion, the upper horizontal line appears longer than the lower one, despite the fact that they are again identical (Figure 11.15E). All these effects are apparent in natural scenes and, as in brightness and color, can be enhanced by more complex contextual information: the
Figure 11.14 Variation in apparent line length as a function of orientation. A) The horizontal line in this figure looks significantly shorter than the vertical or oblique lines, despite the fact that all the lines are identical in length. B) Quantitative assessment of the apparent length of a line reported by subjects as a function of its orientation in the stimulus (orientation is expressed as the angle between the line and the horizontal axis; x-axis: orientation of a line in retinal image, 0–180 deg; y-axis: perceived line length, 1.00–1.15). The maximum length seen by observers occurs when the line is oriented approximately 30° from vertical, at which point it appears about 10–15% longer than the minimum length seen when the stimulus is horizontal (in this graph the reference is the apparent length of the line when horizontal, which is plotted as 1.00). (After Howe & Purves, 2005)
[Figure 11.15: panels (A)–(F); panel (F) carries the annotation “Rotate 90°”]
Figure 11.15 Examples of several much-studied geometrical illusions. A: The Hering illusion. B: The Poggendorff illusion. C: The inverted T illusion. D: The Müller-Lyer illusion. E: The Ponzo illusion. F: The table-top illusion. (From Purves & Lotto, 2003)
table-top illusion created by psychologist Roger Shepard is a good example (Figure 11.15F). Several of these stimuli are effectively “size contrast” stimuli, similar in principle to brightness contrast and color contrast stimuli. That is, the same physical stimulus appears different when placed in different contexts.
Neural Processing of Form
Recall that receptive field properties define the physiological response of visual neurons. With respect to the orientation of a line, for example, neurons in V1 respond selectively to lines shown at different angles (see Figure 11.5). Related studies have shown that many V1 cells are also selective for particular line lengths (so-called “end-stopped cells”). As a result, it is attractive to suppose that the activity of neurons in visual cortex that respond best to a stimulus feature, for example, a particular orientation or length of a line,
corresponds to a more or less direct representation of the perception of that stimulus feature in the retinal image. There are, however, a number of problems with this intuitively appealing view. The major obstacle is the uncertain relationship between retinal image features such as length and orientation, and the length and orientation of objects in the real world (Figure 11.16). The challenge of this ambiguous link between retinal stimuli and their physical sources has been recognized since the beginning of the eighteenth century, and is the same in principle as the conflation of information illustrated in Figure 11.10. As indicated in Figure 11.16, the problem is that images on the retina cannot, by their nature, uniquely specify the physical geometry of the objects in a scene. As in the case of brightness and color, the strategy needed to deal with this problem must somehow rely on empirical information because there is no logical solution of the quandary illustrated in Figure 11.16. As already noted, the idea that vision must in some way use empirical information goes back to Helmholtz’s writings in the nineteenth century (i.e., the idea that retinal information simply had to be given a helping hand by prior experience). Only recently, however, have vision scientists considered the possibility that percepts might be generated entirely on the basis of the empirical success or failure of visual experience during the evolution of vision. Evidence for a wholly empirical basis for the perception of simple forms comes from studies of the percepts elicited by the sort of stimuli shown in Figures 11.14 and 11.15 (reviewed in Howe & Purves, 2005). For example, laser range-scanning can provide a database of natural scene geometry that can serve as a proxy for accumulated human experience with the relationship between retinal projections at different orientations and their probable source (Figure 11.17). 
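The ambiguity of the retinal image can be made concrete with the standard formula for the angle subtended by an object. The Python sketch below is a minimal illustration that assumes lines lying in the frontal plane and uses hypothetical sizes and distances; the function name `visual_angle_deg` is introduced here only for the example.

```python
import math


def visual_angle_deg(object_size, distance):
    """Angle subtended at the eye by a frontoparallel line of the given size and distance."""
    return math.degrees(2.0 * math.atan(object_size / (2.0 * distance)))


# Hypothetical sources (size in meters, distance in meters) with the same size/distance ratio.
sources = [(0.1, 1.0), (1.0, 10.0), (2.0, 20.0)]
for size, distance in sources:
    # All three sources subtend the same angle, so they project identically on the retina.
    print(size, distance, round(visual_angle_deg(size, distance), 3))
```

Because any source with the same ratio of size to distance yields the same projection, the angle alone cannot be inverted to recover the source. Allowing the line to be slanted in depth adds a further unknown and makes the ambiguity worse.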
Such studies have shown that this cumulative information about the frequency with which the physical sources of lines generate retinal projections of a given length in different orientations corresponds remarkably well with the way people see line lengths (Figure 11.18). The peculiar perceptual function in Figure 11.14B is accurately predicted on this basis (cf. Figure 11.18B). The other geometrical illusions shown in Figure 11.15 can likewise be explained in terms of the statistical link between retinal images and the sources of such geometries in the natural scenes that humans have always had to contend with in their behavior. This observation in the perception of geometry is consistent with empirical explanations of brightness and color percepts (see Purves & Lotto, 2003). How the visual system generates these statistical relationships is not known, but some recent evidence concerning the processing of forms is at least consistent with the idea that visual processing operates in this general way.
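The kind of ranking involved can be sketched in a few lines of Python. The example below computes the percentile rank of a projected line length within an empirical distribution of projected lengths for each orientation. The data are invented for illustration and are not taken from the range-scanning database; the function name `percentile_rank` is hypothetical, and the sketch assumes that a higher rank goes with a longer apparent length.

```python
from bisect import bisect_right


def percentile_rank(samples, value):
    """Percentage of samples that are less than or equal to value."""
    ordered = sorted(samples)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Invented projected lengths (pixels) of physical sources, grouped by orientation in degrees.
projected_lengths = {
    0: [2, 4, 6, 8, 10, 12, 14, 16],    # horizontal
    90: [1, 2, 3, 4, 5, 6, 8, 10],      # vertical
}

test_length = 5
for orientation, lengths in projected_lengths.items():
    print(orientation, percentile_rank(lengths, test_length))
# prints "0 25.0" and then "90 62.5"
```

In this toy data set the same 5-pixel projection ranks higher among vertical projections than among horizontal ones, so under the stated assumption it would be seen as longer when vertical.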
[Figure 11.16 label: Retinal projection]
Figure 11.16 The inverse optics problem in geometry. The inherent ambiguity of retinal stimuli is illustrated here in terms of the perception of objects in space. The same linear projection on the retina can derive from an infinite number of linear objects at different distances, of different sizes, and in different spatial orientations. As a result, a retinal image cannot specify its physical source. (After Howe & Purves, 2005)
Figure 11.17 (Figure C.13 in color section) Relating real-world geometry to retinal projections. A: Laser range scanning apparatus used to determine the physical geometry of scenes. The distances of object surfaces determined in this way are accurate to within a few millimeters. B: Representative images acquired by the range scanning. Ordinary color images of the scenes are shown on the left and the corresponding range images acquired by the laser scanner on the right. Color-coding indicates the physical distance of each point in the scene from the origin of the laser beam. (After Howe & Purves, 2005)
A key observation in this regard is that the activity of some visual cortical neurons cannot be understood in terms of their receptive field properties, at least as these properties have been conventionally defined (see Figure 11.5). For instance, the same pattern of neuronal activity in V1 can be elicited by differently oriented stimuli moving in different directions at different speeds (Figure 11.19; Basole, White, & Fitzpatrick, 2003). This result is contrary to what would be expected if the orientation of stimuli were represented simply by the activity of neurons selective for a given orientation. Although the finding illustrated in Figure 11.19 can be rationalized in several different ways, it raises doubts about the idea that receptive field properties are directly linked to perception.
Other observations have also challenged the conventional concept of receptive field properties by showing that the context of particular stimulus features modulates the relevant neuronal responses in a variety of ways. It is now generally recognized that the response properties of visual cortical neurons are influenced, often markedly, by stimuli presented outside the region of visual space that has traditionally defined the extent of a neuron’s receptive field (reviewed in Fitzpatrick, 2000; Worgotter & Eysel, 2000). For instance, the response of orientation-selective cells in V1 to a moving bar is suppressed in varying degrees by the presence of moving bars outside the receptive field, even though the neurons show no response when the stimulus outside the field is presented alone (Knierim & Van Essen,
[Figure 11.18 axes. (A) x-axis: length of line projection (pixels), 0–250; y-axis: cumulative probability of occurrence of the physical sources, 0–1; curves for θ = 0°, 10°, 20°, and 90°. (B) x-axis: orientation of line projection (deg), 0–180; y-axis: percentile rank (%), 10–25.]
Figure 11.18 The statistical relationship of oriented lines projected on the retina and their physical sources predicts the way observers see line lengths as a function of their orientation in the retinal image. A: Cumulative probability distributions of the occurrence of physical sources of differently oriented lines. B: Perception of line length as a function of orientation predicted by the data in (A). (After Howe & Purves, 2005)
Figure 11.19 Evidence that the same pattern of cortical activity in V1 can be elicited by different stimuli (the experimental animal in this case was a ferret). The optical imaging technique used here monitors cortical activity by virtue of activity-dependent changes in the light reflected from the primary visual cortex (the dark areas are more active; the view is looking down on V1, which has been surgically exposed). A: The same pattern of neuronal activity can be elicited by either of the two different stimuli in (B). B: Examples of stimuli comprising differently oriented line segments moving in different directions at different speeds; both elicited the pattern of activity shown in (A). (After Basole et al., 2003.)
1992). These findings are not particularly surprising, considering that neurons at different levels of the primary visual pathway receive a majority of their synaptic inputs from other neurons at the same level and/or feedback from neurons at higher levels of processing. They cast doubt, however, on the notion that a representation of the retinal image is in any sense reconstructed at some level of the visual system based on the combined receptive field properties of the relevant neurons. Several findings add to misgivings about the conventional interpretation of receptive field properties that have been expressed for some time now (e.g., Lennie, 1998). These countervailing observations about receptive fields should not be taken to mean that the evidence illustrated in Figure 11.5 is in any sense wrong. Rather, they simply imply that conventional thinking about the relationship between classical physiology and perception is incomplete. Finally, observations about how the visual cortex represents the perception of size imply that cortical processing tracks perceptions rather than stimulus features as such (Figure 11.20). In this case the investigators took advantage of the retinotopic organization of the primary visual cortex and used fMRI to ask whether the area activated by a stimulus of a particular size corresponded to the actual size of the object in the retinal image or its perceived size. The active area in V1 appeared to track perceived rather than actual size, suggesting that cortical processing even at this initial stage is more closely related to perception than to the retinal representation of form.
PERCEPTION OF DEPTH
A fourth basic quality of vision is the perception of depth, that is, the perception of a three-dimensional world from
Figure 11.20 (Figure C.14 in color section) Representation of object size in the primary visual cortex. The two panels on the left show the stimulus used: the two checkered balls are the same size, but the one depicted as being farther away looks larger than the one depicted as being closer. A: Inset showing the occipital region examined in the fMRI study. B: Flickering stimulus used to define the region activated in the primary visual cortex of each subject. C: The numbers corresponding to the area of primary visual cortex activated in fMRI images by balls of different actual sizes. The results in several subjects indicated that the active region in V1 varied with the perceived size of the balls in the stimulus rather than their actual size in the stimulus. (After Murray et al., 2006)
two-dimensional retinal images. Some aspects of depth are derived from information in the view of one eye alone, whereas another aspect is apparent only when both eyes are used together. Thus depth perception is generally discussed in terms of its monocular and binocular components.
Monocular Depth Perception
Monocular depth perception (the sense of three dimensionality when looking at the world with one eye closed) presumably derives from experience with the arrangement of objects in space. The most obvious fact learned from such experience is occlusion: When part of one object is obscured by another, the obstructing object is always closer to the observer than the obstructed object. Another universal experience pertinent to depth is the relationship of size and distance: a projection of the same object occupies progressively less space on the retina the farther away it is, thus providing additional information about depth (and defining perspective). Additional monocular depth information comes from motion parallax. When the position of the observer changes (by moving the head from side to side, for instance), the position of the background with respect to an object in the foreground changes more for nearby objects than distant ones. Finally, the fainter and fuzzier appearance of distant objects as a result of the Earth’s atmosphere
(referred to as aerial perspective) provides a further empirical indication of how far away things are. Moreover, because the atmosphere absorbs more long than short wavelength light (the interposed medium is effectively “sky”), distant objects also look bluer compared to their appearance nearby, as landscape artists know well. That monocular information about depth is largely learned accords with the fact that human infants do not at first appreciate depth (newborn rhesus monkeys, however, do see depth quite well, as evidenced by skillful behavior as they leap from perch to perch). All of us gradually discover that more distant objects are often occluded, are smaller in appearance, tend to change position less with respect to the background when we move, and look fainter, fuzzier, and bluer. Since we are only dimly aware of these issues, if at all, it follows that the incorporation of this sort of information is unconscious, in keeping with the general idea that percepts are continually shaped by feedback from behaviors that work.
Binocular Depth Perception
A quite different sort of information about the arrangement of objects in space is available when scenes are viewed with both eyes. Binocular information about depth is called stereopsis and arises from the fact that the eyes are separated horizontally across the face by an average distance of 65 mm
in adult humans, giving each eye a slightly different view of the same nearby objects (Figure 11.21A). This difference in the two images is called retinal disparity. The behavioral significance of stereo vision can be appreciated by comparing the difficulty of bringing the points of two pencils together in the frontal plane using both eyes and then only one. Making the tips of the pencils touch (or performing other, more consequential tasks) is much easier in binocular view. Other animals with frontally located eyes enjoy the same advantage in depth perception, and most mammals have some stereoscopic ability in the region of binocular overlap (the human overlap is about 140˚, whereas walleyed animals like horses have only about 15˚ of overlap). As the English physicist and vision scientist Charles Wheatstone showed by his invention of the stereoscope in the 1830s, the greater behavioral success with both eyes open in this and other tasks involving manipulation, and the corresponding enrichment in the perceptual sense of depth, arises from a “fusion” in visual perception of the somewhat different views of the two eyes (Figure 11.21B;
Wheatstone, 1838). Wheatstone also pointed out that stereoscopic information is limited to viewing objects relatively near the observer. The differences in the views of the right and the left eyes illustrated in Figure 11.21A decrease progressively as the lines of sight of the two eyes become increasingly parallel, causing the binocular disparity of objects in the image plane to eventually fall below the resolving power of the visual system. For all practical purposes, stereoscopy adds little to the success of visually guided behavior for objects more than a few meters away, and presumably evolved for its advantages in near tasks.
Random Dot Stereograms
Many studies of stereoscopic vision have used random dot stereograms (RDSs; Figure 11.22). In addition to their intrinsic interest, RDSs continue to fascinate because of basic issues they raise, in particular how locally randomly arranged right and left eye information can be put together
Figure 11.21 The different views of the two eyes. A: Viewing any nearby object with one eye and then the other makes obvious the difference in the views of the two eyes. B: The consequences for generating sensations of depth can be demonstrated with a stereoscope. If two pictures of a scene are taken from slightly different angles, then looking at the 2-D images binocularly produces a strong sensation of depth that is not present when the same images are viewed with one eye or the other, or identical images observed with both eyes. (From Purves & Lotto, 2003)
Figure 11.22 Random dot stereograms and their construction. A–C: Construction and perceptual result of shifting a set of random dots to the left in the view of the right eye. D–F: Construction and perceptual result of shifting a set of random dots to the left in the view of the left eye. The diagrams of the resulting percepts in (C) and (F) assume that the observer is fusing the images “divergently” (i.e., by looking through the page). (From Purves & Lotto, 2003)
by the brain. These intriguing stimuli were introduced about 50 years ago and adapted for experimental work in vision by the late psychologist Bela Julesz working at Bell Labs (Julesz, 1995). Although the idea had been discovered some years earlier, the technique became widespread when Julesz showed how RDSs could be easily made and manipulated by a computer. RDSs are essentially stereograms of an object camouflaged so completely with respect to its background that the target can be seen only when the two monocular components of the random dot pair are viewed binocularly. Such stimuli thus eliminate all monocular depth cues (e.g., occlusion, perspective, motion parallax), and/or cognitive information (e.g., prior recognition of an object) that might surreptitiously affect neural processing specific to binocular depth perception. Figure 11.22 shows how such stimuli are typically made. A target object (a square in this case) within a field of randomly generated black and white “dots” (each comprising a few pixels on a computer screen) is selected and shifted a fraction of a degree over the background in one member of the stereo pair (the corresponding set of dots in the view of the other eye remains
in place). The gap created by the shifted set is then filled in with additional random dots; note also that another set on the other side of the shifted set has been covered up in this process. As a result of this manipulation, the shifted square appears to be in front of or in back of the background array (depending on whether the shift was to the left or right) when the left and right random dot arrays are fused. Many people can, with a little practice, carry out such fusion by looking “through” the plane of the printed page (alternatively, the two components can be viewed in a stereoscope of the sort shown in Figure 11.21B). The perception of depth in response to RDSs is actually not as mysterious as it seems: the shifted pattern of dots in the two eye views simply mimics what would be seen if an object in this spatial arrangement were perfectly camouflaged by the texture of the background, and many natural situations approach this condition. The experience of looking at RDSs that lack an obvious frame (as in the “autostereograms” found in popular books and posters of such stimuli) suggests that the visual system determines how to put the two eye views together by trial and error.
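The construction steps described above can be sketched in code. The following Python example is a minimal illustration that uses one array cell per "dot" and a whole-column shift instead of a fraction of a degree; the function name `make_random_dot_stereogram` and all parameter values are assumptions made for the example, not details of the original stimuli.

```python
import random


def make_random_dot_stereogram(size=20, square=8, shift=2, seed=0):
    """Return (left, right) dot arrays; a central square is shifted left in the right view."""
    rng = random.Random(seed)
    # Left-eye view: a field of random black (0) and white (1) dots.
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    # Right-eye view starts as an exact copy of the left-eye view.
    right = [row[:] for row in left]
    first = (size - square) // 2          # top row and left column of the target square
    for r in range(first, first + square):
        # Shift the target dots `shift` columns to the left, covering background dots.
        for c in range(first, first + square):
            right[r][c - shift] = left[r][c]
        # Fill the gap left behind the shifted square with fresh random dots.
        for c in range(first + square - shift, first + square):
            right[r][c] = rng.randint(0, 1)
    return left, right


left_view, right_view = make_random_dot_stereogram()
for left_row, right_row in zip(left_view, right_view):
    # Print the two views side by side; "#" marks a black dot and "." a white dot.
    print("".join("#." [d] for d in left_row), " ", "".join("#."[d] for d in right_row))
```

Neither array contains a visible square on its own, because every cell is an independent random dot. The target exists only in the relationship between the two views, which is the property that removes monocular cues from such stimuli.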
Binocular Neural Processing
The fact that stereopsis depends on retinal disparity implies that the visual system must in some way compare the loci on the two retinas that are stimulated by light rays arising from the same points in visual space (Howard & Rogers, 1995). This idea is supported by the fact that many neurons in both the primary and extrastriate visual cortex of experimental animals have receptive fields that are “tuned” to specific disparities (Figure 11.23). This evidence, together with the knowledge that stereopsis can be elicited by random dot stereograms, suggests that the perception of binocular depth is generated by neural computations of the disparity at corresponding retinal points. Although this explanation is eminently logical, understanding how the nervous system implements the postulated geometrical comparison of a stereo pair has been a difficult challenge. There is as yet no agreement about an algorithm that could accomplish this feat. Nor is it agreed whether this interpretation adequately explains the further aspects of binocular vision: cyclopean fusion and binocular rivalry. Although we normally view nearby objects with both eyes open, and thus process two appreciably different retinal images (see Figure 11.21), the perceived image of the nearby world is clearly a unified one (remember that for distant scenes the two retinal images are identical). Thus, what observers see in binocular view seems to have been generated by a single eye in the middle of the face, a subjective experience referred to as cyclopean vision. This union of two quite different monocular views into a coherent cyclopean percept is taken for granted. Yet like many other aspects of vision it presents a deep puzzle: How are the two independent views of any nearby
scene conjoined to create a single percept having qualities (including stereoscopic depth) that are not present in the view of either eye alone? Most explanations of this puzzle depend on the fact that inputs from the two eyes converge on cortical neurons in the primary visual cortex (Figure 11.24). Although right and left eye inputs are kept apart in the thalamus and in cortical layer IV in V1 (which receives the afferents from the lateral geniculate nucleus; see previous discussion), many neurons in the deeper and more superficial cortical layers in the primary visual cortex of primates are binocularly driven. The prevalence of binocular cells in the primate visual cortex suggests that cyclopean vision arises from this demonstrable conjunction of right and left eye inputs at the level of common target cells in the visual cortex. Despite this attractive anatomical and physiological substrate for a perceptual union of the two monocular streams, the idea of “seeing” a cyclopean image by virtue of binocular neurons in the visual cortex, at least in any simple sense, is inconsistent with other evidence, the phenomenon of binocular rivalry in particular (Figure 11.25). Binocular rivalry refers to the fact that when a particular stimulus pattern (e.g., vertical stripes) is presented to one eye and a strongly discordant pattern (e.g., horizontal stripes) to the other, the same region of visual space is perceived to be alternately occupied by vertical stripes or horizontal stripes, but rarely (and only transiently) by both. If information from the two eyes were simply united in the visual cortex, observers would presumably see some stable integration of vertical and horizontal stripes in response to such stimuli (a grid in the most simplistic interpretation; see Figure 11.25A). 
Moreover, work by Randolph Blake and Nikos Logothetis (2002) has shown that it is not always the images on the two retinas that rival: at least in some circumstances it is the percepts themselves that seem to be the source of the competition, consistent with the idea that cortical activity is more concerned with percepts than image features (Figure 11.25B). There has been no consensus about the basis of binocular fusion and rivalry; how the visual system processes and unites the views of the two eyes is not yet understood.
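To make the idea of a "geometrical comparison of a stereo pair" concrete, the sketch below estimates disparity by sliding a patch from one eye's view across the other view and keeping the best match. This is an engineering illustration only: as noted above, there is no agreement about the algorithm the visual system uses, and nothing here is claimed to model neural processing. The names `toy_stereo_pair` and `estimate_disparity`, the patch coordinates, and the search range are assumptions made for the example.

```python
import numpy as np

def toy_stereo_pair(shift=3, seed=1):
    """Hypothetical stereo pair: a central square of random dots displaced by `shift` pixels."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 2, size=(60, 60))
    right = left.copy()
    right[20:40, 20 + shift:40 + shift] = left[20:40, 20:40]
    return left, right

def estimate_disparity(left, right, rows, cols, max_shift=8):
    """Return the horizontal shift at which a left-image patch best matches the right image.

    The match score is the fraction of dots that agree between the patch
    and the candidate window.
    """
    r0, r1 = rows
    c0, c1 = cols
    patch = left[r0:r1, c0:c1]
    scores = {}
    for s in range(-max_shift, max_shift + 1):
        lo, hi = c0 + s, c1 + s
        if lo < 0 or hi > right.shape[1]:
            continue  # candidate window would fall outside the image
        scores[s] = np.mean(patch == right[r0:r1, lo:hi])
    return max(scores, key=scores.get)

left_view, right_view = toy_stereo_pair()
square_disparity = estimate_disparity(left_view, right_view, rows=(20, 40), cols=(20, 40))
background_disparity = estimate_disparity(left_view, right_view, rows=(0, 15), cols=(20, 40))
```

A brute-force search of this kind works for a clean synthetic pair, but it says nothing about fusion or rivalry, which is part of why the explanation in terms of disparity computation alone remains incomplete.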
PERCEPTION OF MOTION
[Figure 11.23 labels: neural activity plotted against distance of stimulus relative to point of fixation (near, fixation point, far); curves labeled tuned excitatory, near, and far]
Figure 11.23 Disparity tuning in visual cortical neurons. Electrophysiological recording of the activity of single neurons in V1 of cats and monkeys shows that many cells respond selectively to binocular stimuli that have different disparities, leading to the concept of “near” and “far” cells. (After Poggio et al., 1995)
The final perceptual quality generated by visual processing considered here is motion, defined as the subjective experience elicited when a sequence of different but related images is presented to the retina over a brief span of time (physical motion can be either too fast or too slow to elicit the perception of motion; we don’t see the trajectory of a bullet or the motion of the hour hand on a clock). Much as a perceptual category like color comprises the subsidiary qualities of
[Figure 11.24 labels: left retina; right retina; lateral geniculate nucleus (thalamus), layers 1 to 6; primary visual cortex, cortical layers I to VI; monocular cells; binocular cells]
Figure 11.24 (Figure C.15 in color section) The anatomical conjunction of the two monocular streams of visual information in visual cortex. Inputs related to the right and left eyes first come together in the primary visual cortex, where half or more of the neurons in rhesus monkeys can be activated by a stimulus presented to either the left or right eye. Note that the afferents related to the two eyes remain segregated at the level of the lateral geniculate nucleus in the thalamus and in the right eye/left eye cortical modules in layer IV illustrated in Figure 1; binocularly driven cells are found only above (and below; not shown) this thalamic input layer. (From Purves & Lotto, 2003)
[Figure 11.25 labels: (A) monocular stimuli (left eye, right eye) and binocular percept; (B) vertical axis: spikes s⁻¹ (0 to 40)]
Figure 11.25 Binocular rivalry. A: The phenomenon of binocular rivalry illustrated with vertical lines presented to the left eye and horizontal lines to the right eye. A grid pattern is not seen, indicating the views of the two eyes are not simply brought together by the activity of binocular neurons in the visual cortex. B: Electrophysiological recordings from individual visual cortical neurons in a monkey trained to report whether he was aware of the left or right eye image in a rivalry paradigm. The neuron shown in this example is active only when the right eye view was perceived (red bars). This result indicates that percepts compete in binocular rivalry. (B is from Blake & Logothetis, 2002)
hue, saturation, and brightness, motion percepts entail the perception of speed and the perception of direction.
Evidence for Dedicated Motion Processing Areas
Just as more or less specific regions of the visual association cortices emphasize the processing of color (V4 and related areas), form (V1 and related areas), and depth (also V1 and related areas), so particular regions in the primate brain are especially concerned with motion processing. These regions, in the occipital and posterior temporal lobes, are called MT (for middle temporal) and MST (for medial superior temporal) (see Figure 11.2; recall that these regions are called MT+ in humans to indicate that the MST component is less well defined in humans than in monkeys, in which most motion studies have been done). That these areas are specialized for motion processing was first determined by single unit recording in monkeys carried out in the 1970s and 1980s, which showed that many more cells within these areas are responsive to image sequences than cells in other visual cortical regions (Maunsell & Van Essen, 1983). Noninvasive brain imaging has shown that the same general areas are active in humans viewing motion stimuli. The neurons in MT and MST in monkeys receive input from motion-sensitive cells in V1 and are arranged in columnar modules that have the same preference for oriented motion stimuli. Moreover, they respond to motion over large regions of a visual scene, a finding that accords with the large receptive field sizes of other neurons in extrastriate regions (see earlier discussion). Evidence that the activity of MT neurons is closely related to motion percepts has been provided by studies by William Newsome and his collaborators (Figure 11.26; Newsome, Britten, & Movshon, 1989; Sugrue, Corrado, & Newsome, 2005). Rhesus monkeys were shown a display of dots moving in different directions. If a sufficient proportion of the dots move coherently, humans or monkeys perceive an overall direction of motion in the display (e.g., rightward).
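The logic of such a display can be sketched as follows. This is an illustrative simplification under stated assumptions, not the stimulus code used in the experiments: the function names `dot_directions` and `net_motion` are hypothetical, every dot is assumed to move at the same unit speed, and "coherence" is taken to be the fraction of dots sharing the signal direction.

```python
import numpy as np

def dot_directions(n_dots, coherence, signal_direction, rng):
    """Assign a direction (radians) to each dot.

    A fraction `coherence` of the dots shares `signal_direction`;
    the remaining dots move in independent random directions.
    """
    n_signal = int(round(coherence * n_dots))
    directions = rng.uniform(0.0, 2.0 * np.pi, size=n_dots)
    directions[:n_signal] = signal_direction
    return directions

def net_motion(directions):
    """Mean displacement vector (x, y) of the dots, each moving at unit speed."""
    return np.array([np.cos(directions).mean(), np.sin(directions).mean()])

rng = np.random.default_rng(0)
for coherence in (0.0, 0.1, 0.5, 1.0):
    dirs = dot_directions(5000, coherence, signal_direction=0.0, rng=rng)
    print(coherence, net_motion(dirs))
```

With rightward signal dots (direction 0), the horizontal component of the mean displacement grows roughly in proportion to the coherence, while at zero coherence the random directions largely cancel. Varying this one parameter is what lets an experimenter grade the strength of the motion signal.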
As indicated in the figure, monkeys can be trained to move their eyes in the direction of the movement of the dots. While a monkey trained in this way performed the task, action potentials were recorded from MT neurons. The recordings showed that the activity of single neurons was often correlated with the direction of dot motion. Indeed, the activity of neurons in the population was sometimes a better predictor of the direction of dot motion than the behavior of the monkey (i.e., its eye movements). To show that MT neurons play a causal role in such perceptual discriminations, it would be necessary to manipulate neuronal activity directly and then observe changes in behavior. To test this point, Newsome
[Figure 11.26 labels: (A) perceptual discrimination task (motion: left or right?); (B) linking behavior to perception, plotting % left choices against stimulus strength (left)]
Figure 11.26 Relating motion sensitive neurons in MT to motion percepts. A: In this experiment a rhesus monkey was trained to report whether he perceived rightward or leftward motion in response to a pattern of moving dots (the report was made by shifting his eyes to either the right or left target). B: By changing the amount of coherent motion in the moving dot pattern, a psychophysical function was obtained that plots perceptual accuracy against the amount of motion coherence among the dots. Electrical stimulation of small populations of MT neurons (not shown) shifted this curve in a systematic way, showing that the activity of these neurons can influence motion perception. (After Sugrue et al., 2005)
and colleagues identified MT neurons that showed selective activity for a particular direction of motion. They then stimulated the neurons electrically. For about half of the electrode locations, such microstimulation increased the probability that the monkeys would move their eyes in the direction consistent with the directionally selective receptive field properties of the stimulated neurons. Like the evidence for the importance of V4 in color, the importance of these extrastriate temporal areas for motion processing has been underscored by a “motion-blind” patient (Zihl, von Cramon, & Mai, 1983). The patient is a 43-year-old woman known as LM who suffered a vascular lesion that caused bilateral damage in the general region of the MT+ motion areas. Although the lesion resulted in several neurological problems, a striking feature of her case was difficulty perceiving motion. She had difficulty following speech because she couldn’t pick up mouth movement cues, and was hesitant crossing a street because
she couldn’t judge the movement of cars. Interestingly, LM is nonetheless able to perceive certain kinds of motion. For instance, when lights are attached to the key joints of the body and human movements are observed in the dark, she can distinguish different types of common human movements such as walking. Consistent with this clinical evidence, transcranial magnetic stimulation of MT+ in normal human subjects can also interfere with motion percepts. Taken together, this evidence for specialized motion processing areas accords with the concept of a motion processing stream that conveys information from the magnocellular pathway that begins in the retina and is evident in the magnocellular layers of the thalamus and in V1. These areas are also components of the more broadly defined dorsal pathway concerned with object location and action, which contrasts with a ventral pathway that is more concerned with object recognition (see Figure 11.3).
Some Problems Understanding Motion Perception
Despite these advances, how motion percepts are generated by neural processing is far from understood. Because the movement of objects in three-dimensional space is projected onto the two-dimensional retinal surface, the changes in position that uniquely define motion in physical terms are always uncertain with respect to the possible sources of the retinal image sequence (see also Figure 11.16). A much-studied example that makes this point is the perception of a moving rod seen through an aperture that renders its ends invisible (Wallach, 1935; Wuerger, Shapley, & Rubin, 1996). As illustrated in Figure 11.27, the number of combinations of speeds and directions that could have given rise to the sequence of images falling on the retina in this situation is infinite. The challenge of explaining how the visual system generates quite definite perceptions of speed and direction in response to such stimuli is called the aperture problem, and it remains to be solved.
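The geometry behind this ambiguity can be made explicit with a short sketch. It assumes an idealized straight line of unlimited length, a two-dimensional image plane with the y axis pointing up, and hypothetical function names (`line_axes`, `normal_component`, `consistent_velocity`). Only the component of velocity perpendicular to the line changes the image inside the aperture; any amount of sliding along the line leaves the image sequence unchanged.

```python
import numpy as np

def line_axes(line_angle_deg):
    """Unit tangent and unit normal of a line at the given orientation (degrees)."""
    theta = np.deg2rad(line_angle_deg)
    tangent = np.array([np.cos(theta), np.sin(theta)])
    normal = np.array([np.sin(theta), -np.cos(theta)])  # perpendicular to the tangent
    return tangent, normal

def normal_component(velocity, line_angle_deg):
    """The part of a velocity that is visible through the aperture."""
    _, normal = line_axes(line_angle_deg)
    v = np.asarray(velocity, dtype=float)
    return np.dot(v, normal) * normal

def consistent_velocity(velocity, line_angle_deg, slide):
    """Another physical velocity producing the same image sequence.

    `slide` is an arbitrary speed along the line, which is invisible
    when the ends of the line are hidden.
    """
    tangent, _ = line_axes(line_angle_deg)
    return normal_component(velocity, line_angle_deg) + slide * tangent

# A line tilted at 45 degrees that is physically moving horizontally to the right.
true_velocity = (1.0, 0.0)
seen = normal_component(true_velocity, 45.0)
seen_speed = np.linalg.norm(seen)
seen_direction_deg = np.degrees(np.arctan2(seen[1], seen[0]))
```

For this example the visible component points down and to the right, about 45 degrees away from the true horizontal direction and at a lower speed, which corresponds to the percept described for Figure 11.27. The sketch shows only why the stimulus is ambiguous; it does not explain why observers settle on one of the infinitely many consistent velocities.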
A further challenge in understanding motion percepts is the sense of entirely realistic motion generated from a series of static images, a phenomenon called apparent motion. The simplest stimulus sequence that could be used to study this phenomenon is the presentation of just two sequential flashed spots of light, which is what Max Wertheimer did nearly a century ago (Figure 11.28A; Wertheimer, 1912). For a spatial interval of one or a few degrees, Wertheimer found that if the temporal interval is less than ~20 ms, the two spots of light appear to come on simultaneously and no motion is seen; at the other extreme, if the interval is greater than ~450 ms, the two lights appear to come on sequentially and no motion is seen. Between these limits, subjects perceived some form of motion, the most “realistic” motion being in the middle of this range. The motion
[Figure 11.27 labels: horizontally moving line; circular aperture; actual direction of motion; possible directions of motion for part of line in frame moving downwards; perceived direction of motion (discrepancy of ~45°)]
Figure 11.27 The inherent ambiguity of motion stimuli. The stimulus sequence elicited by a rod moving behind an aperture can be generated by an infinite number of directions of physical motion, each associated with a different speed. Imagine, for example, that the linear object in the aperture is moving horizontally from left to right. The same stimulus sequence could have been generated by any of the directions of physical movement indicated by the other arrows around a limiting hemisphere, each coupled with an appropriate speed. In the absence of the aperture, such a line appears to be moving horizontally from left to right at a particular speed. The moment the aperture is applied, however, the line appears to be moving downward and to the right at a slower speed (arrow in circle). (From Purves & Lotto, 2003)
elicited by such stimuli is the basis of movies and video, in which static images are presented at a high frame rate (24/s in movies; video frames are refreshed one line at a time such that the whole picture changes ~30 times each second, but the general idea is the same). Other intriguing percepts occur if additional lights are added to the simple sort of pattern studied by Wertheimer. For instance, if a quartet of lights is used, the apparent motion seen is horizontal and not diagonal, even though there is no obvious prohibition against seeing diagonal motion (Figure 11.28B). Explanations of apparent motion have tended to invoke rules or principles called heuristics that the visual system supposedly employs to guide perceptual “interpretations,” an approach derived from the gestalt school of psychology that Wertheimer founded. The basis of apparent motion is, however, unresolved and raises the general issue of how the visual system (or the brain more generally) parses visual information over time. The ongoing debate over how to rationalize perceived motion in the face of these problems is well beyond the scope of this chapter. In general, however, most theoretical explanations have been based on mathematical models of motion energy or on other nonlinear spatio-temporal
[Figure 11.28 panels: (A) two lights, 1 and 2, flashed in sequence; (B) a quartet of lights labeled 1 and 2]
Figure 11.28 Apparent motion. A: When two fixed lights separated by an appropriate distance are turned on and off at an interval greater than 20 ms but less than several hundred ms, observers see the first light (1) moving to the position of second light (2). B: Manipulating these variables, and varying the number of lights involved, elicits a variety of motion effects that are difficult to explain, such as why diagonal motion is not seen in this example.
filtering mechanisms. Another approach more in keeping with the approach that can rationalize aspects of brightness, color, and form is to suppose that the perceived directions and speeds of the moving objects are empirically determined by accumulated information about the possible sources of the inherently ambiguous stimuli. At present, however, there is no consensus about the strategy the visual system uses to generate motion percepts.
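As a rough illustration of what a nonlinear spatio-temporal filtering model looks like, the sketch below implements a toy delay-and-multiply (Reichardt-type) correlator and applies it to a two-flash stimulus of the kind used in apparent motion studies. This is a deliberately simplified stand-in chosen for brevity, not the motion energy model itself and not a model endorsed by the chapter. The function names, the 1 ms sampling step, the 10 ms flash duration, and the 40 ms delay are all hypothetical.

```python
import numpy as np

def flash(n_samples, onset, duration):
    """Light intensity over time at one location: 1 during the flash, 0 otherwise."""
    signal = np.zeros(n_samples)
    signal[onset:onset + duration] = 1.0
    return signal

def delayed(signal, delay):
    """Shift a signal later in time by `delay` samples, padding with zeros."""
    if delay == 0:
        return signal.copy()
    out = np.zeros_like(signal)
    out[delay:] = signal[:-delay]
    return out

def correlator_response(left, right, delay):
    """Opponent delay-and-multiply output.

    Positive values signal left-to-right motion, negative values
    right-to-left motion, and zero no directional signal.
    """
    rightward = np.sum(delayed(left, delay) * right)
    leftward = np.sum(delayed(right, delay) * left)
    return rightward - leftward

# Two flashes, one sample per millisecond (an assumption of this sketch).
first = flash(600, onset=50, duration=10)
second = flash(600, onset=90, duration=10)   # 40 ms after the first
response = correlator_response(first, second, delay=40)
```

A detector of this kind responds only when the interval between the flashes is close to its internal delay, and it gives no output for simultaneous flashes or for flashes separated by a much longer interval. That qualitative behavior resembles the temporal window for apparent motion, but the sketch offers no account of effects such as the preference for horizontal over diagonal motion in the quartet display.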
PERCEPTION OF OBJECTS
The focus of the chapter has been on visual perceptual qualities as such: brightness, color, form, depth, and motion. No understanding of more complex percepts can emerge without first understanding these fundamental characteristics of visual percepts. It is obvious, however, that when we look at the world we see objects defined by these qualities. Moreover, objects that have particular significance for us (for example, human faces or symbols like letters and numbers) are much more carefully inspected and attended to than other classes of objects. As already indicated, the recognition of objects by means of vision involves the ventral visual processing stream that eventually leads to the temporal lobe (see Figure 11.3). The region of the temporal lobe that supports object recognition is not uniform but is to some degree regionally specialized. Thus there is a relatively specific region on the inferior aspect of the temporal lobe where many neurons are responsive to faces (called the fusiform face area; Figure 11.29), another region that is involved in processing information about animals, another concerned with inanimate objects such as tools or houses, and still others concerned with recognizing words (Kanwisher, 2006). It is of course unlikely that every category of object we see has a dedicated area of temporal cortex underpinning recognition; thus how best to think about the organization of this part of the temporal lobe remains controversial. Similarly, whether object recognition entails the appreciation of significant parts (e.g., eyes, nose, mouth), a more global integration, or some combination of these is much debated. Nonetheless, lesions of the inferotemporal cortex can clearly impair the ability to perform recognition tasks, sometimes quite selectively. Given that we perceive objects as unitary entities characterized by a variety of sensory attributes (size, weight, shape, color, texture, and so on), another issue is how these sensory qualities are brought together. This problem is generally referred to as the “binding problem” (Marr, 1982). There are several frameworks for thinking about a possible solution. One possible answer is fundamentally anatomical: the union of perceptual qualities could be achieved by the convergence of information about various sensory properties in higher-order neurons whose activity would then represent a conjunction of the various qualities involved.
This perspective is predicated on the idea that neurons representing percepts are at the apex of a processing pyramid. There are, however, logical obstacles to this interpretation, as already mentioned in discussing binocular fusion. Most neuroscientists have concluded that only the activity of a fairly large population of cells located in different brain regions could accomplish this feat. But if a dispersed population is involved, then how does the activity of this cohort of nerve cells become associated with the specific object in question? Some investigators have suggested that synchronized oscillatory activity among the relevant cortical neurons might serve this function. Another proposal is that the solution could lie in a rapid transition of attention to the activity of the various neurons representing different object qualities, the perception of unity being a result of this rapid transitioning. A more radical possibility is that neither physiological nor anatomical union is necessary.
[Figure 11.29 labels: MR signal change (%) plotted against time (s) for the face area and white matter; right (R) and left (L) sides marked]
Figure 11.29 Functional magnetic resonance imaging during a face recognition task. A: Face stimulus presented to a normal subject at time indicated by arrow. Graph shows activity change in the relevant area of the right temporal lobe. B: Location of fMRI activity in right inferior temporal lobe. (Courtesy of Greg McCarthy; from Purves et al., 2004)
Whatever activity existed in the brain at a given moment that was consciously attended would constitute a percept, the binding following more or less automatically without any special mechanism being required. At present, all of these possibilities remain potential solutions to the conceptual puzzle of feature binding.
SUMMARY
A great deal is now known about how information conveyed by light is processed in the primary visual pathway, including the primary visual cortex. How this information is processed in the higher order visual cortices is less well understood, however, and the generation of percepts remains a matter of debate. The basic challenge in understanding the strategy of neural processing in any of these regions is explaining how inherently uncertain retinal stimuli can give rise to definite perceptions and generally successful visually guided behavior. In each category of basic visual qualities (brightness, color, form, depth, and motion) the evidence points increasingly to an empirical strategy of vision as a means of contending with the inverse optics problem. The idea that what we see in response to retinal stimuli is a statistical manifestation of accumulated past experience rather than a logical analysis of the features of the retinal image runs counter to all our intuitions about vision. Nevertheless, the nature of the inverse problem and obvious discrepancies between what we actually see and physical reality are difficult to explain in any other way. The advantage of generating vision in this manner is that over evolutionary time, percepts (and the visual circuitry that underlies them) progressively incorporate the vast amount of information derived from the experience of both the species and the individual. How the receptive field properties of visual neurons and the organization of neuronal populations can be understood in these terms is just beginning to be explored.
REFERENCES
Basole, A., White, L. E., & Fitzpatrick, D. (2003). Mapping multiple stimulus features in the population response of visual cortical neurons. Nature, 423, 986–990.
Blake, R., & Logothetis, N. K. (2002). Visual competition. Nature Reviews Neuroscience, 3, 1–11.
DeValois, R. L., & DeValois, K. K. (1993). A multistage color model. Vision Research, 33, 1053–1065.
Fitzpatrick, D. (2000). Seeing beyond the receptive field in the primary visual cortex. Current Opinion in Neurobiology, 10, 438–442.
Gegenfurtner, K. R. (2003). Cortical mechanisms of color vision. Nature Reviews Neuroscience, 4, 563–572.
Helmholtz, H. L. F. V. (1866/1924). Helmholtz’s treatise on physiological optics. New York: Optical Society of America.
Howard, I. P., & Rogers, B. J. (1995). Oxford psychology series: No. 29. Binocular vision and stereopsis. New York: Clarendon Press.
Howe, C. Q., & Purves, D. (2005). Perceiving geometry: Geometrical illusions explained by natural scene statistics. New York: Springer.
Hubel, D. H. (1988). Eye, brain, and vision (Scientific American Library Series). New York: Freeman.
Hubel, D. H., & Wiesel, T. N. (2005). Brain and visual perception: The story of a 25-year collaboration. Oxford: Oxford University Press.
Julesz, B. (1995). Dialogues on perception. Cambridge, MA: MIT Press.
Kanwisher, N. (2006, February 3). What’s in a face? Science, 311, 617–618.
Knierim, J. J., & Van Essen, D. C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. Journal of Neurophysiology, 67, 961–980.
Kuffler, S. W. (1953). Discharge patterns and functional organization of mammalian retina. Journal of Neurophysiology, 16, 37–68.
Land, E. H. (1986). Recent advances in retinex theory. Vision Research, 26, 7–21.
Lennie, P. (1998). Single units and visual cortical organization. Perception, 27, 889–935.
Livingstone, E. M., & Hubel, D. H. (1988, May 6). Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science, 240, 740–749.
Long, F., Yang, Z., & Purves, D. (2006). Spectral statistics in natural scenes predict hue, saturation, and brightness. Proceedings of the National Academy of Sciences, USA, 103, 6013–6018.
Marr, D. (1982). Vision: A computational investigation into human representation and processing of visual information. San Francisco: Freeman.
Maunsell, J., & Van Essen, D. (1983). Functional properties of neurons in the middle temporal visual area of the macaque monkey. Journal of Neurophysiology, 49, 1127–1147.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. New York: Oxford University Press.
Mollon, J. D. (1995). Seeing colour. In T. Lamb & J. Bourriau (Eds.), Colour: Art and science (pp. 127–150). Cambridge: Cambridge University Press.
Nathans, J. (1987). Molecular biology of visual pigments. Annual Review of Neuroscience, 10, 163–194.
Newsome, W. T., Britten, K. H., & Movshon, J. A. (1989, September 7). Neuronal correlates of a perceptual decision. Nature, 341, 52–54.
Purves, D., Augustine, G. A., Fitzpatrick, D., Hall, W., LaMantia, A. S., McNamara, J. O., & Williams, S. M. (2008). Neuroscience (4th ed.). Sunderland, MA: Sinauer.
Purves, D., Brannon, E. M., Cabeza, R., Huettel, S. A., LaBar, K. S., Platt, M. L., & Woldorff, M. (2008). Principles of cognitive neuroscience. Sunderland, MA: Sinauer.
Purves, D., & Howe, C. Q. (2005). Perceiving geometry: Geometrical illusions explained by natural scene statistics. New York: Springer.
Purves, D., & Lotto, R. B. (2003). Why we see what we do: An empirical theory of vision. Sunderland, MA: Sinauer.
Purves, D., Williams, S. M., Nundy, S., & Lotto, R. B. (2004). Perceiving the intensity of light. Psychological Review, 111(1), 142–158.
Robinson, J. O. (1998). The psychology of visual illusions. New York: Dover. (Original work published 1972)
Rodieck, R. W. (1998). First steps in seeing. Sunderland, MA: Sinauer.
Sacks, O. (1995). An anthropologist from Mars: Seven paradoxical tales. New York: Knopf.
Sacks, O. (1996). The island of the colorblind. New York: Knopf.
Shevell, S. H., & Monnier, P. (2006). Color shifts induced by S-cone patterns are mediated by a neural representation driven by multiple cone types. Visual Neuroscience, 23, 567–571.
Stevens, S. S. (1975). Psychophysics. New York: Wiley.
Sugrue, L. P., Corrado, G. S., & Newsome, W. T. (2005). Choosing the greater of two goods: Neural currencies for valuation and decision making. Nature Reviews Neuroscience, 6, 363–375.
Turner, R. S. (1994). In the eye’s mind: Vision and the Helmholtz-Hering controversy. Princeton, NJ: Princeton University Press.
Ungerleider, L. G., & Haxby, J. V. (1994). “What” and “where” in the human brain. Current Opinion in Neurobiology, 4, 157–165.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Wallach, H. (1935). Über visuell wahrgenommene Bewegungsrichtung. Psychologische Forschung, 20, 325–380.
Wertheimer, M. (1912/1950). Laws of organization in perceptual forms. In W. D. Ellis (Ed. & Trans.), A sourcebook of gestalt psychology (pp. 71–88). New York: Humanities Press.
Wheatstone, C. (1838). Contributions to the physiology of vision: Pt. I. On some remarkable and hitherto unobserved phenomena of binocular vision. Royal Society of London, 128, 371–394.
Worgotter, F., & Eysel, U. T. (2000). Context, state and the receptive fields of striatal cortex cells. Trends in Neurosciences, 23, 497–503.
Wuerger, S., Shapley, R., & Rubin, N. (1996). On the visually perceived direction of motion by Hans Wallach: 60 years later. Perception, 25, 1317–1367.
Zeki, S. M. (1983a). Colour coding in the cerebral cortex: The reaction of cells in monkey visual cortex to wavelengths and colours. Neuroscience, 9, 741–766.
Zeki, S. M. (1983b). Colour coding in the cerebral cortex: The responses of wavelength-selective and colour-coded cells in monkey visual cortex to changes in wavelength composition. Neuroscience, 9, 767–781.
Zeki, S. (1989). A century of cerebral achromatopsia. Brain, 113, 1721–1777.
Zeki, S. (1993). A vision of the brain. Oxford: Blackwell Scientific Publications.
Zihl, J., von Cramon, D., & Mai, N. (1983). Selective disturbance of movement vision after bilateral brain damage. Brain, 106, 313–340.
ADDITIONAL READINGS
Barlow, H. B., & Mollon, J. D. (1982). The senses. Cambridge: Cambridge University Press.
Berkeley, G. (1709/1976). A new theory of vision. In M. R. Ayers (Ed.), Everyman’s library. London: Everyman/J.M. Dent.
Bouvier, S. E., & Engel, S. A. (2006). Behavioral deficits and cortical damage in cerebral achromatopsia. Cerebral Cortex, 16, 183–191.
Cornsweet, T. N. (1970). Visual perception. New York: Academic Press.
Courtney, S. M., & Ungerleider, L. G. (1997). What fMRI has taught us about human vision. Current Opinion in Neurobiology, 7, 554–561.
Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in primate cerebral cortex. Cerebral Cortex, 1, 1–47.
Goodale, M. A., & Humphrey, G. K. (1998). The objects of action and perception. Cognition, 67, 179–205.
Horton, J. C. (1992). The central visual pathways. In W. M. Hart (Ed.), Adler’s physiology of the eye (pp. 728–772). St. Louis, MO: Mosby Yearbook.
Howe, C. Q., Lotto, R. B., & Purves, D. (2006). Empirical approaches to understanding visual perception. Journal of Theoretical Biology, 241, 866–875.
Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. Journal of Physiology, 160, 106–154.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215–243.
Hubel, D. H., & Wiesel, T. N. (1977). Functional architecture of macaque monkey visual cortex. Proceedings of the Royal Society, 198, 1–59.
Kersten, D. (2000). High-level vision as statistical inference. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences (pp. 353–363). Cambridge, MA: MIT Press.
Knill, D. C., & Richards, W. (1996). Perception as Bayesian inference. New York: Cambridge University Press.
Mountcastle, V. B. (1957). Modality and topographic properties of single neurons of cat’s somatic sensory cortex. Journal of Neurophysiology, 20, 408–434.
Mountcastle, V. B. (1998). Perceptual neuroscience: The cerebral cortex. Cambridge, MA: Harvard University Press.
Murray, S. O., Boyaci, H., & Kersten, D. (2006). The representation of perceived angular size in primary visual cortex. Nature Neuroscience, 9, 429–434.
Poggio, G. F. (1995). Mechanisms of stereopsis in monkey visual cortex. Cerebral Cortex, 3, 193–204.
Poggio, G. F., & Poggio, T. (1984). The analysis of stereopsis. Annual Review of Neuroscience, 7, 379–412.
Pouget, A., Dayan, P., & Zemel, R. S. (2003). Inference and computation with population codes. Annual Review of Neuroscience, 26, 381–401.
Rao, R. P. N., Olshausen, B. A., & Lewicki, M. S. (Eds.). (2002). Probabilistic models of the brain: Perception and neural function. Cambridge, MA: MIT Press.
Revonsuo, A., & Newman, J. (1999). Binding and consciousness. Consciousness and Cognition, 8, 123–127.
Rock, I. (1984/1995). Perception. New York: Freeman.
Sakmann, B., & Creutzfeldt, O. D. (1969). Scotopic and mesopic light adaptation in the cat’s retina. Abteilung für neurophysiologie, Pflügers Arch, 313, 168–185.
Salzman, C. D., Britten, K. H., & Newsome, W. T. (1990, July 12). Cortical microstimulation influences perceptual judgments of motion direction. Nature, 346, 174–177.
Schiller, P. H. (1997). Past and present ideas about how the visual scene is analyzed by the brain. In K. Rockland, J. Kaas, & A. Peters (Eds.), Cerebral cortex and extrastriate cortex. New York: Plenum Press.
Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., Brady, T. J., Rosen, B. R., & Tootell, R. B. (1995, May 12). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889–893.
Shepard, R. N., & Cooper, L. A. (1992). Representation of colors in the blind, color-blind and normally sighted. Psychological Science, 3, 97–103.
Shimojo, S., Paradiso, M., & Fujita, I. (2001). What visual perception tells us about mind and brain. Proceedings of the National Academy of Sciences, 98, 12340–12341.
Simoncelli, E. P., & Olshausen, B. A. (2001). Natural image statistics and neural representation. Annual Review of Neuroscience, 24, 1193–1216.
Tootell, R. B., Reppas, J. B., Dale, A. M., Sereno, M. I., Malach, R., Brady, T. J., et al. (1995, May 11). Visual motion aftereffect in human cortical area MT revealed by functional magnetic resonance imaging. Nature, 375, 139–141.
Wandell, B. A. (1995). Foundations of vision. Sunderland, MA: Sinauer.
White, M. (1979). A new effect of pattern on perceived lightness. Perception, 8, 413–416.
Yang, Z., & Purves, D. (2004). The statistical structure of natural light patterns determines perceived light intensity. Proceedings of the National Academy of Sciences, USA, 101, 8745–8750.
Handbook of Adolescent Psychology, edited by Richard M. Lerner and Laurence Steinberg. Copyright # 2009 John Wiley & Sons, Inc. c11.indd Sec9:250
7/14/09 5:14:26 PM
Chapter 12
Audition
TROY A. HACKETT AND JON H. KAAS
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Audition, the process of hearing, depends on transforming perturbations in air pressure waves within a limited frequency range into nerve axon potentials by the receptor cells and the associated structures of the inner ear. Axons subserving these receptor cells then conduct these potentials to the auditory nuclei of the brain stem, the three cochlear nuclei, where transformation of the neural signal begins. Further processing occurs over ascending brain stem pathways, involving a number of nuclei, terminating among subdivisions of the inferior colliculus in the midbrain. The subdivisions of the inferior colliculus project to subdivisions of the medial geniculate complex of the thalamus, which in turn relay to areas of auditory cortex. These areas of cortex provide feedback to previous stations, while distributing auditory information more broadly.
The auditory system locates and identifies sounds, and it is especially important in humans as a means of mediating communication via speech. Variations in the organization and elaboration of the auditory system have allowed various species to succeed in a range of environments. For example, modification of the cochlear apparatus in a tunnel-living rodent, the mountain beaver, allows it to detect low-frequency changes in air pressure, possibly so that it can sense conspecifics and predators when they enter its underground tunnels (Merzenich, Kitzes, & Aitkin, 1973). In contrast, the auditory system has been altered in echo-locating bats so that it is sensitive to very high frequencies, especially at and near the frequencies of the sounds that bats emit to reflect off prey and other objects (Dear, Simmons, & Fritz, 1993; Suga, 1990). As a result, bats have become the most successful of mammal orders next to rodents in terms of numbers of extant species.
Other alterations of the auditory system occur more centrally. As a well-known example, animals with large heads have an advantage over animals with small heads in locating the source of sounds, because the mass of the larger head better creates a difference in sound intensity for lateralizing the source of the sound, and the greater distance between the ears produces a greater difference in sound arrival times (Sterbing-D’Angelo, 2007).
Specializations for sound processing also occur at the cortical level. Across sensory systems, mammals with small brains, such as mice, process sensory information across limited arrays of only a few cortical areas. This is because cortical areas need to be large enough to contain a sufficient population of neurons to perform basic sets of functions (Kaas, 2000). For example, primary visual cortex needs to be of a certain minimal size in order to preserve the spatial information of the retinal image (Cooper, Herbin, & Nevo, 1993). Thus, mice and other small-brained mammals process auditory information at the cortical level over only a few cortical areas. However, mammals with large brains are released from this constraint. Some, such as the larger rodents, appear to have simply enlarged cortical areas without adding to the number of areas to any great extent. Others, especially primates, have increased the number of cortical processing stations, including both the number of cortical areas that are mainly involved with processing auditory information, and those areas involved in multisensory and higher-order processing.
The focus of this chapter is on the elaborations and specializations of the auditory system of anthropoid primates, that is, monkeys, apes, and humans. Most of the elaborations of the auditory systems of anthropoids appear to be at the cortical level, in line with changes in other systems. But this, in part, may be because brain stem levels of processing have been studied in only a limited way in primates. Thus, some specializations may be discovered with further investigation. Much of what is known about cortical organization and function in anthropoid primates stems from physiological and anatomical studies of auditory cortex in Old World monkeys, although valuable results also have been obtained from New World monkeys. Very little is known about cortical organization in apes because invasive studies are no longer possible and noninvasive studies, such as fMRI, require some cooperation. What is known is from limited studies of cortical architecture, which can be related to such studies in monkeys where other types of data have been collected. However, a greater understanding of the cortical regions involved in auditory processing in humans is rapidly emerging from fMRI studies. While comparative studies of cortical architecture suggest that the early stages of cortical processing are shared in monkeys and humans, monkeys clearly do not have the elaborations of the auditory system that are found in humans and allow for such proficiency in language.
We start with a brief description of peripheral and brain stem levels of auditory processing, and then move on to thalamic and cortical levels in primates. In the first part of the review, relevant results come from a number of mammalian species, and even from other vertebrates, because many anatomical investigations have concentrated on the auditory systems of cats, rodents, and nonmammals. The features described here are those that exist or are likely to exist in humans and other primates.
EARLY STAGES OF PROCESSING: EXTERNAL, MIDDLE, AND INNER EAR
The auditory system detects oscillations of air pressure as they vary in time. The system is variably sensitive according to species-specific specializations, but generally sensitivity includes a middle range of oscillation frequencies while excluding very low or very high frequencies. Human hearing extends from about 20 cycles per second (20 hertz, or 20 Hz) to around 20,000 Hz. Sensitivity in the higher range is reduced with age as damage to the inner ear accumulates.
Sounds are the consequence of oscillations in air pressure. The auditory system is sensitive to the amplitudes of the oscillations, as well as the frequencies. Amplitude is usually specified by sound pressure level (SPL), expressed on a logarithmic scale in decibels (dB), which covers the large range from the threshold of human hearing at about 0 dB to around 120 dB, where sound becomes painfully loud.
The transformation of air pressure oscillations into a neural code for sound begins in the cochlea of the inner ear (Figure 12.1). Before this stage, air pressure changes are reflected by the external ear, which varies in shape and in surface properties across mammalian species in ways that modulate the amplitude and frequency characteristics of air pressure waves as they are reflected off different parts of the external ear (pinna). The resulting small alterations in the oscillation pattern reflected off the upper or lower face of the pinna can provide subtle cues about the higher or lower source of the sound oscillation (Sterbing-D’Angelo, 2007). In mammals with movable ears, changes in ear position alter sensitivity to air pressure oscillations from various locations in ways that provide additional information about the location of sound sources. Humans, with fixed external ears, depend more on head rotation, and the fixed position of the ears perhaps simplifies the computational task of sound localization.
Figure 12.1 The peripheral auditory system of humans includes the external ear, the middle ear, and the inner ear. Note: The external ear includes the pinna (auricle) and ear canal. The middle ear contains the three ossicles, and is separated from the external ear by the tympanic membrane. The auditory portion of the inner ear is the cochlea, which contains the sensory receptors (hair cells) of the auditory system. Labeled structures: pinna or auricle, auditory canal, eardrum (tympanic membrane), malleus, incus, stapes, oval window, round window, semicircular canals, cochlea, auditory nerve, bone.
Air pressure changes are conducted via the external auditory canal to vibrate the tympanic membrane at the outer margin of the middle ear. An important function of the external auditory canal is to allow the tympanic membrane and the other components of the middle ear to be buried in the tissue of the head where they are less susceptible to traumatic damage.
The role of the middle ear is to transfer oscillations in the air into oscillations of the fluid of the inner ear. Vibrations of the tympanic membrane (eardrum) produced by air pressure oscillations are transmitted across the middle ear space by a chain of three small bones (ossicles): the malleus (hammer), incus (anvil), and stapes (stirrup). The footplate of the stapes conducts pressure changes into the cochlea by its contact with the oval window (Figure 12.1). The round window is a membrane-covered opening in the bony cochlea that is displaced outward upon inward deflection of the oval window. Because the tympanic membrane is much larger than the footplate, a considerable gain in the pressure applied to the oval window occurs, which is sufficient to overcome the higher impedance of the fluid-filled cochlea. This transmission can be dampened for loud sounds by the reflexive contractions of small muscles attached to the ossicles.
The cochlea of the inner ear is a complicated organ for the transduction of the mechanical energy of displacements of the oval window into a neural code. The mammalian cochlea is a long tube of three compartments that is coiled like the shell of a snail, most likely to save space, but perhaps also to contribute to the transduction process. The three compartments constitute three parallel, fluid-filled divisions of the coiled tube. The middle compartment contains the organ of Corti, consisting of the sensory cells of the cochlea, the inner hair cells, as well as the outer hair cells and supporting cells (Figure 12.2). These cells are supported below by a basilar membrane and capped by the tectorial membrane that is attached to bone medial to the inner hair cells and makes contact with stereocilia that extend from the upper surface of each hair cell.
Figure 12.2 A cross-section through the coiled cochlea showing the three fluid-filled compartments, or scalae. Note: The organ of Corti is housed within the scala media. It contains the inner and outer hair cells, tectorial and basilar membranes, and supporting cells. Labeled structures: scala vestibuli, scala media, scala tympani, organ of Corti, inner hair cells, outer hair cells, tectorial membrane, basilar membrane, spiral ganglion, auditory nerve, bony wall of cochlea.
Vibrations of the middle ear are transmitted to the cochlear fluids via the oval window so that the basilar membrane and hair cells of the organ of Corti move relative to the tectorial membrane. The basilar membrane movement starts at the base of the cochlea at the oval window and proceeds various distances toward the apex. Vibrations of higher frequencies result in maximal membrane motion at the base of the cochlea, while lower frequencies displace the membrane more at more apical locations. Thus, sounds of high to low frequency maximally displace portions of the basilar membrane in a base to apex sequence along the basilar membrane. Sound frequency is thereby represented spatially along the length of the cochlea.
The movements of the basilar membrane and hair cells result in the hairs of the hair cells being bent with a shearing action by the tectorial membrane, causing the hair cells to activate afferents of the auditory nerve that terminate in the cochlear nuclei of the brain stem. Single frequencies will activate hair cells over limited extents of the base of the cochlea for high frequencies and over longer extents of the apex for low frequencies. Complex sounds of many frequencies will activate hair cells within a number of zones depending on the amplitudes of the component frequencies, as greater amplitudes activate more hair cells with greater magnitude.
Hair cell responses are influenced by two mechanisms intrinsic to the cochlea. First, the sensitivity of hair cells is increased by a metabolically active system that induces a difference in the ionic charge of the endolymphatic fluid of the scala media (about +80 mV) compared to the cytoplasm within the hair cells (-40 to -70 mV). When the stereocilia of hair cells are deflected by oscillations in the cochlea, positively charged potassium ions flow into the hair cells, causing depolarization and release of neurotransmitters to the afferent neurons of the auditory nerve. Second, the outer hair cells become shorter or longer according to signals they receive from neurons in the superior olivary complex of the brain stem and intrinsic properties of the outer hair cells. This alters the mechanical properties of the cochlea to increase responses to sound. This effect is known as the cochlear amplifier, and it is likely the cause of weak sounds emitted by the inner ear, the otoacoustic emissions.
The human cochlea has about 3,500 inner hair cells and about 14,000 outer hair cells. Hearing loss results from damage to hair cells, and hair cell loss is permanent in mammals.
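The decibel scale for sound pressure level described earlier in this section can be made concrete with a short calculation. The sketch below is illustrative only: it assumes the conventional reference pressure of 20 micropascals for sound in air, a value the chapter does not state, and the function names are hypothetical.

```python
import math

# Assumption (not given in the chapter): the conventional reference
# pressure for sound pressure level in air is 20 micropascals.
P_REF_PA = 20e-6


def spl_db(pressure_pa: float) -> float:
    """Convert an RMS sound pressure in pascals to dB SPL."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def pressure_from_spl(level_db: float) -> float:
    """Convert a level in dB SPL back to an RMS pressure in pascals."""
    return P_REF_PA * 10.0 ** (level_db / 20.0)
```

Under this assumption, 0 dB SPL corresponds to the reference pressure itself, and the 120 dB upper end of the range corresponds to a pressure one million times larger (20 Pa). This is why a logarithmic scale is convenient for describing the range of human hearing.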
THE AUDITORY NERVE
Auditory nerve fibers are activated by mechanical stimulation of hair cells. Mechanical stimulation of the hair bundles (stereocilia) of the inner hair cells as they move relative to the overlying tectorial membrane causes hair bundles to pivot at their base and reduce or increase tension on elastic tip links that extend from the tip of one stereocilium to the side of its taller neighbor (Corey, 2007; Holt & Corey, 2000). Deflection toward the taller cilia increases tension and opens transduction channels, while deflection toward the shorter cilia allows channels to close. The current flow generated by the open channels activates the afferents of the auditory nerve that innervate the hair cells.
The major functions of the auditory afferents are to encode the frequencies and intensities of sounds. Because the basilar membrane and the hair cells are displaced in a pattern that reflects the sound waveform, auditory nerve afferents are activated during the phase of the waveform during which the hair cells are displaced in the direction that opens transduction channels. When the sound waveform is repeated at a given frequency, producing the sensation of a pure tone, the auditory nerve afferents tend to discharge at the same phase of each waveform cycle. This feature of the responses of auditory nerve fibers, which is most apparent for lower frequency tones, is called phase locking. Phase locking means that a given afferent will discharge once or twice for each cycle of the waveform, and the temporal spacing of these discharges will be greater for low rather than high tones. Because the discharge pattern over a number of activated afferents corresponds to the frequency of the stimulating tone, this pattern provides a temporal code for sound frequency, at least for low frequencies where the phase locking is most precise.
A second important source of information about sound frequency depends on the location along the organ of Corti where afferents are activated (a place code). As the traveling wave displaces the basilar membrane, hair cell activation peaks near the base of the cochlea for high-frequency sounds and near the apex for low-frequency sounds. Complex sounds composed of a combination of different frequencies activate different populations of afferents.
Sound intensities reflect sound pressure levels. Low sound pressure levels activate few afferents at low discharge rates within restricted portions of the cochlea. Higher sound pressure levels activate more afferents over longer portions of the cochlea and at higher discharge rates. These changes in neural discharge patterns provide information on sound intensity.
Information on sound frequency and sound intensity is clearly reflected in the discharge patterns of single auditory nerve afferents. At low sound pressure levels, each afferent will be activated by sounds over a narrow range of frequencies. As the sound pressure level is lowered further, a level is reached where the afferent discharges just above spontaneous levels only at a particular sound frequency. This is called the characteristic or best frequency for that afferent. As sound pressure levels are systematically increased, the afferent will discharge at higher and higher rates for the characteristic sound frequency and will also discharge over a greater range of frequencies, typically with a sharp high-frequency cutoff. When afferent responses are plotted relative to tone frequency and sound pressure levels, the afferent response zone is called the frequency response area, or tuning curve (Figure 12.3), for that neuron. Many neurons at higher levels of the auditory system retain this information about sound intensity and frequency and have similar, but usually broader, tuning curves.
Figure 12.3 A typical plot of the frequency response area, or tuning curve, of an auditory neuron. Note: The firing rate of the neuron is recorded as both the frequency of the sound (tone) and the intensity (sound pressure level) are systematically varied. For each combination of intensity and frequency, the normalized firing rate of the neuron is shown, where 1.0 = maximum firing rate. At the lowest effective intensity, the neuron responds to a single frequency (or a narrow range of frequencies) that is called the characteristic or best frequency. In this example, the best frequency is 1.2 kHz. The plot also reveals that the neuron is responsive to other frequencies at higher intensities. Plot courtesy of Corrie R. Camalier. Axes: tone frequency (kHz), 0.3 to 30.5; tone intensity (dB SPL), -10 to 60; shading: normalized firing rate, 0 to 1.
In addition to the signals sent by the large Type I afferents from the inner hair cells, thin afferents from the outer hair cells, the Type II afferents, send information of uncertain significance to the cochlear nuclei.
Complex sounds add complexity to the simple summary of coding in the auditory nerve fibers because auditory neurons become less responsive (adapt) to a repeated stimulus, and responses to one tone can reduce responses to another tone. When auditory nerve responses to one sound are reduced by a second sound, the second sound is said to mask or hide the first sound. Other complications arise from the roles of the three systems of brain stem efferent neurons that send information to the auditory periphery. As noted, olivocochlear efferents activate outer hair cells to alter their shapes and the mechanical properties of the organ of Corti. Other olivocochlear efferents terminate on the afferents of the inner hair cells to inhibit conduction. In addition, brain stem motor neurons activate the muscles of the middle ear, reducing sensitivity to intense sounds.
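The temporal code provided by phase locking, described earlier in this section, can be illustrated with a toy simulation. This is a minimal sketch under strong simplifying assumptions that are not from the chapter: the simulated afferent fires at exactly one fixed phase of the tone cycle with no timing jitter, and it skips cycles at random. All function names and parameter values are hypothetical.

```python
import random


def phase_locked_spikes(freq_hz, duration_s, preferred_phase=0.25,
                        fire_prob=0.6, seed=0):
    """Return idealized spike times (s) of one afferent driven by a pure tone.

    Assumptions: firing occurs only at `preferred_phase` (a fraction of the
    cycle), there is no jitter, and each cycle is skipped with probability
    1 - fire_prob.
    """
    rng = random.Random(seed)
    n_cycles = int(round(duration_s * freq_hz))
    return [(k + preferred_phase) / freq_hz
            for k in range(n_cycles) if rng.random() < fire_prob]


def spike_phases(spike_times, freq_hz):
    """Phase of each spike within its stimulus cycle, as a fraction 0..1."""
    return [(t * freq_hz) % 1.0 for t in spike_times]


def intervals_in_cycles(spike_times, freq_hz):
    """Inter-spike intervals expressed in units of the stimulus period."""
    return [(b - a) * freq_hz for a, b in zip(spike_times, spike_times[1:])]
```

In this idealized model every spike falls at the same phase, so every inter-spike interval is a whole number of stimulus periods. The shortest possible interval is therefore one period, which is longer for a low tone than for a high tone, in line with the description of the temporal code above. Real afferents show jitter, and phase locking degrades at higher frequencies.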
PROCESSING AUDITORY SIGNALS IN THE BRAIN STEM
Auditory processing in the brain stem of mammals involves three divisions of the cochlear nucleus that receive direct inputs from the auditory nerve, nuclei of the superior olivary complex that combine information about sound sources (localization) from the two ears, nuclei of the lateral lemniscus (the ascending auditory pathway), and subdivisions of the inferior colliculus of the midbrain (Figure 12.4).
Figure 12.4 The major ascending auditory pathways from the cochlea (C) to primary auditory cortex (AC). Note: Subdivisions of nuclear complexes and minor pathways are not shown. Major pathways and projections are indicated by thick lines. CN = cochlear nuclear complex; SOC = superior olivary complex; NLL = nuclei of the lateral lemniscus; IC = inferior colliculus; MGC = medial geniculate complex. From Hackett and Kaas, 2002. Adapted with permission.
The subdivisions of the inferior colliculus project in turn to the medial geniculate complex of the dorsal thalamus, where neurons relay auditory information to auditory cortex.
Auditory information reaches the brain stem via large, rapidly conducting, myelinated Type I axons in the auditory or eighth cranial nerve, with cell bodies in the spiral ganglion of the cochlea. These axons terminate in the cochlear nuclei of the lower brain stem in spatial patterns that preserve their order of origin in the cochlea. Thereby, their terminations are cochleotopic or tonotopic in organization. The terminating axons branch to activate several different populations of neurons of different morphological types. Local circuits and synaptic arrangements result in these neuron types having different response properties; thus, eighth nerve information is used in different ways by multiple systems of neurons that process information in parallel.
Cochlear Nuclei
The cochlear nuclear complex is commonly divided into a ventral part, which is divided further into an anterior ventral cochlear nucleus (AVCN) and a posterior ventral cochlear nucleus (PVCN), and a dorsal part, the dorsal cochlear nucleus (DCN). Auditory nerve axons entering the cochlear complex divide into an ascending branch that innervates the AVCN and a descending branch that innervates the PVCN and the DCN. These axons and their branches maintain the orderly cochleotopic arrangement of their origin in the organ of Corti, and thereby impose tonotopic patterns of organization on the three cochlear nuclei. Axons responsive to low frequencies terminate in the most ventral portions of the three nuclei, while axons responsive to progressively higher frequencies terminate in more and more dorsal portions.
The cochlear nuclei contain small, intrinsic inhibitory neurons, an excitatory granule cell, and several types of relay neurons that project to other brain stem auditory structures. Each type preserves specific details of the discharge patterns of the afferent fibers of the auditory nerve (Romand & Avan, 1997). The output of these neurons is further altered in various ways by the circuitry within the complex. This is the start of a diversification of response types to permit a range of auditory functions.
SUPERIOR OLIVARY COMPLEX
Encoding the cues used to localize sound is an important function of the superior olivary complex, which contains nuclei specialized for this purpose. The localization of sound sources in space largely depends on two cues resulting from differences in the information relayed from the two ears. As the waveforms of sounds located on one side of the head reach the closer ear first, there is a time difference in the activation of the eighth nerve afferents between the two ears. Because neural discharges are phase locked to the waveform, they are synchronized for stimuli with sources of equal distances from the two ears, or offset by various amounts with discharges from the closest ear occurring first. The phase locking information is most useful for sounds of low frequencies. The second important cue about sound location comes from intensity differences in the sound waves that reach the two ears, as the head dampens the air pressure wave so that air pressure levels are higher in the closest ear. Damping is greatest for higher frequencies. For both cues, a larger head provides better information in terms of phase and intensity differences. Other cues are subtle and depend on the reflective properties of the external ear and the environmental substrate so that the intensity-frequency spectrum of sounds from low or high sources sounds different. Only the processing of information from the phase and intensity differences is well understood.
Sound location information is extracted from phase and intensity differences in the binaural signals by neurons in the superior olivary complex of the brain stem (Yin & Chan, 1990). Cells in the cochlear nuclear complex project to the medial superior olivary nuclei of both sides of the brain stem. Inputs to the ipsilateral medial superior olive course from ventral to dorsal, synapsing on a sequential line of medial superior olivary neurons. Those inputs from the contralateral cochlear complex synapse on the same sequence of neurons from the opposite direction. Because distance is also time in these sequences of activation (the axons are considered delay lines), neurons at different locations in the medial superior olive receive simultaneous inputs from the two ears depending on the location of the sound source. Because activation by both ears is necessary for above-threshold responses by medial superior olivary neurons, the array of neurons in the structure represents auditory space in the horizontal plane, and provides a place code for space. This has been called a computational map of auditory space (Knudsen, du Lac, & Esterly, 1987).
Cells in the cochlear nucleus also project to the ipsilateral lateral superior olive and to the contralateral nucleus of the trapezoid body, which in turn projects to the adjacent lateral superior olive. As the projections to the lateral superior olive from the nucleus of the trapezoid body are inhibitory, neurons in the lateral superior olive of each side of the brain stem are highly activated when sound pressure levels are highest at the ipsilateral ear and they are most inhibited when the sound pressure levels are highest at the contralateral ear. As the sound pressure level differences increase in a positive manner at the ipsilateral ear, the olivary neurons increase in firing rate. Thus, they provide a rate code for sound location in the horizontal plane. This use of sound level differences as a cue to sound source location is most useful for high-frequency sounds.
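The two localization codes described in this section, a place code built from delay lines in the medial superior olive and a rate code in the lateral superior olive, can be sketched in a few lines. Everything numerical here is an assumption chosen for illustration rather than taken from the chapter: the speed of sound, the ear separation, the simple path-difference formula for the interaural time difference, the number and spacing of delay steps, and the sigmoid shape and parameters of the rate function. All names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def itd_seconds(azimuth_deg, ear_distance_m=0.18):
    """Interaural time difference for a distant source (simplified geometry).

    Positive azimuth means the source is to the right, so the left ear
    receives the sound later by the returned amount. The path difference is
    approximated as ear_distance * sin(azimuth).
    """
    path_difference_m = ear_distance_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S


def best_coincidence_detector(itd_s, n_detectors=21, step_s=50e-6):
    """Index of the detector whose two inputs arrive most nearly together.

    Detector i receives the left-ear signal after i delay steps and the
    right-ear signal after (n_detectors - 1 - i) steps, mimicking axons that
    run along the array from opposite ends. `itd_s` is the left-ear lag.
    """
    last = n_detectors - 1
    mismatches = [abs((itd_s + i * step_s) - (last - i) * step_s)
                  for i in range(n_detectors)]
    return mismatches.index(min(mismatches))


def lso_rate(ild_db, max_rate_hz=100.0, slope_db=5.0):
    """Hypothetical rate code: firing rate rises with the level difference
    (ipsilateral minus contralateral, in dB) along a sigmoid."""
    return max_rate_hz / (1.0 + math.exp(-ild_db / slope_db))
```

In this sketch a source straight ahead (zero time difference) drives the middle detector of the array, and sources to either side shift the best-matched detector toward one end, so that position in the array stands for horizontal position in space. Increasing `ear_distance_m` increases the time difference for a given azimuth, consistent with the advantage of a larger head noted above. The rate function rises as the ipsilateral ear becomes relatively louder, which is the qualitative behavior described for lateral superior olivary neurons.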
Nuclei of the Lateral Lemniscus The axons from the cochlear nuclei that cross the brain stem and ascend to the contralateral inferior colliculus are joined by axons from the superior olivary complex to form the pathway known as the lateral lemniscus (Figure 12.4). Axons of ipsilaterally projecting auditory neurons also join this pathway. In most mammals, collections of neurons form two nuclei within the axons of the lateral lemniscus, the ventral nucleus (VNLL), and the dorsal nucleus (DNLL) of the lateral lemniscus. The main inputs to the ventral nucleus are from the contralateral ventral cochlear nucleus and projections are mainly to the central nucleus of the inferior colliculus. The dorsal nucleus receives bilateral inputs from the anterior ventral cochlear nucleus and the lateral superior olive, an ipsilateral projection from the medial superior olive, and other inputs from the dorsal nucleus of the lateral lemniscus. The projections of the dorsal nucleus are to the central nuclei of the inferior
8/17/09 2:08:20 PM
Superior Olivary Complex
colliculus of both sides, and, more sparsely, to the deep layers of the superior colliculi of both sides. These connections suggest that the ventral nucleus is involved in processing signals from the contralateral ear, and the dorsal nucleus in binaural processing, but the precise roles of these nuclei remain unclear. Auditory Midbrain: Inferior Colliculus and Deep Superior Colliculus The relay of auditory information to the auditory thalamus and then to auditory cortex depends on the inferior colliculus (Ehret, 1997; Spangler & Warr, 1991). The inferior colliculus has been subdivided in several ways, but most investigators define a central nucleus, an external or lateral nucleus, a dorsal cortex, and sometimes a pericentral nucleus (Figure 12.5). The large central nucleus consists of a series of disk-shaped layers with rather indistinct boundaries that consist of sheets of neurons with dendritic arbors that are flattened in the plane of the layers and afferent axon arbors from the axons of the ascending auditory lemniscal pathway. Intrinsic connections course rostrocaudally within the plane of each layer, as do commissural connections connecting each layer with its counterpart within the opposite central nucleus. Each layer contains neurons that are maximally responsive to nearly the same characteristic frequency (isofrequency layers), and they project in a topographic manner to a tonotopically matched layer of (A)
257
the ipsilateral ventral nucleus of the medial geniculate complex (a smaller contralateral projection also exists in at least some mammals). Thus, neurons within layers of the central nucleus receive frequency specific information from the cochlear complex and other brain stem auditory nuclei, share that information via intrinsic connections between neurons within layers and commissural connections between frequency matched layers of the two colliculi, and project as a unit to a layer of the ventral nucleus of the medial geniculate complex, thereby creating a sequence of frequency matched thalamic layers. These geniculate layers in turn have the topographic projections that provide primary areas of auditory cortex with their topotopic organization. The dorsolateral layers of the central nucleus of the inferior colliculus represent the lowest frequencies, while the ventromedial layers are devoted to the highest frequencies (Schreiner & Langner, 1988). The external nucleus of the inferior colliculus, or external cortex, forms a rostral cap on the colliculus where neurons receive ascending auditory inputs, inputs from nonprimary areas of cortex, and even ascending somatosensory inputs. Projections are to the medial and dorsal nuclei of the medial geniculate complex. Thus, the external nucleus appears to have a role in multisensory functions. The dorsal cortex of the inferior colliculus has three or four layers over most of its extent. The dorsal cortex caps the dorsal ends of the isofrequency layers of the central nucleus (Irvine, 1986). Descending inputs from primary
auditory cortex in monkeys (Fitzpatrick & Imig, 1978; Luethke, Krubitzer, & Kaas, 1989) and other mammals terminate within isofrequency contours in the dorsal cortex that extend into the dorsal ends of isofrequency layers in the central nucleus. The dorsal cortex receives some ascending auditory inputs, as well as connections that are mostly intrinsic to the layers of the central nucleus. Its projections are to the dorsal nucleus of the medial geniculate complex. Because the major thalamic connections of the dorsal cortex differ from those of the central nucleus, and thereby contribute to a “nonprimary” auditory pathway, distinguishing the dorsal cortex from the central nucleus seems justified.

Figure 12.5 A: Connections of subdivisions of the inferior colliculus with the auditory thalamus. B: Subdivisions of the thalamic medial geniculate complex with auditory cortex. Note. Proposed subdivisions of the inferior colliculus include the central nucleus (ICc), the lateral nucleus (LN), the dorsal cortex (DC), the dorsal medial nucleus (DM), and the ventral medial nucleus (VM). The subdivisions of the medial geniculate complex of the auditory thalamus include the ventral nucleus (V), the medial or magnocellular nucleus (M), and anterior (AD) and posterior (PD) divisions of the dorsal nucleus (D). The suprageniculate nucleus (Sg) also gets auditory inputs. In B, the lateral geniculate nucleus (LGN) and the substantia nigra (SN) are shown for reference. Auditory cortex, shown for a macaque monkey on the left in panel B, includes a core, belt, and parabelt (see text). Part of the parietal lobe has been cut away to reveal auditory cortex on the upper bank of the lateral sulcus. Areas surrounding auditory cortex include dysgranular insular cortex (Id), granular insular cortex (Ig), retroinsular cortex (Ri), pro, TpT, and Ts2. CiS, circular sulcus.

Deep Layers of the Superior Colliculus

The superior colliculus is predominantly visual in function, with the superficial layers receiving direct inputs from the retina, as well as inputs from most or all of the areas of visual cortex (Kaas & Huerta, 1988). These inputs are retinotopically organized, and form a representation or map of the contralateral visual hemifield. Connections between the superficial and deeper layers instruct a motor map in the deeper layers so that visual inputs from any location in contralateral visual space help direct eye and head movements so that the source of the visual input is foveated (looked upon). The deeper layers of the superior colliculus also receive somatosensory and auditory input both from the brain stem and from the cortex, and these layers also have rather crude representations of auditory space and body surface location. Many of the deep neurons are multisensory. The auditory and somatosensory inputs contribute to the motor maps so that the eyes are directed toward imposing sights, sounds, and touches.
While an auditory space map has been found in the homologue of the mammalian external nucleus of the inferior colliculus of owls (Cohen & Knudsen, 1999), the crude map of auditory space in the deep layers of the superior colliculus is the only auditory space map that has been found in the midbrain or at higher levels in mammals.
AUDITORY THALAMUS

The auditory thalamus includes the medial geniculate complex, with inputs from the inferior colliculus, and several other nuclei with auditory and multisensory functions, the suprageniculate nucleus (Sg), the posterior nucleus (PO), the medial pulvinar (PM), and the auditory sector of the thalamic reticular nucleus (RT-aud). The medial geniculate complex is commonly divided into a ventral or principal nucleus (MGv), a medial or magnocellular nucleus (MGm), and a dorsal nucleus (MGd) that is sometimes divided into anterodorsal (MGad) and posterodorsal (MGpd) components
(Figure 12.5A; Jones, 2006). Each of these divisions or nuclei has a distinctive histological appearance (histoarchitecture), pattern of connections, and populations of neurons with differing response properties.

Ventral Nucleus (MGv)

The ventral nucleus has a higher cell-packing density than other nuclei of the medial geniculate complex, making it easy to delimit in mammals, including humans and other primates (Hirai & Jones, 1989). The inferior colliculus, especially the central nucleus, projects densely to MGv. The inputs from the ipsilateral inferior colliculus are much denser than those from the contralateral inferior colliculus. In all studied mammals, MGv projects to the primary auditory cortex, A1. In addition, most, and perhaps all, mammals have more than one primary or primary-like area, and these areas of the auditory core receive MGv inputs. In monkeys, and probably other primates, MGv projects to three primary-like areas of the core of the auditory cortex, A1, the rostral (R) area, and the rostrotemporal (RT) area (Hashikawa, Molinari, Rausell, & Jones, 1995; Luethke et al., 1989; Morel, Garraghty, & Kaas, 1993; Morel & Kaas, 1992). Rodents and carnivores have a primary auditory area (A1) and an anterior auditory field (AAF), both with MGv inputs. MGv receives topographically organized inputs from the tonotopically organized central nucleus of the inferior colliculus, and is thereby tonotopically organized (Imig & Morel, 1985). The inferior colliculus inputs are aligned in isofrequency sheets or layers in MGv, much like the layers of an onion, and these isofrequency layers project to isofrequency bands crossing the core auditory areas, A1, R, and RT. Neurons in MGv have frequency-intensity tuning curves in that they respond at lowest sound pressure levels to a characteristic or best frequency, and to wider ranges of frequencies at higher sound pressure levels.
Most of these neurons are responsive to both ears and are sensitive to interaural differences in sound intensity and time, thereby providing distributed information about sound source location, without forming a representation of sound location.

The smaller medial, internal, or magnocellular nucleus of the medial geniculate complex (MGm) consists of large cells and groups of smaller cells along the medial aspect of the complex. MGm appears to get inputs from the external nucleus and parts of the central nucleus of the inferior colliculus (Calford & Aitkin, 1983) as well as the deep layers of the superior colliculus (Kaas & Huerta, 1988). Other inputs from vestibular nuclei and the spinothalamic pathway have been suggested, but also questioned (Jones, 2006). Inputs from the external nucleus of the inferior colliculus, the deep layers of the superior colliculus, and the spinothalamic pathway may account for
the responsiveness of neurons in MGm to somatosensory, as well as auditory stimuli. MGm projects broadly to the auditory cortex, including the core and belt areas in primates. Reflecting the mixture of cell size in MGm, recordings indicate that some neurons are binaural and have short-latency responses to pure tones, while others have long-latency responses, broad frequency tuning, and often respond poorly to tones. The dorsal nucleus (MGd) caps the medial geniculate complex. MGd has larger and less densely packed neurons than MGv. MGd also is characterized by a reduced number of parvalbumin positive neurons and increased number of calbindin positive neurons (Molinari et al., 1995). In monkeys, the dorsal nucleus is sometimes divided into an anterodorsal nucleus with more densely packed neurons, as in MGv, and a posterodorsal nucleus that is more typical of MGd of other mammals in cytoarchitecture (Jones, 2006). Inputs to MGd have not been well determined for primates, but the main source appears to be from the external nucleus of the inferior colliculus, with little or no input from the central nucleus (Hashikawa et al., 1995). MGd projects broadly to areas of the auditory belt in monkeys (de la Mothe, Blumell, Kajikawa, & Hackett, 2006; Hackett, Stepniewska, & Kaas, 1998a, 1998b; Molinari et al., 1995; Morel et al., 1993; Morel & Kaas, 1992). In rodents, at least, neurons in MGd are broadly tuned to tone frequency and respond at a long latency (Zhang, Yu, Liu, Chan, & He, 2008). Other thalamic nuclei that are involved in auditory processing include the suprageniculate (Sg) nucleus, the limitans (Lim) nucleus, and the medial pulvinar (PM), which have widespread connections with belt, parabelt, and higher-order areas of auditory cortex in primates (Kaas & Hackett, 2000). These nuclei have multisensory properties and probably have a role in modulating the activities of neurons in these cortical areas. 
In addition, a portion of the reticular nucleus of the ventral thalamus receives inputs from collaterals of thalamocortical axons from the medial geniculate complex and other inputs from corticothalamic axons projecting from the auditory cortex to the medial geniculate complex. The auditory portion of the reticular nucleus projects via GABAergic neurons to the medial geniculate complex, thereby adding a source of inhibition in the complex, in addition to that provided by intrinsic inhibitory neurons (Huang, Larue, & Winer, 1999).
AUDITORY CORTEX IN MAMMALS

The auditory cortex includes areas of the cortex that are mainly or exclusively involved in the processing of auditory stimuli. These include the primary or primary-like areas, the surrounding second-level belt areas, and third-level parabelt areas (Figure 12.5B). Across mammals, and even within a taxonomic group, proposals of how the auditory cortex is divided into areas have varied. Some of this variation undoubtedly reflects real differences between taxa, but some of the variation likely results from the sparseness and ambiguity of the collected data. Much of the earlier research on cortical organization was in cats, where a core of primary-like areas has been described: an anterior auditory field, an A1, and two posterior auditory fields that are less clearly primary (Winer & Lee, 2007). All three of the core areas have neurons that respond well to pure tones and are tonotopically organized. The anterior auditory field (AAF) and A1 are so much alike that either one could have been called A1. Likewise, in rodents, fields that resemble AAF, A1, and PAF have been described, with AAF sometimes being identified as A1 in place of the middle field that is usually identified as A1 (Kaas, 2008). In both cats and rodents, the primary-like core fields are surrounded by a number of less easily defined secondary auditory fields. Concepts of the auditory cortex in monkeys and other primates have gradually developed over years of research, and here again there is evidence for three primary fields, A1, a more rostral area (R), and a rostrotemporal area (RT). As for core areas in carnivores and rodents, these three areas are considered primary because they share a number of features. These include direct topographically organized inputs from the tonotopically organized ventral nucleus of the medial geniculate complex. As a result of these inputs, A1, R, and RT are tonotopically organized. These core areas also have the architectonic features of primary sensory areas of the cortex.
These include a layer 4 that is packed with small neurons, rather dense myelination, a high level of expression of cytochrome oxidase in layer 4, and a high level of expression of the calcium-binding protein, parvalbumin. In many mammals, auditory and other sensory areas also have high levels of acetylcholinesterase (AChE) early in postnatal development, but monkeys and other anthropoid primates continue to express high AChE levels in the mature brain. Given that it has only gradually become apparent that most mammals have an auditory core of two to three primary areas, the question of the identities of areas termed A1 in cats, rodents, and primates has only recently emerged. In cats and rats, A1 represents high to low tones in a rostrocaudal gradient across cortex, the opposite of the direction of the tonotopic gradient in monkeys. This suggests that A1 in monkeys and other primates is not the same area (homologous) as A1 in rodents and carnivores. Instead, the rostral area, R, of primates has the caudorostral tonotopic gradient of high- to low-frequency representation expected for A1. Indeed, sometimes R appears to have been mistaken
for A1 in monkeys. Alternatively, some (e.g., Jones, 2006) propose that the expansion of the temporal lobe in primates rotated A1 from its primitive position so that its caudal end actually became its rostral end, thus creating an apparent but false impression of a tonotopic organization reversed from primitive A1. However, while considerable rotation does occur, it is not obvious that the rotation is enough to reverse the tonotopic gradient. In addition, this proposed rotation would mean that a belt area on the caudal border of A1 in monkeys, area CM (see below), is in the relative position of the anterior auditory field of rats and cats, a core auditory area. Thus, the homology of primate A1 with A1 of other taxa remains an open question. Because the following review focuses on auditory cortex of primates, this question of homologies of primate and nonprimate areas can be avoided here.
AUDITORY CORTEX IN PRIMATES

Auditory Core

The auditory core includes those auditory areas that are primary or primary-like. Primary auditory areas have major activating inputs from the ventral nucleus of the medial geniculate complex (MGv), patterns of tonotopic organization, neurons that are highly responsive to pure tones and have frequency tuning curves with a characteristic or best frequency, histological features of primary sensory cortex, and projections that drive neurons in other cortical areas. In monkeys and probably other primates, the auditory core includes the classically defined primary auditory area, A1, and the more recently defined rostral area, R, and rostrotemporal area, RT. Primary area A1 has a clearly defined gradient of tonotopic organization that proceeds from the caudomedial border to the rostrolateral border of A1 (Imig, Ruggero, Kitzes, Javel, & Brugge, 1977; Kosaki, Hashikawa, He, & Jones, 1997; Merzenich & Brugge, 1973; Morel et al., 1993; Morel & Kaas, 1992). Lines or strips of isofrequency representation cross this gradient in a rostromedial to caudolateral direction. The axon arbors of thalamocortical axons from MGv tend to elongate along the isofrequency contours, and intrinsic connections of A1 are elongated along these contours. The reciprocal cortical connections of A1 are most dense between A1 and adjacent cortical areas, including R of the core and ML, CL, CM, and MM of the auditory belt. A few connections are with more distant targets, including RT of the core and AL and RM of the belt. Remarkably, A1 has few if any connections with more distant auditory areas in the parabelt and beyond. Callosal connections are largely with A1 of the opposite
hemisphere. Because neurons in A1 are sensitive to tone frequency, sound intensity, and binaural differences in stimulation, the neurons in A1 preserve those aspects of the response characteristics of MGv and ICc neurons that are important in identifying and localizing sounds. Architectonic studies have identified an auditory core in humans and even in chimpanzees (Hackett, Preuss, & Kaas, 2001; Rivier & Clarke, 1997; Sweet, Dorph-Petersen, & Lewis, 2005; Wallace, Johnston, & Palmer, 2002), and functional imaging (fMRI) studies in humans indicate that part of that core has the tonotopic organization of A1 (e.g., Talavage et al., 2004). The shape of the auditory core in chimpanzees and humans suggests that there is room for areas R and RT, and the fMRI results in humans provide direct evidence for R. Thus, all anthropoid primates (monkeys, apes, and humans) appear to have a similar auditory core that consists of A1, R, and possibly RT. Finally, judging from microelectrode mapping studies of tonotopic patterns in prosimian galagos (Brugge, 1982), prosimians have a core with at least two areas, A1 and R.

Area R, somewhat smaller than A1, has a tonotopic organization that mirrors that of A1 (Figure 12.6). The response properties of neurons in R are highly similar to those of A1, although there may be some differences. As noted previously, major activating inputs are from MGv, while cortical connections are mainly with adjoining areas, A1, RT of the core, and RM and AL of the belt. Callosal connections involve mainly RT of the opposite hemisphere.

Area RT is smaller than either R or A1, and, because of its size and rostroventral position, has received only limited attention in experimental studies. RT appears to have a tonotopic organization that is reversed from that of R so that low tones are represented in RT near the R border, while high tones are represented at the rostroventral border of RT (Figure 12.6C).
The most extensive microelectrode mapping data from RT come from studies on marmosets, a small New World monkey (Bendor & Wang, 2005), while limited but supportive evidence comes from owl monkeys (Imig et al., 1977; Morel & Kaas, 1992; Recanzone, Schreiner, Sutter, Beitel, & Merzenich, 1999). Connections of RT have not been fully explored, but they include inputs from MGv and connections with adjacent core (R) and belt (RTL and RTM) areas.

Auditory Belt

The narrow 2 to 3 mm wide auditory belt consists of cortex that surrounds the core, but is clearly outside of the architectonically defined core. This means that the histological features of the primary sensory cortex are greatly muted or absent. Thus, the belt expresses less parvalbumin,
cytochrome oxidase, myelin, and acetylcholinesterase. In Nissl preparations, layer 4 is thinner and less densely packed with small neurons. More importantly, the belt is defined by dense interconnections with the fields of the auditory core. Because the belt receives only sparse inputs, at best, from the ventral nucleus of the medial geniculate complex, with most of its thalamic inputs coming from the dorsal and magnocellular divisions of the complex, belt areas likely depend on inputs from the core for most of their responsiveness to auditory stimuli. However, this has been tested only for the caudomedial area, CM, where lesions of A1 abolish the tuning of CM neurons to tone frequency, but some responses to auditory stimuli as well as to somatosensory stimuli remain (see the discussion that follows).

Figure 12.6 Subdivisions of auditory cortex (A) and cortical connections of auditory cortex (B) in macaque monkeys. Note. The circular sulcus has been flattened to show medial auditory areas and the dorsal bank of the lateral sulcus has been removed to expose the lower bank. The core region (dark shading) contains three areas, A1, R, and RT (see text); the belt is a series of eight areas (CM, CL, ML, AL, RTL, RTM, RM, and MM); and the parabelt has two divisions (RPB and CPB). Sulci are labeled for reference: central sulcus, CS; lateral sulcus, LS; lunate sulcus, LuS; superior temporal sulcus, STS; intraparietal sulcus, IPS; arcuate sulcus, AS; principal sulcus, PS. Ts2, Ts1, and pro are proposed subdivisions of the temporal lobe. Other abbreviations as in Figure 12.5. In B, arrows indicate connections between areas, with dotted lines marking less dense connections.

The auditory belt appears to consist of eight auditory fields (Figure 12.7). In part, this conclusion stems from the evidence that each core area projects most densely to adjacent regions of the belt (Morel et al., 1993; Morel & Kaas, 1992; see Kaas & Hackett, 2000). This suggests the existence of at least three belt areas medial to the core and three lateral to the core. Other belt areas could occupy the ends of the belt, and microelectrode recording and other evidence support the conclusion that two areas, CM and CL, cap the caudal end of the core. Because neurons in the belt reflect a second level of cortical processing, these neurons are generally more sensitive to anesthetics and thus more difficult to activate in anesthetized animals. As a result, less is known about their response properties. In addition, these neurons are likely to have more complex response properties than neurons in the core, and they are typically less responsive to pure
tones. Nevertheless, many neurons in the belt are responsive either to pure tones or to frequency-centered narrow bands of noise, so that a best or favored frequency can be estimated for recorded neurons, and crude tonotopic patterns of organization within belt fields can be estimated (Rauschecker & Tian, 2004; Rauschecker, Tian, & Hauser, 1995). Such recordings indicate that areas AL and ML have tonotopic organizations that parallel those in adjoining areas R and A1, and that areas CM and CL may have tonotopic organizations that mirror reversals of that in A1 (Kajikawa, de la Mothe, Blumell, & Hackett, 2005). Other proposed belt fields may also contain representations of tone frequency. However, the generally broad tuning of belt neurons to frequency and the weak responsiveness of these neurons to pure tones suggest that the tonotopic patterns are not the dominant feature of their intrinsic organization, but rather a weakly preserved consequence of their connection patterns with A1. The belt areas are connected with core and parabelt areas. Adjacent areas have the strongest connections, so that the caudal belt areas (CM, CL, ML) are mainly connected with A1 in the core and the caudal division of the parabelt (CPB; Hackett et al., 1998a; Jones, Dell’Anna, Molinari, Rausell, & Hashikawa, 1995; Morel et al., 1993; Morel & Kaas, 1992). Similarly, the rostral areas of the belt have stronger connections with the rostral divisions of the core and parabelt. This topographic pattern of connections extends to areas beyond the auditory cortex in the temporal, frontal, and parietal lobes (Romanski, Bates, & Goldman-Rakic, 1999; Romanski et al., 1999), so that caudal and rostral areas of the belt have connections with different areas in those regions.
Auditory Parabelt

The auditory parabelt region is located along the lateral border of the belt region on the superior temporal gyrus. Two areas have been identified, one caudal (CPB) and one rostral (RPB; Hackett et al., 1998a, 1998b). Very little is known about the response properties of parabelt neurons because few studies have attempted to record from this region. The parabelt receives inputs from the MGd and MGm in thalamus and is broadly connected with all of the belt areas. As mentioned, the connections of the parabelt are topographic, so that the RPB and CPB are most strongly connected with rostral and caudal areas of the belt, respectively. The parabelt has extensive connections with areas beyond the auditory cortex. Those connection patterns match the rostrocaudal topography of the belt areas, but tend to be stronger than those in the belt (Hackett, Stepniewska, & Kaas, 1999; Romanski, Bates, et al., 1999; Romanski, Tian, et al., 1999). The distinctive connections of the rostral and caudal areas have given rise to the hypothesis that auditory information is processed in at least two functionally distinct streams (Rauschecker & Tian, 2000; Romanski, Bates, et al., 1999; Romanski, Tian, et al., 1999). Rostral auditory fields target rostral temporal and inferior, polar, and orbital prefrontal areas involved in processing related to auditory object recognition. Caudal auditory fields target dorsolateral and periarcuate prefrontal areas involved in multimodal spatial tasks. The segregation of pathways is not complete, however, because connections of rostral and caudal auditory areas overlap in the middle of the auditory cortex, and also in the dorsal superior temporal sulcus and dorsal prefrontal cortex. Thus, there appears to be substantial interaction between streams (Kaas & Hackett, 2000).

Figure 12.7 Locations and tonotopy of auditory areas in macaque monkeys. Note. A: A lateral view of a monkey brain with parabelt areas shown on the surface of the superior temporal gyrus (STG). The lateral sulcus (LS), the central sulcus (CS), and the arcuate sulcus (AS) are shown for reference. B: The upper bank of the lateral sulcus has been removed to expose the lower bank and core and belt auditory areas (see text). C: The tonotopic organization of the core auditory areas from low (L) to high (H) tones. Lines within A1 and R indicate lines of isorepresentation. See text for belt and parabelt areas.

AUDITORY CORTEX IN GREAT APES AND HUMANS
The expansion of the planum temporale (PT) and superior temporal gyrus (STG) in apes and especially humans is one of the clearest differences in the gross anatomy of the superior temporal lobe among primates. Because these structural differences are likely to underlie key differences in the capacity for speech and language, it is important to achieve a better understanding of the auditory cortex organization from studies of the human brain. Early descriptions of the human and nonhuman primate auditory cortex include the results of several histological studies conducted in the late 1800s and early 1900s (Beck, 1928, 1929; Brodmann, 1909; von Economo, 1925; von Economo & Horn, 1930). These classic studies localized the auditory cortex to the superior temporal gyrus of humans and nonhuman primates and still remain influential. The nomenclature derived from Brodmann’s map of the cerebral cortex, in particular, is widely used to denote areas of the brain in functional imaging, electrophysiology, and related clinical applications. The number of areas identified in human auditory cortex ranges from 3 (Brodmann’s areas 41, 42, 22) to over 30 (von Economo & Horn, 1930). There is no current consensus on the precise
number, although it appears that the divisions identified by Brodmann are large regions, which contain several subdivisions, as in other primates (Hackett, 2002, 2007). More recent studies have employed quantitative methods and studied the expression of enzymes, neurotransmitters, and receptors to identify areas and provide insight into their functional properties (Hackett et al., 2001; Morgan, Henderson, & Thompson, 1987; Morosan et al., 2001; Morosan, Schleicher, Amunts, & Zilles, 2005; Nakahara, Yamada, Mizutani, & Murayama, 2000; Rademacher, Caviness, Steinmetz, & Galaburda, 1993; Rivier & Clarke, 1997; Sweet et al., 2005; Wallace et al., 2002). Although interpretations vary, a common feature is that of a central core region flanked by belts of several nonprimary fields (Figure 12.8), similar to the pattern of organization found in monkeys and other animals.

Figure 12.8 The locations of the auditory cortex in the brains of macaque monkeys (A and D), chimpanzees (B and E), and humans (C and F). Note. A through C are views of the lower bank of the lateral sulcus (the superior temporal plane) of monkey, chimpanzee, and human brains with the auditory core outlined. Broken white lines designate sulcal landmarks: circular sulcus (CiS), anterior Heschl’s sulcus (HSa), posterior Heschl’s sulcus (HSp). Scale bars = 5 mm. D through F show the location of A1, the lateral belt area, ML, and the caudal parabelt (CPB) in frontal brain sections of monkey (D), chimpanzee (E), and human (F) brains. Question marks indicate uncertainties. Scale bar = 5 mm. G through I are previous descriptions of the organization of auditory cortex in humans shown on a view of the superior temporal plane as in C. Different parcellation schemes exist, and there are uncertainties. Areas 41, TC, and KA have been considered to be primary auditory cortex (A1). These proposed areas would contain all or most of the auditory core as presently defined. Panel G is based on Brodmann (1909), panel H on Galaburda and Sanides (1980), and panel I on Sweet, Dorph-Petersen, and Lewis (2005). From “The Comparative Anatomy of the Primate Auditory Cortex” (pp. 199–219), by T. A. Hackett, in Primate Audition: Ethology and Neurobiology, A. A. Ghazanfar (Ed.), 2003, Boca Raton: CRC Press; “Organization and Correspondence of the Auditory Cortex of Humans and Nonhuman Primates” (pp. 109–119), by T. A. Hackett, in J. Kaas (Ed.), Evolution of Nervous Systems, 2007, Oxford: Elsevier; and “Architectonic Identification of the Core Region in Auditory Cortex of Macaques, Chimpanzees, and Humans,” by T. A. Hackett, T. M. Preuss, and J. H. Kaas, 2001, Journal of Comparative Neurology, 441, pp. 197–222. Adapted with permission.

Core Region

In monkeys, the core region is elongated along the anterior-posterior axis of the temporal lobe. In apes and humans, the core occupies the posteromedial two thirds of the transverse temporal gyrus (TTG), which is oriented from posteromedial to anterolateral across the superior temporal plane (Hackett, 2002; Hackett et al., 2001). Therefore, the core region corresponds most closely to area 41 of Brodmann (1909), but there is much greater variability in apes and humans, compared to monkeys. The number of transverse temporal gyri varies between individuals and sometimes between hemispheres, and the position of the core region also varies relative to those gyri (Rademacher et al., 1993; Hackett et al., 2001). In humans, the most common expression is that of a single or paired TTG. In humans with a single TTG, the core occupies most of the gyrus and usually does not extend beyond its anterior and posterior sulcal boundaries. When the TTG is duplicated, the core usually occupies portions of both gyri.
BELT AND PARABELT REGIONS

The homology of the belt and parabelt regions of monkeys and humans has not been firmly established, and so the observations included here remain speculative. Adjoining the core region on the anteromedial and posterolateral sides of the TTG are two bands that, at least in terms of relative position, appear to correspond to the belt region of monkeys. The anterior region most closely corresponds to the medial belt region of monkeys and area 52 of Brodmann. The posterior region occupies part of the planum temporale (PT) adjacent to the TTG, and corresponds most closely to area 42 of Brodmann and the lateral belt of monkeys (Galaburda & Sanides, 1980; Hackett, 2002; Sweet
et al., 2005). On the more posterior portion of the PT are at least two other anatomically distinct regions. According to Brodmann (1909), part of area 22 extends onto the PT from the STG to flank the posterior border of area 42. This portion of area 22 may correspond to part of the parabelt region of monkeys, but in that case, the remainder of area 22 on the STG has no clear homologue in monkeys, apes, and humans. Further research is needed to clarify these relationships.
SUMMARY

The mammalian auditory system is comprised of a highly complex network of peripheral and central structures that can extract, encode, and interpret acoustic signals in a dynamic acoustic environment. At each stage of processing, interconnected neurons in multiple parallel pathways contribute to the computation of information contained within the acoustic waveform. From these computations emerge cues about the location and identity of auditory objects used to guide reflexive and purposive behavior. Precisely how the auditory system accomplishes these tasks is only partially understood, but comparative studies in humans and other species are gradually uncovering the structural and functional mechanisms that underlie the perception of sound.
REFERENCES
Beck, E. (1928). Die myeloarchitektonische felderung des in der Sylvischen furche gelegenen teiles des menschlichen schläfenlappens. Journal of Psychiatric Neurology, 36, 1–21. Beck, E. (1929). Der myeloarchitektonische bau des in der Sylvischen furche gelegenen teiles des schläfenlappens beim schimpansen (Troglodytes niger). Journal of Psychiatric Neurology, 38, 309–428. Bendor, D., & Wang, X. (2005, August 25). The neuronal representation of pitch in primate auditory cortex. Nature, 436, 1161–1165. Brodmann, K. (1909). Vergleichende Lokalisationslehre der Grosshirnrinde. Leipzig: Barth. Brugge, J. F. (1982). Auditory areas in primates. In C. N. Woolsey (Ed.), Cortical sensory organization (pp. 59–70). Clifton, NJ: Humana Press. Calford, M. B., & Aitkin, L. M. (1983). Ascending projections to the medial geniculate body of the cat: Evidence for multiple, parallel auditory pathways through the thalamus. Journal of Neuroscience, 3, 2365–2380. Cohen, Y. E., & Knudsen, E. I. (1999). Maps versus clusters: Different representations of auditory space in the midbrain and forebrain. Trends in Neurosciences, 22, 128–135. Cooper, H. M., Herbin, M., & Nevo, E. (1993). Visual system of a naturally microphthalmic mammal: The blind mole rat (Spalax ehrenbergi). Journal of Comparative Neurology, 328, 313–350. Corey, D. P. (2007). Stringing the fiddle: The inner ear’s two-part invention. Nature Neuroscience, 10, 1232–1233.
Dear, S. P., Simmons, J. A., & Fritz, J. (1993, August 12). A possible neuronal basis for representation of acoustic scenes in auditory cortex of the big brown bat. Nature, 364, 620–623. de la Mothe, L., Blumell, S., Kajikawa, Y., & Hackett, T. A. (2006). Thalamic connections of the auditory cortex in marmoset monkeys: Core and medial belt regions. Journal of Comparative Neurology, 496, 72–96.
Kaas, J. H. (2008). The evolution of auditory cortex: The core areas. In J. A. Winer, C. E. Schreiner (Eds.), The auditory cortex. Berlin, Germany: Springer-Verlag.
Ehret, G. (1997). The auditory cortex. Journal of Comparative Physiology, 181, 547–557.
Kaas, J. H., & Hackett, T. A. (2000). Subdivisions of auditory cortex and processing streams in primates. Proceedings of the National Academy of Sciences, USA, 97, 11793–11799.
Fitzpatrick, K. A., & Imig, T. J. (1978). Projections of auditory cortex upon the thalamus and midbrain in the owl monkey. Journal of Comparative Neurology, 177, 573–655.
Kaas, J. H., & Huerta, M. F. (1988). Subcortical visual system of primates. In H. P. Steklis (Ed.), Comparative primate biology (Vol. 4, pp. 327–391). New York: Alan R. Liss.
Galaburda, A., & Sanides, F. (1980). Cytoarchitectonic organization of the human auditory cortex. Journal of Comparative Neurology, 190, 597–610.
Kajikawa, Y., de la Mothe, L., Blumell, S., & Hackett, T. A. (2005). A comparison of neuron response properties in areas A1 and CM of the marmoset monkey auditory cortex: Tones and broadband noise. Journal of Neurophysiology, 93, 22–34.
Hackett, T. A., Stepniewska, I., & Kaas, J. H. (1999). Prefrontal connections of the auditory parabelt cortex in macaque monkeys. Brain Research, 817, 45–58. Hackett, T. A., & Kaas, J. H. (2002). Auditory processing in the primate brain. In M. Gallagher & R. J. Nelson (Eds.), Handbook of psychology: Vol. 3. Biological psychology. New York: Wiley & Sons. Hackett, T. A. (2003). The comparative anatomy of the primate auditory cortex. In A. A. Ghazanfar (Ed.), Primate audition: Behavior and neurobiology (pp. 199–226). Boca Raton: CRC Press. Hackett, T. A. (2007). Organization and correspondence of the auditory cortex of humans and nonhuman primates. In J. Kaas (Ed.), Evolution of nervous systems (pp. 109–119). Oxford: Elsevier. Hackett, T. A., Preuss, T. M., & Kaas, J. H. (2001). Architectonic identification of the core region in auditory cortex of macaques, chimpanzees, and humans. Journal of Comparative Neurology, 441, 197–222. Hackett, T. A., Stepniewska, I., & Kaas, J. H. (1998a). Subdivisions of auditory cortex and ipsilateral cortical connections of the parabelt auditory cortex in macaque monkeys. Journal of Comparative Neurology, 394, 475–495. Hackett, T. A., Stepniewska, I., & Kaas, J. H. (1998b). Thalamocortical connections of the parabelt auditory cortex in macaque monkeys. Journal of Comparative Neurology, 400, 271–286. Hashikawa, T., Molinari, M., Rausell, E., & Jones, E. G. (1995). Patchy and laminar terminations of medial geniculate axons in monkey auditory cortex. Journal of Comparative Neurology, 362, 195–208. Hirai, T., & Jones, E. G. (1989). A new parcellation of the human thalamus on the basis of histochemical staining. Brain Research Reviews, 14, 1–34.
Knudsen, E. I., du Lac, S., & Esterly, S. D. (1987). Computational maps in the brain. Annual Review of Neuroscience, 10, 41–65. Kosaki, H., Hashikawa, T., He, J. & Jones, E. G. (1997). Tonotopic organization of auditory cortical fields delineated by parvalbumin immunoreactivity in macaque monkeys. Journal of Comparative Neurology, 386, 304–316. Luethke, L. E., Krubitzer, L. A., & Kaas, J. H. (1989). Connections of primary auditory cortex in the new world monkey (Saguinus). Journal of Comparative Neurology, 285, 487–513. Merzenich, M. M., & Brugge, J. F. (1973). Representation of the cochlear partition on the superior temporal plane of the macaque monkey. Brain Research, 50, 275–296. Merzenich, M. M., Kitzes, L., & Aitkin, L. (1973). Anatomical and physiological evidence for auditory specialization in the mountain beaver (Aplodontia rufa). Brain Research, 58, 331–344. Molinari, M., Dell’Anna, M. E., Rausell, E., Leggio, M. G., Hashikawa, T., & Jones, E. G. (1995). Auditory thalamocortical pathways defined in monkeys by calcium-binding protein immunoreactivity. Journal of Comparative Neurology, 362, 171–194. Morel, A., Garraghty, P. E., & Kaas, J. H. (1993). Tonotopic organization, architectonic fields, and connections of auditory cortex in macaque monkeys. Journal of Comparative Neurology, 335, 437–459. Morel, A., & Kaas, J. H. (1992). Subdivisions and connections of auditory cortex in owl monkeys. Journal of Comparative Neurology, 318, 27–63. Morgan, J. E., Henderson, Z., & Thompson, I. D. (1987). Retinal decussation patterns in pigmented and albino ferrets. Neuroscience, 20, 519–535.
Holt, J. R., & Corey, D. P. (2000). Two mechanisms for transducer adaptation in vertebrate hair cells. Proceedings of the National Academy of Sciences, USA, 97, 11730–11735.
Morosan, P., Rademacher, J., Schleicher, A., Amunts, K., Schormann, T., & Zilles, K. (2001). Human primary auditory cortex: Cytoarchitectonic subdivisions and mapping into a spatial reference system. NeuroImage, 13, 683–701.
Huang, C. L., Larue, D. T., & Winer, J. A. (1999). GABAergic organization of the cat medial geniculate body. Journal of Comparative Neurology, 415, 368–392.
Morosan, P., Schleicher, A., Amunts, K., & Zilles, K. (2005). Multimodal architectonic mapping of human superior temporal gyrus. Anatomy and Embryology, 210, 401–406.
Imig, T. J., & Morel, A. (1985). Tonotopic organization in ventral nucleus of medial geniculate body in the cat. Journal of Neurophysiology, 53, 309–340.
Nakahara, H., Yamada, S., Mizutani, T., & Murayama, S. (2000). Identification of the primary auditory field in archival human brain tissue via immunocytochemistry of parvalbumin. Neuroscience Letters, 286, 29–32.
Imig, T. J., Ruggero, M. A., Kitzes, L. M., Javel, E., & Brugge, J. F. (1977). Organization of auditory cortex in the owl monkey (Aotus trivirgatus). Journal of Comparative Neurology, 171, 111–128. Irvine, D. R. F. (1986). The auditory brainstem. In D. Ottoson (Ed.), Progress in sensory physiology (pp. 1–279). Berlin, Germany: Springer-Verlag. Jones, E. G. (2006). The thalamus. Cambridge: Cambridge University Press. Jones, E. G., Dell’Anna, M. E., Molinari, M., Rausell, E., & Hashikawa, T. (1995). Subdivisions of macaque monkey auditory cortex revealed by calcium-binding protein immunoreactivity. Journal of Comparative Neurology, 362, 153–170.
Kaas, J. H. (2000). Why is brain size so important: Design problems and solutions as neocortex gets bigger or smaller. Brain and Mind, 1, 7–23.
Rademacher, J., Caviness, V. J., Steinmetz, H., & Galaburda, A. (1993). Topographical variation of the human primary cortices: Implications for neuroimaging, brain mapping, and neurobiology. Cerebral Cortex, 3, 313–329. Rauschecker, J. P., & Tian, B. (2000). Mechanisms and streams for processing of “what” and “where” in auditory cortex. Proceedings of the National Academy of Sciences, USA, 97, 11800–11806. Rauschecker, J. P., & Tian, B. (2004). Processing of band-passed noise in the lateral auditory belt cortex of the rhesus monkey. Journal of Neurophysiology, 91, 2578–2589.
Rauschecker, J. P., Tian, B., & Hauser, M. (1995, April 7). Processing of complex sounds in the macaque nonprimary auditory cortex. Science, 268, 111–114.
Sterbing-D’Angelo, S. (2007). Evolution of sound localization in mammals. In J. H. Kaas, (Ed.), Evolution of nervous systems (Vol. 3, pp. 253–260). London: Elsevier.
Recanzone, G. H., Schreiner, C. E., Sutter, M. L., Beitel, R. E., & Merzenich, M. M. (1999). Functional organization of spectral receptive fields in the primary auditory cortex of the owl monkey. Journal of Comparative Neurology, 415, 460–481.
Suga, N. (1990). Cortical computational maps for auditory imaging. Neural Networks, 3, 3–21.
Rivier, F., & Clarke, S. (1997). Cytochrome oxidase, acetylcholinesterase, and NADPH-diaphorase staining in human supratemporal and insular cortex: Evidence for multiple auditory areas. NeuroImage, 6, 288–304. Romand, R., & Avan, P. (1997). Anatomical and functional aspects of the cochlear nucleus. In G. Ehret & R. Romand (Eds.), The central auditory system (pp. 97–191). New York: Oxford University Press. Romanski, L. M., Bates, J. F., & Goldman-Rakic, P. S. (1999). Auditory belt and parabelt projections to the prefrontal cortex in the rhesus monkey. Journal of Comparative Neurology, 403, 141–157. Romanski, L. M., Tian, B., Fritz, J., Mishkin, M., Goldman-Rakic, P. S., & Rauschecker, J. P. (1999). Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nature Neuroscience, 12, 1131–1136. Schreiner, C. E., & Langner, G. (1988). Periodicity coding in the inferior colliculus of the cat: Pt. II. Topographical organization. Journal of Neurophysiology, 60, 1823–1840. Spangler, K. M., & Warr, W. B. (1991). The descending auditory system. In R. A. Altschuler, R. P. Bobbin, B. M. Clopton, & D. W. Hoffman (Eds.), Neurobiology of hearing: The central auditory system (pp. 27–45). New York: Raven Press.
Sweet, R. A., Dorph-Petersen, K. A., & Lewis, D. A. (2005). Mapping auditory core, lateral belt, and parabelt cortices in the human superior temporal gyrus. Journal of Comparative Neurology, 491, 270–289. Talavage, T. M., Sereno, M. I., Melcher, J. R., Ledden, P. J., Rosen, B. R., & Dale, A. M. (2004). Tonotopic organization in human auditory cortex revealed by progressions of frequency sensitivity. Journal of Neurophysiology, 91, 1282–1296. von Economo, C. (1925). Die cytoarchitektonik der hirnrinde des erwachsenen menschen. Berlin, Germany: Julius-Springer. von Economo, C., & Horn, L. (1930). Über windungsrelief, masse und rindenarchitektonik der supratemporalfläche, ihre individuellen und ihre seitenunterschiede. Journal of Neurology and Psychiatry, 130, 678–757. Wallace, M. N., Johnston, P. W., & Palmer, A. R. (2002). Histochemical identification of cortical areas in the auditory region of the human brain. Experimental Brain Research, 143, 499–508. Winer, J. A., & Lee, C. C. (2007). The distributed auditory cortex. Hearing Research, 229, 3–13. Yin, T. C. T., & Chan, J. C. K. (1990). Interaural time sensitivity in medial superior olive of cat. Journal of Neurophysiology, 64, 465–488. Zhang, Z., Yu, Y. Q., Liu, C. H., Chan, Y. S., & He, J. (2008). Frequency tuning and firing pattern properties of auditory thalamic neurons: An in vivo intracellular recording from the guinea pig. Neuroscience, 151, 293–302.
Chapter 13
Chemical Senses SUSAN P. TRAVERS AND JOSEPH B. TRAVERS
Identification of chemicals in the external world is assigned to the special senses of taste and smell. If we add the chemosensitivity of some somatosensory fibers, typically polymodal nociceptors responsive to compounds found in spices, such as capsaicin, we can also include chemesthesis, the “common” chemical sense. Although activation of receptors associated with each of these senses produces unique, clearly differentiated sensations, the fusion of all three is the potent human sensation of flavor. It is standard textbook terminology to define flavor as the amalgamation of taste, olfaction, and chemesthesis, although temperature and texture are likely involved as well. This fusion of sensory modalities is unique to the chemical senses.
A hedonic dimension further differentiates the chemical senses from other sensory modalities. Many tastes and smells are associated with strong innate preferences and aversions, a characteristic most likely related to senses so closely tied to the exigencies of survival. It is axiomatic in the field of taste to assert its primary role in food selection; that is, the selection of nutrients and the avoidance of toxins. There are strong preferences and aversions associated with odors, too. Odor is critical to mating and maternal function in many animals (survival of the species), and smell helps to avoid predators (survival of the individual). The hedonic responses to tastes and odors are also highly modifiable. Not all chemosensory preferences and aversions are innate; many are instead learned. In particular, taste and flavor aversions and preferences are strongly susceptible to modification by postingestive consequences.
In the fields of both olfaction and taste there is vigorous debate about how stimuli are coded or represented in the nervous system. The chemical senses as a group pose a common problem compared to the other major senses: the lack of a stimulus dimension. The difficulty of the coding problem is thus exacerbated by problems in defining the stimulus to be transduced. Taste should be somewhat easier than smell because there is at least some agreement that the rather large range of chemicals that this system detects gives rise to only four or five discrete qualitative sensations. In olfaction, the problem is magnified due to the huge number of compounds and resulting scents that can be discriminated. Yet for both senses, there is palpable excitement over recent discoveries of peripheral transduction mechanisms. In olfaction, each peripheral sensory neuron expresses just a single G-protein coupled receptor (GPCR), yet responds to a wide variety of stimuli. In taste, too, GPCRs for sweet and bitter tasting stimuli are segregated on different receptor cells, but it is not uncommon to find peripheral fibers and central neurons that respond to more than one taste. Thus, the discovery of specific receptor and transduction mechanisms does not in itself solve the coding problem. How information is extracted from these receptors by the central nervous system (for example, how we achieve a singular sensation of sweet from neurons responsive to both sweet and salty compounds) is hotly debated. Further fueling the debate are technical advances in the field of neurophysiology. The ability to record from more than one sensory-responsive neuron at a time means that temporal relations between neurons can be described. In addition to their classical “receptive fields,” any two neurons share a temporal space defined by the temporal relation of their respective spike trains. This adds degrees of freedom with which to define a coding scheme. Similarly, with advances in functional imaging techniques (for example, c-fos immunohistochemistry, 2-DG, fMRI, and intrinsic imaging), interest in spatial topographical representations is often conceptually pitted against theories of dynamic coding. Here we present evidence from both perspectives.
Our overview of the chemical senses focuses primarily on mammalian olfaction and taste. Our strategy is to present the systems independently, making comparisons when possible but not discussing the systems jointly until the orbital cortex, where there is clear convergence between taste and smell. Although we have tried to discuss similar aspects of the two senses, the coverage is uneven due to varying amounts of information. For example, analysis of olfactory bulb circuitry is much more advanced than is our
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
understanding of central taste circuits. On the other hand, more studies in the taste system have focused on modulation of signals by homeostatic state. For both systems, however, a major emphasis is on neural representation of quality and how each system contributes to behavioral function.
TASTE
GPCR Transduction Mechanisms: Sweet, Bitter, and Umami
The prototypic sweet-tasting stimuli include simple and complex sugars, that is, caloric, nutritive compounds common in the diet of many animals. A few naturally occurring amino acids such as glycine also taste sweet. In addition, artificial sweeteners of varied chemical type and the D-isomers of several amino acids evoke this sensation as well. Despite this range of chemical structures, a single mechanism consisting of a heteromer of GPCR proteins, T1R2/T1R3, appears mostly responsible for the transduction of all sweet substances (X. Li et al., 2002; Nelson et al., 2001; G. Q. Zhao et al., 2003). When either, or especially both, proteins are genetically deleted in mice, preference for sweet stimuli is dramatically reduced, as are sweet-evoked peripheral nerve responses (Damak et al., 2003; G. Q. Zhao et al., 2003). In fact, it has long been evident that a certain mouse strain (B129) exhibits a reduced responsiveness to sweet-tasting compounds and the underlying basis for this difference is now known to be associated with a genetic variation in the T1R3 protein (Nelson et al., 2001). Variation in this receptor also explains the minimal sensitivity of wild and domestic cats to sugars, but in this case the culprit is the T1R2 element. Interestingly, pseudogenization of the T1R2 receptor accounts for the rare lack of a mammalian “sweet tooth” in these animals (X. Li et al., 2005). This loss of function has been hypothesized to result from the fact that cats are obligate carnivores and consequently, the ability to detect and prefer foods with sugars is of little use to them (reviewed in Boughter & Bachmanov, 2008).
The transduction of L-amino acids also relies heavily on T1R family members (G. Q. Zhao et al., 2003). In fact, one of the same T1R proteins that comprises an element of the sugar receptor, T1R3, dimerizes with a different T1R family member, T1R1, to create a receptor that responds to L-amino acids, but not sugars (X. Li et al., 2002; Nelson et al., 2002). The human T1R1 variant results in an amino acid receptor highly preferential for L-glutamate, an amino acid found in particularly high concentrations in many foods. In fact, monosodium glutamate (MSG) is considered the prototypic umami substance. The name umami derives from the Japanese, and roughly translates to savory or delicious, but the rather subtle character of the sensation
that it elicits has long made its status as a distinct taste quality subject to debate. The discovery of specifically tuned amino acid receptors has swayed the consensus in a positive direction, though agreement is still not unanimous. Species differences, even within mammals, confuse the issue. For example, the rodent T1R1-T1R3 heteromer responds to a much broader range of L-amino acids than in humans (Nelson et al., 2002). In addition, although deletion of either T1R1 or T1R3 profoundly diminishes peripheral nerve responses and behavioral preference for L-amino acids (G. Q. Zhao et al., 2003), a different knock-out of T1R3 revealed residual responsiveness (Damak et al., 2003). Indeed, with psychophysical tests evaluating threshold and discrimination rather than preference, T1R3 deletion produced only subtle alterations (Delay, Hernandez, Bromley, & Margolskee, 2006). These remaining capacities are consistent with data suggesting additional amino-acid receptor mechanisms, including a truncated form of a metabotropic glutamate receptor, taste-mGluR4. When heterologously expressed, this receptor responds to L-glutamate and the glutamate receptor agonist, AP4 (Chaudhari, Landin, & Roper, 2000). Bitter transduction likewise involves GPCRs, but these proteins comprise a distinct class, the T2Rs. Unlike the receptor families for sweet stimuli and amino acids, the T2R family is large, in rodents numbering over 35 members, and in humans only slightly fewer, though with a higher proportion of pseudogenes (reviewed in Boughter & Bachmanov, 2008). Presumably, the large number of T2R receptors is necessary to bind to the diverse chemicals that elicit this taste quality. Although the tendency can be modified by learning, animals appear to be hard-wired to avoid consuming bitter tastants. This is adaptive because many of these compounds are toxins, often deriving from plants. 
For example, the alkaloids quinine, nicotine, and morphine all taste bitter, as do the β-glucopyranosides such as salicin, a compound in willow bark closely related to aspirin. Heterologous expression has identified ligands for several T2Rs and a complex story is emerging. The receptor range varies for different family members, with some T2Rs apparently responding to just one or two ligands, others to multiple compounds within a chemical class, and yet others to a range of seemingly dissimilar compounds. For example, mT2R5 is a mouse T2R that responds mainly to cycloheximide, a bacterial toxin; hT2R16 is a human T2R responsive to a large number of β-glucopyranosides (Bufe, Hofmann, Krautwurst, Raguse, & Meyerhof, 2002); and hT2R7 responds to several unrelated compounds including papaverine, chloroquine, quinacrine, and strychnine (Sainz et al., 2007). There is striking interindividual sequence diversity in the genes encoding various T2Rs (reviewed in Max & Meyerhof, 2008). These differences would be expected to lead to substantial variability in sensitivity for particular bitter
substances, and ultimately, consequences for food choice. In fact, genetic diversity in bitter-sensing has been apparent since the early part of the twentieth century when Fox (1932) discovered that there was a bimodal distribution of thresholds among human subjects with regard to their ability to detect phenylthiocarbamide (PTC), a thioamide with thyroid toxicity. However, the basis for that difference only became clear when the identification of the T2R family allowed positional cloning to define the responsible receptor and genetic polymorphisms (Kim et al., 2003). So far, it appears that just one of the approximately 25 human T2Rs, hT2R38, detects PTC, and remarkably, substitution of just a few amino acids explains a major portion of the 1,000-fold individual variation in response threshold, as well as the large differences in suprathreshold intensity of PTC and closely related compounds (Bufe et al., 2005; Kim et al., 2003). This variation in PTC sensitivity is not just a laboratory curiosity. Many vegetables contain PTC-like glucosinolates. When humans are asked to rate bitterness, those foods containing glucosinolates, like mustard greens, turnips, and horseradish, taste more bitter to individuals with the sensitive hT2R38 haplotype, despite the fact that their ratings of other vegetables, such as radicchio or bitter melon, vary in a random way (Sandell & Breslin, 2006). A second example of genetic diversity explains some of the individual differences in perceiving the bitter side taste of the artificial sweetener, saccharin (Pronin et al., 2007). In contrast to PTC sensitivity, there appear to be (at least) two different members of the T2R family responsible for detecting saccharin’s bitterness, but just a single amino acid substitution in either receptor increases sensitivity to the bitter side taste. 
It has become increasingly clear that many of these oral T1R and T2R GPCRs also are expressed and functional in the lower GI tract, including the stomach, small intestine, and/or colon (reviewed in Rozengurt & Sternini, 2007). In the duodenum, one prominent location for T1R3 is within enteroendocrine cells in the duodenal villi (Bezencon, le Coutre, & Damak, 2007; Margolskee et al., 2007). T1R2 has likewise been identified in duodenal enteroendocrine cells, sometimes colocalized with T1R3 (Dyer, Salmon, Zibrik, & Shirazi-Beechey, 2005). The T1R2/T1R3 heterodimer in these enteroendocrine L-cells appears to underlie sugar-stimulated secretion of the gut hormone, glucagon-like peptide 1, and its downstream effects on the regulation of glucose-transporting enzymes in enterocytes; artificial sweeteners have similar effects (Margolskee et al., 2007). Thus, the intake of sweet foods can have profound effects on gastrointestinal physiology, regardless of caloric content. T1R1 is likewise observed in the duodenal villi, where it is hypothesized to detect amino acids (Bezencon et al., 2007). Perhaps more surprisingly, receptors for
toxins, the T2Rs, are also expressed in gut cells, including those in the stomach, small intestine (Wu et al., 2002) and colon (Rozengurt, 2006). Indeed, enteroendocrine cell lines expressing these receptors respond to bitter ligands, demonstrating a functional effect (Wu et al., 2002). That this information reaches the brain seems likely since intragastric administration of certain bitter tastants elicits c-fos immunoreactivity in the region of the nucleus of the solitary tract that receives information from the vagus nerve (e.g., Hao, Sternini, & Raybould, 2008; Yamamoto & Sawa, 2000). In fact, central Fos expression appears dependent on the secretion of two gut hormones, cholecystokinin and peptide YY (Hao et al., 2008). Secretion of these gut hormones, which have multiple effects including a profound depression of food intake, has long been known to be triggered by nutritive substances, but the recent data indicate that bitter tastants have the same effect. This common effect of sweet and bitter substances on the secretion of satiety hormones is in contrast to their opposite effects on intake when they interact with oral taste receptors.
Transduction by Ion Channels: Salty and Sour
A variety of salts can activate the taste system. The cation is the major determinant of stimulus quality but anions play a significant modulatory role. To humans, many salts evoke a characteristic salty sensation, considered one of four or five basic taste qualities. This quality is elicited in the purest form by sodium and lithium salts, especially sodium chloride and lithium chloride. Other salts, like ammonium chloride, potassium chloride, or even sodium salts in combination with other anions such as SO₄²⁻, elicit a salty taste, but one that is mixed to different extents with other tastes, usually bitter and sour (van der Klaauw & Smith, 1995). 
An amiloride-sensitive channel (ENAC) located on the apical ending of specific taste receptor cells was actually the first taste receptor discovered. Its operation is elegantly simple; sodium ions preferentially flow into the cell so that the taste stimulus itself serves as the direct trigger for depolarization (Heck, Mierson, & DeSimone, 1984). This mechanism was the first example of a taste receptor used for other homeostatic functions. Indeed, ENAC was first cloned from the colon and is found in many epithelial tissues, including the nephron (kidney) where it plays a critical role in regulating extracellular fluid volume and blood pressure. In rodents, psychophysical studies using amiloride have established that ENACs are important for detecting sodium salts and critical for the ability to discriminate this cation from others (reviewed in Spector, 2003). However, in some other species, notably humans, ENACs are less important. In humans, amiloride blockade mainly reduces the slight sour taste elicited by sodium chloride (Smith & Ossebaard, 1995). Indeed, even in rodents, neither neural
nor behavioral responses to sodium chloride are entirely eliminated by amiloride, and except for Li⁺, responses to other cations such as NH₄⁺ and K⁺ are hardly affected (reviewed in S. C. Kinnamon & Margolskee, 2008). Thus, it is clear that additional mechanisms are involved in salt transduction. One strongly implicated mechanism is a variant of the TRPV1 ion channel (DeSimone et al., 2001; Lyall et al., 2004), a receptor first discovered in nociceptive afferents. In the pain system, this receptor is primarily responsible for detecting vanilloids like capsaicin, the spicy component of hot chili peppers, and partly responsible for detecting painful heat or protons (Caterina & Julius, 2001). The variant in taste receptor cells, TRPV1t, contributes to transducing both sodium and nonsodium salts, as assessed in TRPV1 knockout animals and pharmacological blockade. However, even in combination, ENAC and TRPV1 cannot entirely account for salt transduction. Acid transduction, leading to the perception of sourness, also involves mechanisms other than GPCRs. Although pH is a critical chemical feature of compounds that elicit sourness, it is significant that the magnitude of acid-evoked responses varies with intracellular, rather than extracellular, pH (Lyall et al., 2006). Organic acids are hypothesized to pass through the membrane in their undissociated state and become dissociated inside cells, whereas protons from (already dissociated) inorganic acids use ion channels to enter. These considerations are believed to account for the fact that, at a given pH, organic acids elicit larger responses and taste more sour than inorganic acids. Although acid exposure causes the intracellular pH of most taste bud cells to drop, only a subset exhibit an associated Ca²⁺ response (T. A. Richter, Caicedo, & Roper, 2003). Thus, specificity is apparently conferred by the presence or absence of cellular machinery that performs this conversion. 
Several ion channels that permit the passage of hydrogen ions or that are modulated by pH are common to taste receptor and other cells, but there is no firm link between these channels and sour transduction (reviewed in S. C. Kinnamon & Margolskee, 2008). In contrast, recent work by three independent groups pinpointed two members of the TRPP ion channel family, PKD1L1 and PKD2L1, that are more specifically expressed in taste bud cells (A. L. Huang et al., 2006; Ishimaru et al., 2006; Lopez-Jimenez et al., 2006). When cells with these proteins were ablated by expressing diphtheria toxin under the control of the PKD2L1 promoter, peripheral nerve responses to acids were abolished (at least for the one region of the mouth tested; A. L. Huang et al., 2006). Thus, it seems clear that the taste bud cells containing PKD1L1/PKD2L1 are critical to sour transduction although the role of the proteins, per se, remains to be established. Interestingly, besides taste bud cells, PKD1L1/PKD2L1 are found in just
one other location: the cells lining the central canal of the spinal cord. These cells send processes into the cerebrospinal fluid and respond to pH changes.
Detection of Fats
Dietary fats are a dense source of energy and highly preferred by most mammalian species, but until recently the basis of how these nutrients are perceived has been obscure. The elucidation of transduction mechanisms for other tastants is also quite recent but there is an important difference in the case of fats. As discussed later, recording studies in peripheral taste fibers dating back to the 1940s demonstrated that compounds associated with sweet, salty, sour, and bitter sensations gave rise to neural responses in primary afferent taste fibers that were clearly different from responses elicited by the somatosensory (i.e., mechanical or thermal) properties of fluid flow alone. In contrast, fats have not been shown to elicit such differential responses in peripheral nerves or central recordings. Nevertheless, recent findings provide compelling evidence that taste bud cells possess specific molecular mechanisms for detecting free fatty acids, the components of dietary fats that drive preference behavior. The initial studies that rekindled interest in the possibility that the taste system detects fats demonstrated that rodent taste bud cells contained delayed-rectifying potassium ion channels that could be blocked by polyunsaturated free fatty acids (Gilbertson, Fontenot, Liu, Zhang, & Monroe, 1997). The strongest evidence to date, however, is for CD36, a fatty acid transporter originally described in the lower GI tract, which has recently been located in taste bud cells (Fukuwatari et al., 1997; Gaillard et al., 2008; Laugerette et al., 2005). When the gene for CD36 is ablated, preference for long-chain fatty acids is nearly obliterated. In addition, in these animals, the oral application of fatty acids fails to induce a typical cephalic phase response (Laugerette et al., 2005). 
Furthermore, when CD36 cells from the taste bud were immunomagnetically isolated and cultured, fatty acid stimulation could evoke calcium ion responses in these cells, but not in cells lacking CD36 (Gaillard et al., 2008). Several groups have demonstrated that section of the oral taste nerves affects fatty acid-driven behavior, including preference and the ability to detect these compounds, as assessed in a conditioned taste aversion paradigm (Pittman, Crawley, Corbin, & Smith, 2007; Stratford, Curtis, & Contreras, 2006).

Taste Bud Morphology and Processing

The molecular machinery for taste transduction resides in modified epithelial cells organized into discrete onion-shaped clusters called taste buds. Taste buds are found in
the lingual, palatal, laryngeal, and pharyngeal epithelium. Oral taste buds are those considered to participate in taste perception, taste-driven appetitive behaviors, and consummatory reflexes. Laryngeal buds, on the other hand, mostly function in airway protection and little is known of the few taste buds populating the pharynx. These more caudal populations will not be considered further in this chapter. Oral taste buds are innervated by different cranial nerves or branches: the VIIth nerve innervates both the anterior tongue and palatal taste buds via the chorda tympani and greater superficial petrosal branches, respectively; the lingual branch of the glossopharyngeal nerve innervates posterior tongue taste buds (reviewed in Lundy & Norgren, 2004b; Pritchard & Norgren, 2004). At the light microscopic level, taste bud cells are similar to one another but a closer look reveals diversity (see J. Kinnamon & Yang, 2008, for a recent review of taste bud ultrastructure). The first hint came from electron microscopic studies demonstrating ultrastructural differences, the most obvious being a striking variability in electron density, suggesting three classes of cells, initially called “Dark,” “Light,” and “Intermediate” and now known as Types, I, II, and III. Because only Type III cells synapse with the primary afferent nerve, they were initially considered to be the primary receptor cells and the other types were relegated to supporting roles. However, this model is undergoing radical revision. Details are still murky, but two important changes in how the bud is viewed are nearly certain: Synapses are probably not required for a taste bud cell to communicate directly with primary afferent fibers and there is considerable communication among taste bud cells themselves. Recent studies have provided strong evidence that ATP is a critical transmitter between the taste bud and afferent nerves. 
P2X receptors for ATP are expressed in primary taste afferents (Bo et al., 1999), ATP is released by taste buds, and double knockout of P2X2/P2X3 proteins obliterates taste responses in both the chorda tympani and glossopharyngeal nerves, as well as behavioral responses to most stimuli (Finger et al., 2005). Because taste bud cells can release ATP nonsynaptically through pannexin 1 hemichannels (Y. J. Huang et al., 2007), the conundrum raised by earlier findings demonstrating that T1R/T2R receptors are only expressed by Type II cells, that is, those cells without synapses, appears partly resolved. Rather than playing a supporting role, Type II cells are clearly receptor cells that probably communicate directly, via nonsynaptic ATP release, with primary afferent nerves. However, there are further complications. Type II cells are not the only receptor cells because other types contain receptors for ionic stimuli (Kataoka et al., 2008; Vandenbeuch, Clapp, & Kinnamon, 2008). In addition, Type III cells likely use a different transmitter, perhaps
serotonin, to modulate activity in primary afferent nerves (Y. J. Huang et al., 2007; Kaya, Shen, Lu, Zhao, & Herness, 2004). Furthermore, several other neurotransmitters and modulators, including NPY (F. L. Zhao et al., 2005), CCK (Herness, Zhao, Lu, Kaya, & Shen, 2002), serotonin (Kaya et al., 2004), and ATP (Y. J. Huang et al., 2007) have been implicated in within-bud communication. The families of receptors for sweet or umami (T1Rs) and bitter (T2Rs) tastants are both expressed in Type II cells, but it is important to realize that they occur in different cells, as revealed by double in situ hybridization and immunohistochemistry (Nelson et al., 2001; G. Q. Zhao et al., 2003). Likewise, as shown in Figure 13.1, PKD1L1/PKD2L1 proteins, indicative of sour-sensing cells, are expressed in yet a different group, probably Type III cells (Kataoka et al., 2008). In fact, even the different T1Rs underlying sweet and umami taste (T1R1 and T1R2) are at least partially segregated (Nelson et al., 2001). In contrast, despite evidence for some independence (Behrens, Foerster, Staehler, Raguse, & Meyerhof, 2007), many T2Rs detecting varied bitter stimuli are co-expressed (Adler et al., 2000). These patterns of quality-specific expression may seem intuitively satisfying but they were initially somewhat surprising. Receptor segregation implies response specificity, but much previous physiological work had implied the opposite (e.g., Gilbertson, Boughter, Zhang, & Smith, 2001; Herness, 2000). This apparent conflict between the molecular and physiological data rekindled a decades-old argument about whether taste quality is coded by specifically tuned neural elements, that is, a labeled line, or by an ensemble code. The mechanisms downstream of transduction provided a further opportunity to probe the coding question. Even though distinct receptors detect sweet, umami, and bitter stimuli, these GPCRs share a common (Y. Zhang et al., 2003), though perhaps
[Figure 13.1 image; panel labels: PKD2L1 and T1R3 (left), PKD2L1 and T2Rs (right)]

Figure 13.1 (Figure C.16 in color section) Double in situ hybridization. One probe was for putative sour-detecting cells expressing PKD2L1; the second probe stained receptors for other qualities. Note: (Left) Co-localization of PKD2L1 with T1R3, a common component of the receptor for sweet and umami compounds. (Right) Co-localization with a mixture of 20 T2Rs, receptors for bitter tastants. Note that PKD2L1 never co-localizes with either of the other receptors. From figure 1 in “The Cells and Logic for Mammalian Sour Taste Detection,” by A. L. Huang et al., 2006, Nature, 442, pp. 934–938. Adapted with permission.
not sole (Damak et al., 2003) transduction cascade utilizing PLCβ2 and the TRP channel, TRPm5. When PLCβ2 or TRPm5 were genetically deleted, neural and behavioral responses to sweet, bitter, and amino acid stimuli were profoundly diminished, but responses to salty and sour tastants remained (Y. Zhang et al., 2003). Significantly, when PLCβ2 expression was rescued under the control of a single T2R promoter, responsiveness to a broad array of bitter tastants, but not responsiveness to sweet and umami stimuli, was restored, suggesting functional segregation of sweet versus bitter and umami pathways (Mueller et al., 2005). These findings are somewhat at odds with the broader tuning detected using physiological techniques. However, the discrepancy should not be exaggerated, since physiological data indicate that most nonspecific taste neurons are broadly tuned to electrolytes; in particular, few respond to both bitter and sweet stimuli (reviewed in Spector & Travers, 2005). Nevertheless, experiments that deleted taste bud cells expressing the PKD2L1 molecule told a similar story; in this case, chorda tympani responses to sour stimuli were obliterated, but responses to other tastants, including sodium chloride, remained (A. L. Huang et al., 2006). This is more surprising since, beyond the taste bud, broadly tuned acid/sodium chloride neurons are commonly observed. Even this, however, would be partly explicable by convergence of receptor cells onto primary afferents or by virtue of interaction within the bud. As depicted in Figure 13.2, recent studies suggest that some of this interaction may entail convergence of quality-specific information from Type II cells onto more broadly tuned Type III cells (Tomchik, Berg, Kim, Chaudhari, & Roper, 2007). One neurotransmitter mediating communications between Type II and Type III cells is ATP, the same transmitter responsible for communication between Type II cells and the primary afferent nerves (A. L. Huang et al., 2006). 
Thus, it seems likely that Type II cells communicate both directly and indirectly with primary afferents (Tomchik et al., 2007). Indeed, the complex actions of the many neurotransmitters and modulators in the taste bud suggest that we have only begun to glimpse the tip of the iceberg with regard to how taste bud processing modifies signals (e.g., Roper, 2006; F. L. Zhao et al., 2005). Finally, there are hormonal influences at the level of the taste bud. Leptin, a polypeptide secreted by fat cells that influences hypothalamic mechanisms to suppress appetite, also directly influences taste receptor cells; in particular, it inhibits responses to sweet stimuli, tastants that potently promote feeding behavior (Shigemura et al., 2004). Indeed, the lack of the leptin receptor in the db/db mouse appears to account for the fact that this strain exhibits increased peripheral nerve responses to sweet, but not other, stimuli (Ninomiya, Sako, & Imai, 1995).
[Figure 13.2 schematic; labels: Sweet, Bitter, Salty (?), Sour, Umami; Type I cell; Type II cells (T1R1, T1R2, T2R); Type III cell (PKD2L1); ATP; primary afferents (VII and IX)]
Figure 13.2 Schematic diagram of taste bud processing. Note: Separate sets of Type II cells in the taste bud contain GPCRs for substances perceived as sweet, umami, or bitter. Cells expressing PKD2L1 are important for sour detection and have the distinct ultrastructural characteristics of Type III cells. Type II cells secrete ATP in response to stimulation with sweet, bitter, and umami tastants. Both primary afferent neurons and Type III cells are capable of responding to the ATP signal. Type III cells are the only taste bud cells that make classic synapses with primary afferent nerves; the transmitter is unknown. Type I cells resemble glia and have long been thought to play a purely supporting role, but recent data suggest that they may respond to salts. Based on figure 7 in Tomchik et al. (2007) and additional data from Y. J. Huang et al. (2007), Finger et al. (2005), and Vandenbeuch et al. (2008).
Primary Afferent Responsiveness and Function Information from taste buds is transmitted to primary afferent gustatory fibers. Individual fungiform papillae are separated from each other and contain a limited number of taste buds, making it possible to gain insight into the degree of convergence onto primary afferents. In fact, rodents have just a single taste bud per fungiform papilla, providing an opportunity to derive rather precise estimates for chorda tympani fibers. In the mouse, a recent study elegantly demonstrated that a single afferent chorda tympani fiber usually receives input from just one taste bud (Zaidi & Whitehead, 2006). In other species, there is apparently more but still relatively limited convergence (e.g., Nagai, Mistretta, & Bradley, 1988). It would be interesting to know if this pattern generalizes to posterior tongue taste
buds, but this information is not available because taste buds are so densely packed in the circumvallate and foliate papillae. In addition, because individual taste buds express varied receptors it is difficult to extrapolate the amount of convergence from a given receptor type to a primary afferent as has been possible for convergence of olfactory receptors onto neurons in the bulb (see discussion that follows). However, the relatively small receptive fields for single anterior tongue fibers, along with the narrow response profiles of many single fibers in the chorda tympani, greater superficial petrosal, and glossopharyngeal nerves, implies considerable specificity. The popular notion of a lingual taste map, with various regions devoted to one or another quality is obviously in error. Indeed, in humans, it has been demonstrated that stimulation of a single fungiform papilla can give rise to enough sensory information to elicit recognition of multiple qualities (Bealer & Smith, 1975). Nevertheless, there are regional differences in relative responsiveness to different taste qualities. One common pattern is for the posterior tongue to exhibit a greater degree of responsiveness to bitter substances, although this may not be true in primates (see Spector & Travers, 2005, table 1). There are additional sensitivity differences as well, but these are more idiosyncratic across species. Initial recordings from single taste fibers were performed in cats (Pfaffmann, 1941) and revealed a lack of absolute specificity for a given quality. This basic result has endured scores of replications but has been refined in important ways. Even using moderate concentrations, there are peripheral fibers that respond nearly equally to stimuli evoking entirely distinct qualities. Other fiber types, however, are quite narrowly tuned. In fact, even broad fibers are far from a hodgepodge of sensitivities. 
One cogent scheme considers narrowly tuned fibers to be “specialists” and those more broadly tuned as “generalists” (e.g., Breza, Curtis, & Contreras, 2007; Frank, 1973; Lundy & Contreras, 1999). There are multiple reports of specialist fibers that respond several-fold more robustly to sweeteners, bitters, or sodium salts than to contrasting qualities. Fibers strongly responsive to acids, on the other hand, usually are generalists that respond to other electrolytes, including sodium salts, nonsodium salts, and ionic bitters like quinine hydrochloride. In fact, the electrolyte generalist category is probably divisible into a number of subtypes (Breza et al., 2007; Lundy & Contreras, 1999; Sollars & Hill, 2005). It is important to recognize that an apparent lack of narrowly tuned fibers for a particular quality in a given nerve and species can be misleading. For example, in rat chorda tympani fibers or their cell bodies in the geniculate ganglion, responses to the bitter stimulus, quinine, are observed mostly in neurons that also respond to other electrolytes
(Breza et al., 2007; Frank, Contreras, & Hettinger, 1983; Lundy & Contreras, 1999). However, this is not true in the glossopharyngeal nerve, which contains a large population of neurons highly and specifically responsive to bitter tastants (Frank, 1991). Likewise, in the rat, narrowly tuned fibers that respond preferentially to sweeteners like sucrose comprise just a small group of neurons in the chorda tympani nerve but are prominent among geniculate ganglion neurons innervating the palate (Sollars & Hill, 2005) and in the glossopharyngeal nerve (Frank, 1991). Indeed, they are plentiful in the chorda tympani of most other species (see Spector & Travers, 2005, table 1). Figure 13.3 shows representative recordings from specialist and generalist fibers, taken from geniculate ganglion neurons innervating the anterior tongue of the rat. How amino and fatty acids fit into the picture is not clear. Receptor mechanisms for these compounds have been defined, but these stimuli have been used infrequently when assessing peripheral nerve (and central) responses. For amino acids (reviewed in Spector & Travers, 2005), most information is available for glutamate, usually its sodium salt, monosodium glutamate (MSG). Many electrolyte-sensitive neurons respond to MSG, including sodium-specialists and electrolyte-generalists, apparently due to the sodium cation, but this stimulus also drives neurons robustly responsive to sweeteners (see Figure 13.3). Even less is known about other amino acids, although those that taste sweet to humans, for example, the D-isomers and certain L-amino acids, drive the sweet-specialist fibers. Thus, in contrast to the segregation of T1R2 and T1R3 receptors in taste bud cells, there is not much evidence for a differential response to umami stimuli in peripheral fibers, though there is one report of single fibers in the mouse glossopharyngeal nerve that respond more specifically to MSG (Ninomiya, Sako, & Funakoshi, 1989). 
There is even less information on fatty acid responses. A recent study of rat chorda tympani fibers used linoleic acid (a fatty acid present in corn oil). Not one of the 52 fibers assessed responded to this stimulus (Breza et al., 2007), despite the fact that transection of this nerve blunts behavioral recognition of linoleic acid (Stratford et al., 2006). Because the CD36 receptor is expressed more heavily in the posterior tongue, it is possible that responses would be more evident in the glossopharyngeal nerve. Indeed, a brief report of whole nerve recordings from the pharyngeal branch of the glossopharyngeal nerve demonstrated responses to oleic acid but not to mineral oil, a substance with a similar texture. Further, it was interesting that the oleic acid responses were blunted by intravenous injection of a satiety-producing agent, leptin (Kitagawa et al., 2007). Additional work using both amino and fatty acids is clearly warranted to determine if these classes of chemicals evoke clear taste signals.
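As a rough illustration of the specialist/generalist scheme discussed earlier in this section, the following Python sketch labels a fiber from its responses to a few prototype stimuli. The function name `classify_fiber`, the threefold cutoff, and the response values are hypothetical and chosen only for illustration; they are not criteria or data from the studies cited, which use their own classification procedures.

```python
from typing import Dict


def classify_fiber(responses: Dict[str, float], fold_threshold: float = 3.0) -> str:
    """Label a fiber from its net responses to prototype stimuli.

    responses: stimulus name -> net response (e.g., impulses above baseline).
    fold_threshold: hypothetical cutoff; a fiber counts as a "specialist"
    when its best response exceeds the second best by at least this factor.
    """
    if len(responses) < 2:
        raise ValueError("need responses to at least two stimuli")

    # Rank stimuli from strongest to weakest response.
    ranked = sorted(responses.items(), key=lambda item: item[1], reverse=True)
    best_name, best_rate = ranked[0]
    second_rate = ranked[1][1]

    if best_rate <= 0:
        return "unresponsive"
    # A fiber driven by only one stimulus is treated as a specialist.
    if second_rate <= 0 or best_rate / second_rate >= fold_threshold:
        return f"{best_name}-specialist"
    return f"{best_name}-generalist"


# Made-up response profiles, for illustration only.
narrow = {"sucrose": 40.0, "NaCl": 4.0, "citric acid": 2.0, "QHCl": 1.0}
broad = {"sucrose": 2.0, "NaCl": 20.0, "citric acid": 30.0, "QHCl": 15.0}

print(classify_fiber(narrow))  # sucrose-specialist
print(classify_fiber(broad))   # citric acid-generalist
```

A single ratio cutoff is a deliberate simplification. It ignores stimulus concentration, which, as the following discussion notes, strongly affects how broadly a fiber appears to be tuned.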
[Figure 13.3 recording traces; panel labels: Sucrose-specialist, NaCl-specialist, NaCl-generalist I, NaCl-generalist II, Acid-generalist; stimuli: 0.5 M sucrose, 0.1 M NaCl, 0.01 M citric acid, 0.02 M QHCl, 0.1 M MSG; time scale bar: 15 s]

Figure 13.3 Representative single-unit recordings from two types of specialist and three types of generalist neurons recorded from rat geniculate ganglion neurons that send their peripheral processes into the chorda tympani nerve to supply the anterior tongue. Note: The sodium chloride-specialist and electrolyte generalist neurons were the most common types, typical of rat chorda tympani recordings. The sucrose-specialist neuron was one of just a few cells of this type, but such neurons are plentiful in afferent fibers innervating the palate or posterior tongue in this species. Note that monosodium glutamate (MSG) drove both the specialist neurons. It is likely that the sodium moiety was responsible for activating the sodium chloride-specialist, and glutamate for activating the sucrose-specialist. Thus, similar to the few other available reports, there is little evidence that umami stimuli elicit differential responses upstream, despite a notable degree of segregation of T1R1 and T1R2 receptors in taste bud cells. Also note the absence of quinine hydrochloride/bitter-specialists. Although such neurons are rare or absent in the chorda tympani, they are plentiful in the glossopharyngeal nerve. From figure 1 in “Monosodium Glutamate but Not Linoleic Acid Differentially Activates Gustatory Neurons in the Rat Geniculate Ganglion,” by J. M. Breza et al., 2007, Chemical Senses, 32, pp. 833–846. Reprinted with permission.

The claim that some peripheral fibers are specialists, specifically tuned for a given quality, must be tempered by the critical issue of stimulus concentration. Parametric data on concentration are sparse, but available information indicates that higher concentrations generally give rise to broader tuning. At the same time, it is clear that the response-concentration functions in a given cell have stimulus-specific slopes, which are often shallower for stimuli eliciting nonoptimal responses. In fact, in some cells, certain stimuli never elicit an above-threshold response, regardless of concentration (e.g., Hanamori, Miller, & Smith, 1988). Neurons with both types of response-concentration functions can be seen in Figure 13.4. The broader tuning with higher concentrations would seem to make an ensemble code a necessity for disambiguating concentration and quality. However, the precise nature of that code, for example, the critical size of the ensemble, is unclear. The flip side of the concentration issue is that lower concentrations activate cells more selectively. Under these conditions, the ensemble may consist of only a single, selectively responsive cell type more akin to a sparse code
or labeled-line. Critical tests of these hypotheses require a larger amount of coordinated behavioral and neural data. In addition to the relative differences in chemosensitivity, other functional distinctions characterize taste nerves. This has been studied most thoroughly in rat. In light of the fact that the glossopharyngeal nerve innervates several times as many taste buds as the chorda tympani, it is surprising that loss of glossopharyngeal input generally has little effect on tasks that measure taste quality discrimination, whereas chorda tympani section, and especially combined section of the chorda tympani and greater superficial petrosal nerve, has a profound effect (reviewed in Spector, 2003). In these experiments, an animal is required to make a correct choice between different qualities to receive a reward or avoid a punishment; that is, taste quality is a discriminative stimulus. That this difference between the VIIth and IXth nerves represents a functional distinction with regard to task type rather than a stimulus-specific effect was particularly evident when rats were challenged to distinguish between quinine and potassium chloride (St. John & Spector, 1998). In the rat, the chorda tympani nerve
is weakly responsive to these tastants, particularly quinine, and more important, most responses to these stimuli co-occur in neurons broadly responsive to electrolytes (e.g., Frank et al., 1983). In contrast, the glossopharyngeal nerve contains separate fiber groups robustly and selectively responsive to potassium chloride and other electrolytes versus those responsive to quinine and other bitter stimuli (Frank, 1991). Even so, chorda tympani section compromises potassium chloride/quinine discrimination more severely than section of the glossopharyngeal nerve (St. John & Spector, 1998). On the other hand, the glossopharyngeal nerve provides a much more effective afferent signal for triggering “gaping,” an oromotor rejection reflex preferentially elicited by bitter-tasting stimuli (e.g., J. B. Travers, Grill, & Norgren, 1987). When voluntary licking, driven by the hedonic nature of the stimulus, is measured, effects are more varied and dependent on the sensitivity of a given primary afferent nerve. Even more important, as Spector (2003) has proposed, there appears to be synergy between different nerves in contributing to these tasks. The relative differences between primary gustatory nerves in subserving reflexive, hedonic, and discriminative behaviors bear some resemblance to the distinctive roles of olfactory axons innervating the main olfactory epithelium and vomeronasal organ; however, both the functional distinctions and sensitivity differences are more pronounced in the olfactory domain.

[Figure 13.4 plots; two panels: Fiber 66 (QHCl-best) and Fiber 68 (QHCl-best); y-axis: impulses/10 s (0 to 500); x-axis: stimulus concentration (M), 0.0003 to 0.3; curves labeled H, N, Q, S]

Figure 13.4 Concentration-response functions for two neurons in the glossopharyngeal nerve of the hamster. Note: These neurons were both classified as “QHCl (quinine)-best” because the response to quinine, summed across concentrations, was the largest. These two fibers, however, had very different response-concentration functions; the fiber on the left maintained specificity over a range of concentrations; that on the right responded nearly as well to hydrogen chloride at higher concentrations. Nevertheless, the cell remained poorly responsive to sucrose and sodium chloride over the entire intensity range. H = hydrogen chloride; N = sodium chloride; Q = quinine; S = sucrose. From figure 5 in “Gustatory Responsiveness of Fibers in the Hamster Glossopharyngeal Nerve,” by T. Hanamori et al., 1988, Journal of Neurophysiology, 60, p. 485. Adapted with permission.

Central Pathways and Hierarchy of Function

Cranial nerves carrying gustatory information make their first synapse in the major visceral sensory nucleus of the
medulla, the nucleus of the solitary tract (NST), with the special visceral afferents terminating rostral to their general visceral counterparts that convey signals from the gastrointestinal tract, respiratory, and cardiovascular systems (reviewed in Lundy & Norgren, 2004b; Pritchard & Norgren, 2004). One class of efferent projections from NST terminate locally, in the subjacent parvocellular reticular formation including regions close to preganglionic parasympathetic neurons, and more lightly in oromotor nuclei and the caudal NST (Beckman & Whitehead, 1991; Halsell, Travers, & Travers, 1996; J. B. Travers, 1988). These local pathways underlie cephalic phase and somatic reactions to taste stimuli (Figure 13.5). Ascending pathways from NST are somewhat different in rodent and primate. In both species, information ultimately reaches gustatory cortex and limbic structures, but the routes diverge across species. In rodents, the gustatory NST projects densely to the parabrachial nucleus (PBN) of the pons (Norgren & Leonard, 1971, 1973), and PBN neurons give rise to a thalamocortical pathway, including the most medial, parvicellular portion of the ventroposteromedial thalamic nucleus (VPMpc), which in turn projects to the primary gustatory cortex in the insula (Kosar, Grill, & Norgren, 1986a, 1986b). PBN projections along a second trajectory reach various nuclei in the ventral forebrain, most prominently the central nucleus of the amygdala and bed nucleus of the stria terminalis as well as a contiguous set of structures extending from the lateral hypothalamus caudally to the ventral pallidum rostrally (Bernard, Alden, & Besson, 1993; Halsell & Frank, 1992; Norgren, 1974; Norgren & Leonard, 1973). These two pathways are commonly thought to contribute to sensory/discriminative versus motivational functions (e.g., see
Spector & Travers, 2005), but this is an oversimplification because many complex behaviors require both cortical and limbic structures. In primate, ascending gustatory efferents appear to bypass PBN and ascend directly to VPMpc (Beckstead, Morse, & Norgren, 1980), then reach the insular/opercular cortex (Pritchard, Hamilton, Morse, & Norgren, 1986). Projections from the insula reach many of the same limbic structures, but the direct brain stem-limbic taste pathway appears missing in the primate (T. Pritchard & Norgren, 2004). This latter conclusion, however, is not ironclad. Although it is clear that there is a direct NST-VPMpc pathway in the primate not present in the rodent, the data for the lack of a PBN gustatory relay in the primate are less conclusive, as they are based only on studying efferents from the rostral pole/anterior tongue region of NST. Thus, it remains possible that posterior mouth taste information does reach the PBN, as is certainly the case for efferents from the general visceral NST (Beckstead et al., 1980). In fact, human imaging data hint at a PBN taste relay (Topolovec, Gati, Menon, Shoemaker, & Cechetto, 2004). If so, a direct taste pathway from the brain stem to the limbic system is a possibility since the primate PBN has ventral forebrain projections similar to the rodent (Pritchard, Hamilton, & Norgren, 2000).

[Figure 13.5 diagram; column headings: Sensory (taste), Interneurons (reticular formation), Motor (jaws and tongue), Muscle (ingestion and rejection); labels include rNST, QHCl, Glu, nNMDA, GABA, Glycine, mV, mXII, AD, STY, GEN, Lick, Gape]

Figure 13.5 Connectional model of a neural substrate for switching between oromotor responses of ingestion and rejection produced by palatable and unpalatable gustatory stimuli. Note: Projections from rostral (gustatory) nucleus of the solitary tract (rNST) synapse on populations of preoromotor interneurons in the subjacent reticular formation. Excitatory projections from the rNST produce the ingestive sequence of licking, characterized by an alternating sequence of tongue protrusion and retraction, with tongue protrusion coincident with jaw opening. Rejection responses to a bitter stimulus, quinine mono-hydrochloride (QHCl), are induced by disinhibition of the network that recruits additional preoromotor interneurons or increases their response frequency to produce larger amplitude EMG responses. Phase switching during the rejection response is apparent as a three-phase sequence of initial tongue retraction followed by tongue protrusion coincident with jaw-opening and followed by a second tongue retraction. Dynamic modeling suggests that this phase switching can be accomplished by differences in the decay kinetics of inhibitory synapses between interneurons (see Venugopal et al., 2007, for details). AD = anterior digastric (jaw-opener); GABA = gamma-aminobutyric acid; GEN = genioglossus muscle (tongue protrudor); Glu = glutamate; nNMDA = non-N-methyl-D-aspartate; STY = styloglossus muscle (tongue retractor).
Classic experiments done 20 years ago revealed key features of gustatory functional anatomy. When rats are stimulated with small amounts of tastants delivered through intraoral cannulae, they exhibit one of two stereotyped sets of responses indicating the acceptability of the fluid: sweet, salty, umami, and sour stimuli mainly elicit lick-like movements and swallowing; bitter-tasting stimuli evoke a dramatically different response consisting of a very wide mouth opening with a unique tongue/jaw coordination (the “gape”) that leads to fluid rejection (Grill & Norgren, 1978b). Similar oral reactions occur in humans, including newborn infants (Steiner, 1973), and nonhuman primates (Steiner & Glaser, 1984). These reactions of rats to experimenter-delivered taste stimuli resemble those observed during voluntary intake, but experimenter-controlled stimulation made it possible to test these consummatory reactions after a key experimental manipulation under which appetitive behavior does not occur: decerebration. Using these techniques, Grill and Norgren (1978c) demonstrated that these responses are preserved in decerebrate rats. Others have shown that similar behaviors persist in anencephalic human infants (Steiner, 1973) and decerebrate cats (Miller & Sherrington, 1916). These findings indicate that the gustatory brain stem is capable of making
the basic distinction between acceptable and toxic fluids. Furthermore, the animal retains the regulatory capacity to respond appropriately to changes in stimulus concentration and to adjust behavior in response to certain postingestive signals. Thus, the animal exhibits longer licking bouts with increases in sucrose concentration, more gaping with higher quinine concentrations (Grill & Norgren, 1978b), and less licking to sucrose when testing is preceded by gastric fill (Grill & Norgren, 1978a) or cholecystokinin injection (Grill & Smith, 1988). However, other aspects of regulatory control are absent. In intact animals, food or sodium deprivation augments licking to sucrose and sodium, respectively, but these changes are lost in decerebrates (Grill, Schulkin, & Flynn, 1986; Seeley, Grill, & Kaplan, 1994). Likewise, when an animal is given a conditioned taste aversion (CTA) by inducing illness after it consumes a novel, preferred stimulus, oromotor reactions switch from licking to gaping. This type of plasticity is also demolished by decerebration (Grill & Norgren, 1978a).
277
(A)
IX
CT
GSP (B)
(C)
Brain Stem Topography and Circuitry There is a distinct topography of primary afferent fibers in the first-order gustatory relay, though it is much less dramatic than the precise topography established by olfactory bulb afferents discussed later. The chorda tympani, greater superficial petrosal, and glossopharyngeal nerves synapse in an orderly sequence roughly from rostrolateral to caudomedial (reviewed in Lundy & Norgren, 2004b; T. Pritchard & Norgren, 2004). Despite this orderly sequence, there is significant overlap between primary afferents. Figure 13.6 shows data from a recent triple-label tracing study that illustrates this overlapping topography (May & Hill, 2006). The overlap between nerves is particularly pronounced between the greater superficial petrosal and the two other nerves (Hamilton & Norgren, 1984; May & Hill, 2006), presumably forming the basis of the convergence between apposing regions of the tongue and palate observed neurophysiologically (Ogawa, Hayama, & Ito, 1982; S. P. Travers, Pfaffmann, & Norgren, 1986). Although less striking, there is also overlap between chorda tympani and glossopharyngeal endings, but neurophysiological evidence for convergence is apparent mostly with intracellular (Grabauskas & Bradley, 1996), not extracelluar recordings (S. P. Travers & Norgren, 1995), suggesting weaker connections. In vivo neurophysiological studies of the gustatory NST that map receptive fields find an orderly orotopic representation, with significant segregation of the anterior and posterior oral cavity (Dickman & Smith, 1989; S. P. Travers & Norgren, 1995). The topography apparent in mammals is emphasized greatly in certain
Figure 13.6 (Figure C.17 in color section) Fluorescent photomicrographs showing triple-labeled terminal fields from the chorda tympani (CT, red), greater superficial petrosal (GSP, green) and glossopharyngeal (IX, blue) nerves in the nucleus of the solitary tract (NST) in the horizontal plane. Note: The approximate border of the nucleus is outlined in white. Sections are arranged from dorsal (A) to ventral (D). Because of the orientation of the nucleus, dorsal sections are also more caudal; ventral sections more rostral. The IXth nerve projection extends most caudally and medially (A); the chorda tympani most rostrally and laterally. Despite this orotopic organization, there is obvious overlap, particularly between the CT and GSP (yellow region in C) and between the GSP and IX (blue-green region in B). From figure 3 in “Gustatory Terminal Field Organization and Developmental Plasticity in the Nucleus of the Solitary Tract Revealed through Triple-Fluorescence Labeling,” by O. L. May and D. L. Hill, 2006, Journal of Comparative Neurology, 497, p. 661. Adapted with permission.
fish that possess taste buds on the exterior body innervated by the VIIth nerve, and in the mouth by the Xth cranial nerve. This topography is visible at a gross anatomical level where the separate representations are associated with definable lobes protruding from the medullary surface (reviewed in Whitehead & Finger, 2008). Although gustatory orotopy remains evident at the cortical level (Benjamin & Pfaffmann, 1953; Hanamori, Kunitake, Kato, & Kannan, 1997; Yamamoto, Matsuo, & Kawamura, 1980),
it is clearest at the first-order relay. Even at the second synaptic relay in rodents, PBN neurons exhibit greater convergence from the anterior and posterior oral cavity (Halsell & Travers, 1997) and descriptions of orotopic organization exhibit more variability between studies, suggestive of a greater overlap and complexity (discussed in Lundy & Norgren, 2004b). Although orotopy is obvious, an ordered organization of what is arguably the most functionally salient feature of the system, taste quality, is more debatable. In contrast to the well-documented quality-specific spatial patterns in the olfactory bulb, with a few exceptions (e.g., Geran & Travers, 2006; Halpern, 1965; T. R. Scott, Yaxley, Sienkiewicz, & Rolls, 1986b), neurophysiological studies have not reported a chemotopic organization in NST. To some extent, this is excusable on the basis of technical difficulties. Gustatory neurons are small, relatively difficult to isolate, and along two of three anatomical axes, distributed over just a couple hundred microns, making it difficult to build up topographical maps. To a limited extent, these difficulties have been overcome by using Fos immunohistochemistry, which succeeded in uncovering a rough chemotopy for bitter tastants that activated cells restricted more medially than those activated by sweet (Harrer & Travers, 1996) and sour (S. P. Travers, 2002) stimuli. Fos immunohistochemistry has also revealed evidence for chemotopy in the PBN. In this higher order relay, there was a dichotomy between stimuli preferred (sucrose and sodium chloride) and avoided (acid and quinine) in appetitive tasks (Yamamoto, Shimura, Sakai, & Ozaki, 1994). As in other systems (Hunt, Pini, & Evan, 1987), however, the cells that express Fos in the gustatory system seem to represent just a subpopulation of all activated cells. In fact in NST, the prototypic and neurophysiologically potent stimulus, sodium chloride, does not elicit any Fos at all (S. P. Travers, 2002). 
Thus, though providing a strong hint of a systematic underlying organization, Fos staining cannot reveal the full story on a putative brain-stem gustatory chemotopy. As discussed later, recent intrinsic imaging in the gustatory cortex provides further evidence that an overlapping chemotopy indeed characterizes the organization of the system (Accolla, Bathellier, Petersen, & Carleton, 2007). Both the rostral NST and gustatory PBN have differentiated morphologies indicative of subnuclei/subdivisions, identifiable on the basis of fiber staining and the different concentrations of varied cell types (Fulwiler & Saper, 1984; Whitehead, 1988). The circuitry has been best studied in NST. Glutamate, acting through non-NMDA and NMDA receptors, is a critical neurotransmitter for conveying signals from primary afferent fibers to NST (C. S. Li & Smith, 1997; Wang & Bradley, 1995). These primary afferents synapse most densely in a dorsal/central
region, the rostral central subnucleus (Whitehead & Frank, 1983); the same subnucleus from which the largest proportion of ascending projections arise. PBN projection neurons have morphologies described as “elongate” and “stellate” (J. B. Travers, 1988; Whitehead, 1990). The prominent dendritic orientation of elongate cells in the mediolateral plane is parallel to the trajectory of incoming fibers; stellate cells have dendrites radiating in all directions. Local medullary projections, on the other hand, arise most prominently from the lateral and especially the ventral subnucleus, which contains a higher concentration of larger stellate neurons. These relationships suggest the possibility of a more direct pathway from primary afferents to the ascending pathway. Indeed, electron microscopic reconstructions of identified PBN projection cells show that synapses resembling those from primary afferents terminate on dendritic shafts or spines of these neurons (Whitehead, 1986). However, parallel studies are not available for local medullary projection cells and this conjecture is further complicated by the extensive dendritic arborizations of NST neurons (Davis & Jang, 1988; King & Bradley, 1994; Whitehead, 1988). There is also a prominent class of small ovoid neurons throughout the various subnuclei; these are considered to be inhibitory interneurons and some contain GABA (Davis & Jang, 1988; Lasiter & Kachele, 1988; Whitehead, 1990). EM reconstructions show that symmetric, putative inhibitory synapses terminate on both dendritic and somal compartments (Whitehead, 1986, 1993), in distinction to the preferential dendritic terminations of primary afferents (May, Erisir, & Hill, 2007; Whitehead, 1986).
Response Properties
Chemosensitive profiles of NST and PBN neurons resemble, to a first approximation, those observed in primary afferent fibers; that is, there are a limited number of types definable on the basis of an optimal or “best” stimulus quality. 
However, due at least in part to convergence, on average, brain-stem neurons are more broadly tuned than their peripheral counterparts (reviewed in Spector & Travers, 2005). A striking example was reported in the rat, when it was observed that single NST neurons that received convergent input from the anterior tongue and anterior palate often responded to sodium chloride applied to the tongue, but sucrose applied to the palate (S. P. Travers et al., 1986). A series of studies systematically compared tuning profiles using the same species (hamster), stimuli, and method of stimulation and found that the breadth of responsiveness of certain cell types increased at each successive relay (Smith, Van Buskirk, Travers, & Bieber, 1983). Because of this broadening, many have concluded that a sparse code or labeled line is not a viable option for coding taste
intravenous injections of glucose, glucagon, or insulin preferentially decrease multiunit responses elicited by sweet stimuli (Giza, Deems, Vanderweele, & Scott, 1993; Giza & Scott, 1983, 1987). Gastric distention likewise suppresses NST taste responses (Glenn & Erickson, 1976). Another satiety-mimicking treatment, intraduodenal lipid, diminished PBN gustatory responses, with particularly potent effects for sucrose responses in neurons narrowly tuned to sweet stimuli (Hajnal, Takenouchi, & Norgren, 1999). Gastric distension also influences PBN neurons, though, as shown in Figure 13.7, gastric-induced taste response enhancements as well as suppressions were observed (Baird, Travers, & Travers, 2001). These changes are assumed to
[Figure 13.7, panels A and B. Axes: spikes/500 msec versus time (0 to 60 sec), with taste on, taste off, and rinse on marked. Gastric state over time: deflated, inflating, fully distended, deflating, deflated. Insets: taste response in net spikes/10 sec before, during, and after distension.]
quality in the CNS (see Simon, de Araujo, Gutierrez, & Nicolelis, 2006, for a contemporary exposition of this opinion). Moreover, work in NST has noted considerable variability in responses over multiple trials (Di Lorenzo & Victor, 2003). Interestingly, this analysis also showed that, for some cells, the temporal characteristics of the spike train were more reliable, suggesting that spike timing could make a contribution to quality coding. Despite the problem of response variability and the average increase in breadth of responsiveness, some brain stem neurons remain narrowly tuned (reviewed in Spector & Travers, 2005). For example, in a series of studies using an awake, behaving preparation, Norgren’s group explicitly subdivided best-stimulus types of NST (Nakamura & Norgren, 1991) and PBN neurons (Nishijo & Norgren, 1990), demonstrating that there were specifically and broadly tuned neurons within a given type. Other work makes it clear that it is important to consider receptive field when making generalizations about tuning curves. For example, one investigation that recorded from the NST region most responsive to anterior mouth stimulation concluded that responses to bitter-tasting stimuli occurred mainly within electrolyte-generalist cells, and that salty, sour, and bitter stimuli were nearly equipotent in driving this class of neurons (Lemon & Smith, 2005). A second study that sampled the entire NST concluded instead that there was a class of neurons optimally and narrowly tuned to bitter tastants; however, all of these neurons had posterior tongue-receptive fields (Geran & Travers, 2006). The significance of these varied tuning curves is unknown but it is possible that they relate to the varied functions of the taste system. At a minimum, taste has discriminative, motivational, and reflex functions, and it is possible that these operations function to some extent in parallel (Spector & Travers, 2005). 
In fact, not only are NST projections to the medullary reticular formation and PBN concentrated in different subnuclei, double-labeling studies reveal that very few neurons project to both locations (Halsell et al., 1996). Likewise in PBN, there is some evidence for segregation between thalamic and ventral forebrain outputs (Voshart & van der Kooy, 1981). These anatomical studies support the notion of parallel processing in the taste system, although a coherent story regarding the relationship between response properties and function has yet to emerge (discussed in Spector & Travers, 2005).
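The ideas of a "best" stimulus and of breadth of tuning discussed in this section can be made concrete with a short sketch. The entropy-style index below is one way breadth across the four standard stimuli has been summarized in the taste literature; it is offered as an illustration only and is not claimed to be the analysis used in any study cited here. The function names, the handling of negative (suppressed) responses, and the example firing rates are all assumptions made for the sketch.

```python
import math

# Four standard taste stimuli, in the order used for the response lists below.
STIMULI = ("sucrose", "NaCl", "HCl", "quinine")

def best_stimulus(responses):
    """Return the label of the stimulus that evokes the largest net response."""
    best_index = max(range(len(responses)), key=lambda i: responses[i])
    return STIMULI[best_index]

def breadth_of_tuning(responses):
    """Entropy-style breadth index scaled to run from 0 to 1.

    0 means the cell responds to a single stimulus; 1 means equal
    responses to all stimuli. Assumption: suppressed (negative) net
    responses are set to zero before proportions are computed.
    """
    rates = [max(r, 0.0) for r in responses]
    total = sum(rates)
    if total == 0:
        raise ValueError("no excitatory response to any stimulus")
    props = [r / total for r in rates]
    # Scaling constant so that the maximum value is 1 (about 1.661 for four stimuli).
    k = 1.0 / math.log10(len(rates))
    return k * sum(p * math.log10(1.0 / p) for p in props if p > 0)

# Hypothetical net responses (spikes/s) for a narrowly and a broadly tuned cell.
narrow_cell = [2.0, 30.0, 1.0, 0.0]
broad_cell = [12.0, 15.0, 14.0, 11.0]
print(best_stimulus(narrow_cell), round(breadth_of_tuning(narrow_cell), 2))
print(best_stimulus(broad_cell), round(breadth_of_tuning(broad_cell), 2))
```

Both hypothetical cells are NaCl-best, yet their breadth values differ sharply, which is the distinction drawn above between specifically and broadly tuned neurons within a given best-stimulus type.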
Modulation
Figure 13.7 Modulation of parabrachial nucleus taste responses by gastric distention.
Brain stem gustatory responses are far from static. Instead they are subject to modulation by homeostatic state and prior history, presumably reflecting the markedly different behavioral reactions to taste evident in deprived and sated states and as a result of past experience. In the rodent NST,
Note: Average time course of responses that were suppressed (A, n = 18) or enhanced (B, n = 7) by gastric distension. Modulation occurred in both directions but suppression was over twice as common as enhancement. From figure 4 in “Integration of Gastric Distension and Gustatory Responses in the Parabrachial Nucleus,” by J. P. Baird et al., 2001, American Journal of Physiology: Regulatory, Integrative, and Comparative Physiology, 281, p. R1587. Adapted with permission.
be part of the substrate for the reduced preference for palatable stimuli in satiated animals. Another powerful manipulation that affects taste preference is the induction of a sodium appetite in response to deprivation or hormonal treatment. In nondeprived animals, isotonic and lower concentrations of sodium are preferred and higher concentrations are rejected. However, sodium deprived animals avidly prefer even hypertonic salt, as long as it contains the sodium cation (C. Richter, 1956). Neurally, this manipulation usually (but not always, see Tamura & Norgren, 1997) reduces sodium chloride-evoked responses in the brain stem (Jacobs, Mark, & Scott, 1988; McCaughey & Scott, 2000; Shimura, Komori, & Yamamoto, 1997; Tamura & Norgren, 2003). Indeed, this phenomenon was first observed in the chorda tympani nerve and noted to be most dramatic for sodium-specialist fibers with high firing rates (Contreras, 1977), a specificity maintained centrally. The reduction in sodium chloride responses in sodium-specialist neurons is hypothesized to be part of the mechanism responsible for blunting the aversiveness of hypertonic saline and thus promoting consumption in the deprived state. Induction of a conditioned taste aversion (CTA) also influences brain stem taste cells. The influence of CTA is also evident in neurophysiological responses recorded from both NST (Chang & Scott, 1984) and PBN (Shimura, Tanaka, & Yamamoto, 1997; Shimura, Tokita, & Yamamoto, 2002). Increases and decreases in responsiveness have both been reported, the direction apparently dependent on the nature of the conditioned stimulus. The neural substrate for brain stem plasticity most likely includes local connections and forebrain influences. In NST, gut afferents in the vagus project to an almost entirely separate region caudal and medial to gustatory projections (e.g., Beckstead & Norgren, 1979; Hamilton & Norgren, 1984). 
However, anatomical data suggest a possible intranuclear projection from the caudal to the rostral NST (Karimnamazi, Travers, & Travers, 2002). The gustatory and visceral NST send projections to PBN that maintain significant topography, but also exhibit considerable overlap (Hermann & Rogers, 1985; Karimnamazi et al., 2002). Thus, the effects of the intravenous satiety factors, gastric distension, and intraduodenal lipids could be mediated in part through activation of vagal afferents or chemosensitive neurons in caudal NST/area postrema, with subsequent transmission to the rostral NST and/or to the gustatory PBN. Such a pathway would be consistent with the sufficiency of the brain stem for exhibiting behavioral changes after similar manipulations. On the other hand, alterations after sodium appetite or conditioned taste aversion are more likely to be mediated by forebrain projections or hormonal influences. Indeed, decerebration alters responses of brain stem taste neurons (Di Lorenzo,
1988; Mark, Scott, Chang, & Grill, 1988), and abolishes changes induced by CTA, as measured neurophysiologically (Tokita, Karadi, Shimura, & Yamamoto, 2004) or by Fos expression (Tokita, Shimura, Nakamura, Inoue, & Yamamoto, 2007). Several forebrain regions, including the insular cortex, lateral hypothalamus, central nucleus of the amygdala, and bed nucleus of the stria terminalis send robust, direct projections to the rostral NST (M. C. Whitehead, Bergula, & Holliday, 2000) and gustatory areas of the PBN (Saggu & Lundy, 2008). Furthermore, recent studies demonstrate that activation of forebrain regions powerfully modulates taste responses (e.g., Cho, Li, & Smith, 2003; Di Lorenzo & Monroe, 1992, 1995; C. S. Li, Cho, & Smith, 2005; R. F. Lundy & Norgren, 2004a; Smith, Ye, & Li, 2005). The neurochemical substrate for modulation is unknown, but in addition to the ubiquitous neurotransmitters, GABA and glutamate, the gustatory NST and PBN are supplied by fibers containing a variety of neuromodulatory substances including several neuropeptides and their receptors (see Bradley, 2008; R. F. Lundy & Norgren, 2004b, for recent reviews).
Forebrain Lemniscal Pathway
Similar to other sensory systems, except olfaction (see later discussion), ascending gustatory information terminates in a highly focused projection in a first-order (see Chapter 10) thalamic nucleus, the VPMpc, and VPMpc neurons, in turn, synapse in a circumscribed cortical region, the dysgranular insular (rodents) or insular/opercular (primates) cortex. Presumably, VPMpc serves an important processing and gating role, as is apparent for homologous thalamic regions for other sensory systems. However, only a handful of careful studies have described thalamic responses (e.g., T. R. Scott & Erickson, 1971) and a clear picture of thalamic function has yet to emerge. 
Likewise, knowledge of thalamic circuitry is sparse, although, as for other first-order thalamic relays, the thalamic reticular nucleus provides a notable input (Hayama, Hashimoto, & Ogawa, 1994). More is known about gustatory cortex and the pace of discovery has accelerated in the past few years, particularly with the increasing use of chronic recording using microwire bundles that allow observations over many trials and simultaneously recorded cells. Earlier studies used either acute, anesthetized preparations or chronic recording with limited trials recorded on a cell-by-cell basis. Perhaps not surprisingly, the pictures that emerge from these two perspectives are quite different. When studied on a cell-by-cell basis with limited trials, most primary cortical taste neurons have been grouped into a limited number of types according to their optimal
[Figure 13.8, panels A and B. Trace labels: water, sucrose, HCl, quinine, mixture, MSG, and 0.01, 0.03, 0.1, 0.3, and 0.5 M NaCl. Scale bars: 10 spikes/s, 10 s.]
Figure 13.8 Poststimulus time histograms (upper traces) for a gustatory-responsive neuron recorded from the insular cortex in an awake, freely moving rat. Note: This cell responded relatively specifically to sodium chloride, similar to the sodium chloride-specialist geniculate ganglion neuron shown in Figure 13.5. Increasing concentrations of sodium chloride produced systematic increases in the neural response. Responses in A were evoked when the animal drank from a sipper spout, licks are shown on the lower
taste quality (Figure 13.8). This is true, regardless of whether acute/anesthetized or chronic/awake rodents (e.g., Ogawa, Hasegawa, & Murayama, 1992; Yamamoto, Matsuo, Kiyomitsu, & Kitamura, 1989; Yamamoto, Yuyama, Kato, & Kawamura, 1985) or primates (e.g., Scott, Sienkiewicz, Rolls, & Yaxley, 1986a) are studied. Also like their brain stem counterparts, cortical neurons exhibit a range of tuning. There is no clear trend for tuning to sharpen or broaden (reviewed in Spector & Travers, 2005). Moreover, unlike the somatosensory (Chapter 14), visual (Chapter 11), or olfactory system (see later), there is little evidence that more complex stimulus configurations, for example, mixtures (Plata-Salaman, Smith-Swintosky, & Scott, 1996), become particularly potent. However, the intermingling of responsive and unresponsive neurons in much of primate cortex leaves a lingering suspicion that more appropriate stimuli or behavioral contexts may still be discovered. Also in contrast to other cortical regions, there have been no reports of a functional columnar organization. However, this has not been studied sufficiently because of the difficulty of making penetrations perpendicular to the gustatory cortical surface, inconveniently located very far laterally, and in primate, buried in the Sylvian fissure. Despite providing an overall picture similar to that in brain stem, these cell-by-cell studies reveal some cortical cells with novel characteristics. Ogawa and colleagues (Ogawa et al., 1992) reported that a subset of cortical neurons were “double-peaked.” These neurons did not show
trace. Responses in B were elicited by infusing stimuli through an intraoral cannula; jaw muscle activity is shown in the lower trace. Note that there was a minimal response to quinine under both conditions, demonstrating that the small response in A is not due to a lack of adequate stimulation resulting from less licking. From “Taste Responses of Cortical Neurons in Freely Ingesting Rats,” by T. Yamamoto et al., 1989, Journal of Neurophysiology, 61, p. 1246. Reprinted with permission.
the orderly response profiles typical throughout the gustatory system; that is, a systematic decline to either side of the optimal stimulus when the four standard stimuli are arranged from most to least preferred: sucrose, sodium chloride, hydrochloric acid, quinine. These novel profiles were taken as evidence for convergence. In addition, Yamamoto and his group (Yamamoto et al., 1989) reported that, although most cortical neurons had response profiles resembling those in the brain stem (Type I; see Figure 13.8), there was also a smaller group of cells (Type II) with opponent processing characteristics; that is, a response increment to preferred stimuli and a decrement to aversive stimuli, or vice versa. At lower levels of the neuraxis, not only are such on/off type responses rare, even frank response suppressions are uncommon. The more recent, multisite experiments reach starkly different conclusions. These studies find a strong trend toward very broad tuning; in fact, overall, taste neurons are so broad that defining an optimal stimulus on the basis of response magnitude is not attempted (e.g., Fontanini & Katz, 2006; Stapleton, Lavine, Wolpert, Nicolelis, & Simon, 2006). A graphic example is depicted in Figure 13.9. This different perspective partly derives from the increased power gained by analyzing multiple trials and focusing on the temporal properties of the response. Indeed, in the seminal paper in this field, a major point was that the population of cortical gustatory neurons expands dramatically when the fine grain of spike timing is taken into account
[Figure 13.9 panels: MSG (M), sucrose (M), NaCl (M), citric acid (M), and water, each at one or more concentrations. Axes: spikes/sec (0 to 80) versus time (sec).]
Figure 13.9 (Figure C.18 in color section) Raster displays and cumulative poststimulus time histograms for a gustatory-responsive neuron recorded from the insular cortex. Note: Like the cell in Figure 13.8, this neuron was recorded from an awake, licking rat. However, in this paradigm (water-deprived) rats licked at a dry sipper spout to receive a taste stimulus on every fifth lick. Aligned red symbols on the raster plots indicate the reinforced lick; that is, the point at which stimulation occurred; the other red symbols indicate the previous and following nonreinforced (dry) licks. If spike count summed over time was considered as the response measure, this neuron would
(Katz, Simon, & Nicolelis, 2001). Further potential for coding is apparent in temporal relationships between simultaneously recorded cell ensembles (Jones, Fontanini, Sadacca, Miller, & Katz, 2007). The precise nature of the putative spatiotemporal code is still emerging and appears to bear little resemblance to the obvious rhythmicity and synchrony exhibited by neurons in the olfactory system, as discussed later. One proposal is that the cortical spike train evolves in time in order to code different aspects of the fluid stimulus; that is, somatosensory (0 to 500 ms), taste quality (500 to 1,000 ms), and taste hedonics (1,000 to 2,500 ms; Katz et al., 2001). However, as shown in Figure 13.9, other studies find that taste quality is coded in the very early (200 ms) time period (Stapleton et al., 2006). Some of these apparent differences may relate to the varying behavioral paradigms under which animals are tested. Indeed, one study showed that gustatory cortical responses were labile, dependent on the rat’s state of “engagement” (Fontanini & Katz, 2006).
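The proposal that the cortical spike train codes different stimulus aspects in successive epochs can be illustrated with a small sketch that splits one trial's spikes into the three windows given above. The window boundaries come from the text; the function name and the example spike times are hypothetical, and the sketch makes no claim to reproduce the analysis of Katz et al. (2001).

```python
# Epoch boundaries (ms after stimulus delivery) as stated in the text.
EPOCHS_MS = {
    "somatosensory": (0, 500),
    "taste quality": (500, 1000),
    "hedonics": (1000, 2500),
}

def epoch_rates(spike_times_ms):
    """Return the mean firing rate (spikes/s) in each epoch for one trial."""
    rates = {}
    for name, (start, stop) in EPOCHS_MS.items():
        # Count spikes falling in the half-open window [start, stop).
        n_spikes = sum(1 for t in spike_times_ms if start <= t < stop)
        rates[name] = n_spikes / ((stop - start) / 1000.0)
    return rates

# Hypothetical spike times (ms) from a single trial.
trial = [100, 200, 600, 700, 800, 1500]
for epoch, rate in epoch_rates(trial).items():
    print(epoch, round(rate, 2))
```

Comparing such epoch-wise rates across stimuli, rather than a single count summed over the whole response, is one simple way to ask whether a cell distinguishes tastants only within a particular time window.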
be much more broadly tuned than the cell in Figure 13.8 and, in fact, a differential response to taste would be questionable, since the cell also responds robustly to a nongustatory stimulus, water. However, when multiple trials were used to model the Poisson distribution of the fine grain of the spike counts (15 ms bins), a differential response emerged. Importantly, this information was available in the interval occurring during a single lick. From figure 7 in “Rapid Taste Responses in the Gustatory Cortex during Licking,” by J. R. Stapleton et al., 2006, Journal of Neuroscience, 26, p. 4132. Reprinted with permission.
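The idea of modeling fine-grained spike counts as Poisson variables can be sketched as a simple template classifier: estimate a mean count per 15 ms bin for each stimulus from training trials, then assign a new trial to the stimulus whose template gives the highest Poisson log-likelihood. This is a generic illustration and is not claimed to be the procedure of Stapleton et al. (2006). The 150 ms analysis window (chosen here as roughly one lick interval), the rate floor, the function names, and the spike times are all assumptions.

```python
import math

BIN_MS = 15  # bin width stated in the caption

def bin_counts(spike_times_ms, window_ms=150):
    """Count spikes in consecutive 15 ms bins over the analysis window."""
    n_bins = window_ms // BIN_MS
    counts = [0] * n_bins
    for t in spike_times_ms:
        if 0 <= t < n_bins * BIN_MS:
            counts[int(t // BIN_MS)] += 1
    return counts

def fit_rates(trials, window_ms=150, floor=0.01):
    """Mean count per bin across training trials, floored to avoid log(0)."""
    binned = [bin_counts(trial, window_ms) for trial in trials]
    return [max(sum(col) / len(binned), floor) for col in zip(*binned)]

def poisson_loglik(counts, rates):
    """Log-likelihood of binned counts under independent Poisson bins."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1)
               for k, lam in zip(counts, rates))

def classify(spike_times_ms, templates, window_ms=150):
    """Return the stimulus whose rate template best explains the trial."""
    counts = bin_counts(spike_times_ms, window_ms)
    return max(templates, key=lambda s: poisson_loglik(counts, templates[s]))

# Hypothetical training trials: same spike count, different timing.
templates = {
    "sucrose": fit_rates([[5, 20, 35], [8, 22, 40]]),
    "water": fit_rates([[100, 120, 140], [105, 125, 145]]),
}
print(classify([6, 21, 38], templates))
print(classify([102, 122, 141], templates))
```

In this toy case both stimuli evoke three spikes per trial, so a summed spike count cannot separate them, whereas the binned Poisson model can, which mirrors the point made in the caption.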
Another novel approach that sheds a different perspective on cortical processing focuses on the spatial domain. Accolla and colleagues (Accolla et al., 2007) performed a rigorously controlled experiment that demonstrated a convincing chemotopy using intrinsic optical imaging in anesthetized rats. Responses to sucrose, sodium chloride, hydrochloric acid, and quinine were arranged in an overlapping rostral to caudal sequence (Figure 13.10). Because these investigators mainly stimulated the anterior tongue and palate, this chemotopy was not merely secondary to orotopy; in fact, if more regions of the mouth had been stimulated an even more striking cortical map of taste quality might have been observed (Yamamoto et al., 1980). This spatial organization is similar to that observed in earlier neurophysiological investigations in rats (Yamamoto et al., 1985) and monkeys (T. R. Scott, Yaxley, Sienkiewicz, & Rolls, 1986a). How this orderly chemotopy relates to the very broadly tuned neurons observed in the multisite studies has yet to be explained.
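[Figure 13.10 panel labels. Top panels: NaCl, sucrose, citric acid, quinine. Bottom panels: pairwise comparisons, including NaCl/Suc, NaCl/CA, NaCl/Quin, Suc/CA, and Suc/Quin.]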
Figure 13.10 (Figure C.19 in color section) Spatial mapping of taste quality in the rat insular cortex using intrinsic optical imaging. Note: Tastants were applied (mainly to the anterior mouth) in anesthetized animals. (Top Panels) (blue background): Population maps for sodium chloride, sucrose, citric acid, and quinine based on 27 animals over 8 to 18 total presentations with each stimulus. (Bottom Panels) (black background): Comparison of each stimulus pair. For each pair, both overlap
As noted, cortical responses are plastic, depending on behavioral state (Fontanini & Katz, 2006) and insular responses in the rat also reflect satiety (de Araujo et al., 2006), similar to the brain stem. In the monkey, however, as in the medulla (Yaxley, Rolls, Sienkiewicz, & Scott, 1985), responses in the primary gustatory cortex seem unchanged by feeding the animal to satiety (Rolls, Scott, Sienkiewicz, & Yaxley, 1988). Because satiety does modulate higher-order gustatory neurons in the orbitofrontal cortex (Rolls, Sienkiewicz, & Yaxley, 1989, see below), initial interpretations suggested that this reflected a perceptual/motivational split from primary to higher-order cortex. However, functional imaging studies in humans suggest that insular gustatory neurons are indeed modulated by satiety (Small, Zatorre, Dagher, Evans, & Jones-Gotman, 2001). Thus, further investigations may be necessary to reveal the full story on how homeostatic state affects gustatory responsiveness in the primate primary taste cortex. Learning also alters insular taste responses. A pioneering study (Yamamoto et al., 1989) observed the same cortical taste neurons in awake animals prior to and following induction of a CTA. Pairing a particular stimulus with subsequent intraperitoneal injection of a nausea-inducing agent produced varied changes. About 20% of the Type I cortical taste cells exhibited enhanced responses specific to the conditioned stimulus; that is, both response increments and decrements were magnified. In contrast, almost all of the Type II (hedonic) neurons were affected and in these cells
and segregation are apparent. The representations of sucrose and quinine are the most separate. For each panel, the white vertical line indicates the location of the middle cerebral artery; posterior is to the right; dorsal (medial) at the top of each figure. Scale bars = 500 µm. From figure 5 in “Differential Spatial Representation of Taste Modalities in the Rat Gustatory Cortex,” by R. Accolla et al., 2007, Journal of Neuroscience, 27, p. 1401. Adapted with permission.
the CTA caused response direction to change: excitatory responses became inhibitory and vice versa. A more recent multisite investigation also reported conditioning-induced changes in about one third of cortical taste neurons, but in contrast to the earlier work, there was nearly ubiquitous suppression of the preconditioning responses. Furthermore, these suppressions were confined to the later, “hedonic” period of firing (Grossman, Fontanini, Wieskopf, & Katz, 2008). Intriguingly, the temporal aspects of this effect resembled one reported in a much earlier study of CTA effects on brain stem neurons, though the direction of the modulation was opposite. In the NST, a CTA produced an increase in the response to the conditioned stimulus, but one that occurred only with a “long” (1 second) latency (Chang & Scott, 1984). Finally, another recent investigation employed optical imaging to demonstrate that a CTA to saccharin changed the spatial topography of the response to one resembling that elicited by quinine. In other words, the shift entailed increases of activity in one region and decreases in another (Accolla & Carleton, 2008). Thus, there is little doubt that CTA produces profound cortical changes. However, further work is necessary not only to clarify the nature of those alterations, but also to discern their synaptic basis and contribution to behavioral changes.
Limbic
The limbic forebrain and hypothalamus are critical substrates for ingestion and other motivated behaviors. Eating
and drinking culminate in relatively fixed, species-specific patterns of oromotor behaviors. However, these patterns are subject to modifications induced by learning and physiological needs and more complex environmental controls. Even more so, appetitive behaviors necessary for consumption must be flexible to correspond with alterations in internal state, environment, and past experience. Gustatory signals influence the ventral forebrain through both direct and indirect pathways. As discussed, in the rodent there are direct projections from the gustatory PBN to the amygdala, bed nucleus of the stria terminalis, lateral hypothalamus, and ventral pallidum (Bernard et al., 1993; Norgren, 1974; Norgren & Leonard, 1973). These connections are supplemented by projections from the insular cortex; in primates, the cortex is their main source (reviewed in Lundy & Norgren, 2004b; Pritchard & Norgren, 2004; Whitehead & Finger, 2008). Gustatory signals also reach other limbic regions, most notably the nucleus accumbens, a structure strongly implicated in reward and motivation. There are a variety of potential indirect routes through which taste signals could influence nucleus accumbens. One is a projection from the insular cortex (Reynolds & Zahm, 2005). However, this projection does not appear to originate in the primary taste cortex, as it does not arise from the dysgranular or granular regions where taste responses are most prominent (Kosar et al., 1986a; Ogawa, Ito, Murayama, & Hasegawa, 1990). Instead, the insular-accumbens projection arises from the juxtaposed agranular insular cortex (Reynolds & Zahm, 2005), which in turn receives inputs from the dysgranular region (Shi & Cassell, 1998). 
On the other hand, functional work suggests that taste information may primarily influence accumbens neurons via ventral forebrain projections, since sucrose-elicited dopamine release in accumbens was dampened by lesion of the parabrachial nucleus but not changed by thalamic lesions (Hajnal & Norgren, 2005). There are many descriptions of presumptive taste responses in limbic structures during consumption of food or fluid rewards in operant tasks. These same cells are often modulated by other unconditioned sensory stimuli, including olfaction, and by the conditioned stimulus or the operant response itself. Studies that focus on the details of gustatory processing are fewer. Compared to the thalamocortical pathway, gustatory-modulated neurons in limbic regions exhibit more convergence from nongustatory sources and exhibit more decrements in response to gustatory stimuli (reviewed in Spector & Travers, 2005). Lateral hypothalamic (Yan & Scott, 1996) and amygdalar (Nishijo, Uwano, Tamura, & Ono, 1998; T. R. Scott et al., 1993) neurons respond differentially to tastants but the critical variable appears to be hedonics instead of quality. This hypothesis is supported by the nearly ubiquitous influence
of satiety on taste responses in these regions (e.g., Burton, Rolls, & Mora, 1976; Yan & Scott, 1996). Contemporary studies have also turned their attention to the nucleus accumbens (Roitman, Wheeler, & Carelli, 2005; Taha & Fields, 2005) and ventral pallidum (Tindell, Smith, Pecina, Berridge, & Aldridge, 2006). Both hedonically positive (sucrose) and negative (quinine) stimuli delivered through intraoral cannulas drive accumbens neurons and responses are apparent on the very first presentation, demonstrating that gustatory responsiveness is innate (Roitman et al., 2005). Novel responses subsequently develop to the conditioned visual and auditory stimuli signaling tastant delivery further demonstrating the plasticity of these neurons. Similar to the hypothalamus and amygdala, palatability appears to be a critical stimulus dimension. In accumbens, sucrose and quinine not only drive different sets of cells, but the two sets of neurons respond differentially, that is, with response decrements and increments, respectively. A recent investigation (Figure 13.11) made the intriguing observation that these palatable flavor-elicited responses could be altered based on whether they served as a signal for the availability to self-deliver cocaine or saline (Wheeler et al., 2008). In this experiment, sweet saccharin was mixed with different flavors of Kool-Aid. After training, the flavor which served as the discriminative stimulus for saline was unaltered, but somewhat counterintuitively, the flavor that predicted cocaine now elicited response decrements, a switch concurrent with a change in the oromotor response to rejection (gaping) (Figure 13.11). These alterations in behavioral and neural responsiveness were hypothesized to be caused by a dysphoric state induced by the anticipation of drug withdrawal. 
This experiment not only illustrates the plasticity of accumbens responses but also emphasizes convergence between modalities since the different Kool-Aid flavors are largely distinguishable on the basis of olfaction. Additional insight regarding accumbens responses was derived from studying responses that occurred when water-deprived rats licked water and varying concentrations of sucrose (Taha & Fields, 2005). Similar to what Roitman and colleagues observed, most sucrose responses were inhibitory. However, these inhibitory responses appeared more directly related to licking than to sensory parameters. Thus, varying sucrose concentration did not modulate the magnitude of the response decrements, and licking water or even a dry spout also produced suppressions. These neural suppressions were interpreted as “gating,” that is, allowing an ingestive response to be made (Taha & Fields, 2005), a conclusion consistent with the feeding-stimulatory effects of inhibiting accumbens neurons pharmacologically (e.g., Pecina & Berridge, 2000). However, in agreement with Roitman’s report, Taha and Fields (2005) also observed a small population of sucrose-driven excitatory responses, and these cells exhibited positive concentration-response functions. Furthermore, a given sucrose concentration elicited larger responses when paired with water than with a higher sugar concentration. Such a “contrast” paradigm is well known to change the reward value of a gustatory stimulus (Flaherty, Turovsky, & Krauss, 1994). Thus, it seems reasonable to hypothesize that the sucrose-driven accumbens excitatory cells are more directly related to stimulus palatability.

Figure 13.11 Effects of discriminative stimulus training for cocaine availability on flavor-induced responses of accumbens neurons. [Panel labels: (A) Naive Orange and (B) Naive Grape, each with inhibitory n = 17 (77%) and excitatory n = 5 (23%); (C) and (D) CS, with inhibitory n = 34 (74%) and excitatory n = 12 (26%) in one panel and inhibitory n = 18 (39%) and excitatory n = 28 (61%) in the other; time scale 4 s.] Note: (Top Panels) A and B: Under naive conditions, when saccharin is mixed with orange or grape flavoring, intraoral delivery of either stimulus primarily elicits firing decrements in accumbens neurons as is typical for palatable stimuli. C and D: After training, the stimulus that has been paired with the opportunity to self-deliver saline (intravenously) still elicits response decrements, but the stimulus paired with cocaine elicits mostly response increments. (Lower Panels) Behavioral changes in stimulus-induced oromotor responses under the two conditions, when the stimulus signals saline (left) or cocaine (right). Upper insets show still video frames of licking (left) and gaping (right); lower insets depict associated electromyographic recordings from a jaw-opening muscle. From figures 1 and 3 in “Behavioral and Electrophysiological Indices of Negative Affect Predict Cocaine Self-Administration,” by R. A. Wheeler et al., 2008, Neuron, 57, pp. 774–785. Reproduced with permission.

The ventral pallidum, a target of accumbens projections (reviewed in Zahm, 2000), also contains gustatory-responsive neurons whose firing rates are modulated by
the hedonics. The effects of salt appetite are particularly compelling (Tindell et al., 2006). Before and after recovery from sodium deprivation, ventral pallidal neurons responded with higher firing rates to a preferred stimulus, 0.5 M sucrose, than to a stronger (1.5 M), but very aversive sodium chloride concentration. However, during the deprived state, firing rates to salt were equal to those elicited by sucrose. This neural change occurred in parallel with a switch in oromotor responses from gaping to licking. It is informative to compare these results to lower levels of the gustatory neuraxis, where sodium deprivation produced decreases in neural responses. It seems possible that the peripheral and brain stem decreases in responsiveness may
function to decrease the aversiveness of high sodium concentrations, whereas the increases in accumbens responses are responsible for driving salt-seeking behavior.

OLFACTION

Transduction

The Nobel Prize-winning work of Buck and Axel (1991) demonstrated that olfactory transduction involves a (uniquely) large superfamily of G-protein coupled receptors (GPCRs), and that each olfactory receptor neuron expresses only one receptor type (but see Mombaerts, 2004c). The number of such genes varies greatly across species, ranging from 82 in chickens (Niimura & Nei, 2005) to several hundred in humans (Niimura & Nei, 2003), to well over a thousand in rodents and dogs (Quignon et al., 2003; Young et al., 2003; reviewed in Ache & Young, 2008). Despite the fact that a large number of these genes, including well over half in humans, are nonfunctional pseudogenes (Glusman, Yanai, Rubin, & Lancet, 2001; Zozulya, Echeverri, & Nguyen, 2001; reviewed in Mombaerts, 2004b), olfactory genes in all vertebrates represent a considerable proportion of the genome, perhaps 4% in mice and 1.4% in humans. Despite the remarkable achievement of characterizing the olfactory receptor genome, relatively little is known about the specificity of the ligands that activate these receptors. The majority of mammalian olfactory receptors are “orphans,” lacking an identified ligand (reviewed in Malnic, 2007). Unlike taste receptor GPCRs, in which specific receptors are associated with distinct classes of perception (for example, sweet or bitter), olfactory receptors respond to a wide variety of odorants with different chemical structures. This suggests that olfactory receptors only recognize specific parts of a given odorant and that it is the combination of activated receptors that produces the afferent signal to the brain. 
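The combinatorial idea can be made concrete with a toy sketch in Python. It is an illustration only: the receptor names, the structural features, and the odorants are hypothetical, and the assumption that each receptor recognizes exactly one feature is a simplification not taken from the studies cited here.

```python
# Toy model of combinatorial odor coding. All names are hypothetical.
# Assumption: each receptor recognizes a single structural feature.
RECEPTOR_FEATURE = {
    "R1": "feature_a",
    "R2": "feature_b",
    "R3": "feature_c",
    "R4": "feature_d",
}

# Assumption: each odorant is described by the set of features it carries.
ODORANT_FEATURES = {
    "odorant_A": {"feature_a", "feature_b"},
    "odorant_B": {"feature_a", "feature_d"},
    "odorant_C": {"feature_c", "feature_d"},
}


def activated_receptors(features):
    """Return the set of receptors whose recognized feature is present."""
    return frozenset(r for r, f in RECEPTOR_FEATURE.items() if f in features)


# The "code" for an odorant is the full set of receptors it activates.
patterns = {name: activated_receptors(feats)
            for name, feats in ODORANT_FEATURES.items()}

for name, receptors in patterns.items():
    print(name, sorted(receptors))
```

In this sketch receptor R1 responds to two different odorants, so no single receptor identifies an odorant, yet each odorant still produces a distinct combination of activated receptors.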
Many vertebrates also possess a second olfactory epithelium located in the vomeronasal organ (VNO), and the discovery of GPCRs on sensory neurons in the main olfactory epithelium was soon followed by the discovery of two additional unique superfamilies of G-protein coupled receptors in the VNO (Dulac & Axel, 1995; Herrada & Dulac, 1997; Matsunami & Buck, 1997). In addition, there are several smaller, less well-characterized populations of olfactory sensory cells in the septal organ of Masera and the Grueneberg ganglion. Furthermore, not all olfactory receptors are GPCRs that rely exclusively on cAMP (Spehr, Spehr, et al., 2006). In addition, there is a class of vomeronasal olfactory receptors associated with members of the transient receptor potential superfamily (TRPC2; Kelliher, Spehr, Li, Zufall, & Leinders-Zufall, 2006; Liman, Corey, & Dulac, 1999).
The large number of genes and pseudogenes associated with olfaction engenders genetic variability as a ready explanation for the considerable variation in olfactory sensitivity within and between species. It is likely that polymorphisms in OR genes contribute to olfactory thresholds in mice, for example, that span four orders of magnitude (Joshi, Volkl, Shepherd, & Laska, 2006; Laska, Joshi, & Shepherd, 2006). In humans, variations in olfactory sensitivity range from specific anosmias (Whissell-Buechy & Amoore, 1973) to hyposmia to hyperosmia. The origins of these variations likely reflect genetic polymorphisms. A study in humans suggested that single nucleotide polymorphisms that are particularly susceptible to genetic variation could account for hyperosmia associated with the sweat odorant isovaleric acid (Menashe et al., 2007). Across a population of 377 individuals, there was a close association between those individuals with hypersensitivity to isovaleric acid and a particular gene for the olfactory receptor OR11H7P. Similarly, estimates of the proportion of individuals perceiving the urinary metabolites of asparagus as malodorous range from 10% to 24% (Lison, Blondheim, & Melmed, 1980; reviewed in Mitchell, 2001). Genetic polymorphisms may also explain why only certain individuals produce the offending odor, with estimates ranging from 40% to 79% (Mitchell, Waring, Land, & Thorpe, 1987).

Olfactory Bulb Processing

Afferent fibers from olfactory receptor neurons (ORNs) located in the main olfactory epithelium (MOE) enter the brain through the cribriform plate and synapse in the glomerular layer of the main olfactory bulb (MOB, Figures 13.12 and 13.13; reviewed in Shepherd, Chen, & Greer, 2004; Shipley, Ennis, & Puche, 2004). The glomerulus is a spheroid concentration of neuropil, demarcated by a shell of surrounding glia, that ranges from 100 μm to 200 μm in diameter in mammals. 
Within the neuropil is a dense matrix of synaptic connections between ORNs making contact with dendrites of olfactory bulb output neurons, the mitral and tufted cells (M/T), as well as the dendrites of juxtaglomerular neurons surrounding each glomerulus. The glomerular structure is conserved across phyla and can be recognized in the antennal lobe of many insects, as well as in the olfactory bulb of different classes of vertebrates, for example, amphibia and fishes (Hildebrand & Shepherd, 1997). The similarity of this structure in so many different species underscores its importance in sensory processing, and it is considered a fundamental, universal unit of olfactory processing. In vertebrates, there are multiple topographical relationships between the MOE and the glomeruli in the main olfactory bulb.

Figure 13.12 (Figure C.20 in color section) Olfactory receptor neurons expressing the same GPCR in the main olfactory epithelium (MOE) converge to one or a few glomeruli in the main olfactory bulb (MOB). Note: In contrast, GPCRs located in the vomeronasal organ (VNO) epithelium converge to multiple glomeruli in the accessory olfactory bulb (AOB). From figure 4 in “Genes and Ligands for Odorant, Vomeronasal, and Taste Receptors,” by P. Mombaerts, 2004a, Nature Reviews: Neuroscience, 5, pp. 263–278. Reprinted with permission.

Figure 13.13 Schematic representation of the olfactory bulb indicating major layers, cell types, and circuits. Note: Excitatory synapses shown with solid arrows, inhibitory synapses with small filled circles. Major layers: EPL, external plexiform layer; GCL, granule cell layer; GL, glomerular layer; MCL, mitral cell layer. Major cell types: ET, external tufted cell; GC, granule cell; MC, mitral cell; PG, periglomerular cell; SA, short-axon cell. From figure 3 in “Coding and Synaptic Processing of Sensory Information in the Glomerular Layer of the Olfactory Bulb,” by M. Wachowiak and M. T. Shipley, 2006, Seminars in Cell and Developmental Biology, 17, pp. 411–423. Adapted with permission.

The MOE itself can be divided into four zones, arranged dorsal to ventral based on the expression
of odorant receptor genes (Sullivan, Adamson, Ressler, Kozak, & Buck, 1996), and olfactory neurons from these zones map loosely onto the olfactory bulb along the dorsal to ventral axis. Within each MOE zone, however, the 6,000 to 10,000 ORNs, each expressing a unique receptor, are more or less randomly distributed. The loose topography between the zones of MOE and MOB and the random distribution of classes of ORNs within each zone, gives way to a visually stunning molecular topography when genetic markers for a particular ORN are visualized (see Mombaerts et al., 1996, figure 4). The landmark achievement of two groups working independently (Ressler, Sullivan, & Buck, 1994; Vassar et al., 1994) demonstrated that each of the 6,000 to 10,000 ORNs expressing a unique G-protein coupled receptor converged onto only a few glomeruli, numbering in some cases as few as two, one located in the medial half of the bulb, the other in the lateral half (Figure 13.12a). What is the significance of this molecular specificity with regard to odorant quality coding? Despite the receptor specificity of ORNs, most respond to a wide range of odorants, implying that the receptor is responsive to only a part of a chemical compound, one that might be shared by odorants with vastly different perceptual qualities. Thus, odorant quality specificity is not to be found in the molecular specificity of the receptor. Rather, because
cells with identical receptors converge in discrete glomeruli, it is now axiomatic that it is the “combinatorial” pattern of activated glomeruli that is unique for a given odorant. Dependent measures of physiological activity (unit recording, 2-DG, optical imaging, and c-fos expression) all confirm that a given odorant, particularly at low concentration, activates only a small number of glomeruli (reviewed in Johnson & Leon, 2007) (Figure 13.14). The high degree of spatial convergence of ORNs onto specific glomeruli, and the fact that a unique spatial array of glomeruli is activated by a given odorant, however, do not in themselves dictate that there is a spatial code for odor quality per se, that is, that location matters. Although the degree to which these maps have significance for quality coding is not universally accepted, they reveal important organizational properties of the olfactory bulb itself. Odorant activated maps are clearly not random. Other studies, however, have demonstrated strong dynamic, temporal components to olfactory responses that may contribute to quality discrimination and learning (reviewed in Kepecs, Uchida, & Mainen, 2006; Laurent et al., 2001; Schaefer & Margrie, 2007). Nevertheless, the anatomical singularity of glomeruli across phyla, together with their quantal-like innervation by a single receptor type, has led to wide acceptance of the glomerulus as a functional unit of olfactory processing. As a functional unit, two major levels of organization are relevant to the question of olfactory coding: first, are glomeruli spatially organized, and second, what processing takes place within the glomerulus?

Figure 13.14 (Figure C.21 in color section) Quantitative analysis of the uptake of 2-DG in the olfactory bulb in response to different odorants. [Panel labels: Decanal, Alpha-phellandrene, Benzaldehyde, L-carvone, 1-pentanol, Santalol, Valeric acid; rows labeled “Absolute” response (GL/SEZ scale) and “Relative” response (Z score scale).] Note: (Top Row) Absolute levels of activity measured against nonresponsive tissue. (Bottom Row) When relative values of activity are calculated by a Z transformation, different activated regions are more pronounced as indicated by yellow. From figure 8 in “Chemotopic Odorant Coding in a Mammalian Olfactory System,” by B. A. Johnson and M. Leon, 2007, Journal of Comparative Neurology, 503, p. 56. Reprinted with permission.
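The Z transformation mentioned in the caption of Figure 13.14 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the uptake values are invented, and the use of the population standard deviation is a choice made here, not necessarily the procedure followed by Johnson and Leon.

```python
from statistics import mean, pstdev

# Hypothetical 2-DG uptake values at six glomerular-layer locations
# (arbitrary units; invented for illustration).
uptake = [1.0, 1.2, 0.9, 3.1, 1.1, 1.0]


def z_transform(values):
    """Express each value as its distance from the mean in SD units."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD; a sample SD is an equally valid choice
    return [(v - mu) / sigma for v in values]


z_scores = z_transform(uptake)

# Index of the location that stands out most from the rest of the map.
peak_location = max(range(len(z_scores)), key=z_scores.__getitem__)
print([round(z, 2) for z in z_scores], peak_location)
```

Because every value is expressed relative to the mean and spread of the same map, a focal region of high uptake stands out regardless of the overall level of activity, which is why relative maps make activated regions more pronounced.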
Spatial Organization of Glomeruli: Maps

Although there is no single physicochemical dimension to odorant stimuli, odorants with similar structural characteristics activate similar “clusters” of glomeruli that may be organized into “modules” or “domains” (reviewed in Johnson & Leon, 2007). These modules, and even in some cases individual glomeruli, appear relatively invariant across individuals, perhaps indicative of a spatial map for quality representation. Johnson and Leon have identified approximately nine such modules including those responsive to carboxylic acids, primary alcohols, aromatic hydrocarbons or compounds with high water solubility. Within these modules, there is further spatial organization. Systematically increasing the carbon number of a compound, for example, progressively alters the location of the activated glomeruli from dorsal to ventral within the module. Increasing odorant concentration recruits additional glomeruli, frequently outside the confines of a module (Stewart, Kauer, & Shepherd, 1979). Although in some instances this increase in intensity may be associated with a change in odorant quality, this is not always the case. Thus, the spatial code for odorant quality is not necessarily restricted to a single domain within the bulb, and activation of glomeruli across the entire olfactory bulb may reflect a (second) more “global” level of spatial mapping (Johnson & Leon, 2007). However, even across concentrations there is still a degree of spatial specificity, and it is possible to look across the entire glomerular layer of
the bulb and observe that similarly structured compounds activate similar patterns of glomeruli. The observation that odorants with similar chemical structures (or qualities) activate neighboring glomeruli draws attention to the possibility of functional interactions. Functional interactions between adjacent glomeruli, such as lateral inhibition that might serve to sharpen the glomerular map to a given odorant, would certainly imply that location is important in the representation of odorant quality.

Glomerular Processing

Incoming afferents from ORNs form monosynaptic glutamatergic connections with the apical dendrites of the ≈15 mitral and tufted cells that populate a given glomerulus (Shipley et al., 2004). In addition, incoming afferents form axodendritic synapses on juxtaglomerular neurons within the glomerular neuropil (Figure 13.13). These classical axodendritic synapses, however, are only a fraction of the synaptic contacts within the glomerulus. Fundamental to glomerular processing is the presence of an intricate assemblage of more unusual connections, excitatory and inhibitory dendro-dendritic synapses between and among juxtaglomerular and M/T neurons that shape olfactory bulb output. Thus, many of the putative circuits involving intra-glomerular processing require the backpropagation of somatic action potentials for dendritic release of excitatory and inhibitory neurotransmitters (Chen & Shepherd, 2005; G. Scott et al., 2003; Shepherd et al., 2004). Three classes of juxtaglomerular neurons contribute to glomerular processing (reviewed in Hayar, Karnup, Ennis, & Shipley, 2004; Hayar, Karnup, Shipley, & Ennis, 2004; Figure 13.13). Periglomerular neurons (PG) constitute the largest class and have a primarily local (intraglomerular) GABAergic inhibitory function; external tufted neurons (ET) have a local intraglomerular excitatory function; and short axon neurons (SA) provide excitatory interglomerular connections (Aungst et al., 2003). 
There are extensive dendrodendritic connections among these juxtaglomerular neurons and together with afferent ORN terminals there is a bewildering array of potential synaptic glomerular interactions from which numerous putative circuits can be extracted (Figure 13.13). These intraglomerular circuits appear designed to both amplify the sensory signal and improve the signal to noise ratio (Chen & Shepherd, 2005; Wachowiak & Shipley, 2006). One form of signal amplification comes from the massive convergence of the 6,000 to 10,000 ORNs, all with the same receptor profile, that each release glutamate onto M/T neuron dendrites within a given glomerulus. This convergence combined with the presence of gap junctions
between M/T neurons (Kosaka & Kosaka, 2005) provides a highly focused input. But this intense input is not without regulation, and glomerular processing is subject to presynaptic, feedforward, and lateral inhibition. Following depolarization from afferent stimulation, M/T cells display a delayed hyperpolarization, indicative of inhibitory processes (Wachowiak & Shipley, 2006). One source of feedforward inhibition is direct ORN activation of PG neurons that form inhibitory dendro-dendritic synapses on M/T neurons. However, most ORN terminals are onto ET (excitatory) interneurons, such that an indirect source of feedforward inhibition is via ORN projections to external tufted (ET) cells that excite PG (inhibitory) connections to M/T dendrites (Hayar, Karnup, Ennis, et al., 2004). Other types of inhibition that serve to limit overall excitability within a glomerulus include presynaptic inhibition, perhaps originating from PG neurons, that activates GABAB and dopamine receptors on ORN terminals (Aroniadou-Anderjaska, Zhou, Priest, Ennis, & Shipley, 2000; Ennis et al., 2001; Murphy, Darcy, & Isaacson, 2005). Lateral inhibition between glomeruli can potentially occur in both the glomerular and the external plexiform layers (EPL; Figure 13.13). Adjacent glomeruli are synaptically linked in the EPL via reciprocal synapses between the apical dendrites of granule cells and the lateral dendrites of M/T cells (Shipley et al., 2004). Recording from M/T neurons in vivo, Yokoi and colleagues demonstrated that iontophoresis of either inhibitory (GABAA) or excitatory (AMPA) receptor antagonists broadened the response profile of neurons to odorants, both by increasing excitatory responses to odorants and by suppressing inhibitory responses (Yokoi, Mori, & Nakanishi, 1995). 
This effect was interpreted as interfering with dendrodendritic synapses between M/T neurons and granule cells in the EPL, that is AMPA receptor antagonists blocked the excitatory M/T—granule cell synapse, thus preventing granule cell inhibition of M/T dendrites, whereas the GABAA antagonist directly blocked the inhibitory synapse between granule cells and M/T neurons. Lateral inhibition between glomeruli can also occur within the glomerular layer. In vitro slice recordings in which the contribution of the external plexiform layer was eliminated with microcuts, showed that inhibitory postsynaptic potentials could be evoked from a neighboring glomerulus in response to excitation of a nearby glomerulus. Anatomically, it was demonstrated that SA neurons were the major source of interglomerular interactions but because these neurons are excitatory, it was postulated that they contact GABAergic PG neurons to effect the inhibition (Aungst et al., 2003). Because ORN axons do not directly synapse on short axon neurons, it is likely that this lateral inhibition is mediated via ORN projections onto
ET neurons that, in turn, drive SA neurons (Wachowiak & Shipley, 2006). A third source of lateral inhibition between glomeruli could involve presynaptic inhibition. Calcium imaging of an intact olfactory bulb in response to natural odorants showed that GABAB antagonists differentially affected the excitability of highly activated glomeruli compared to more weakly activated surround glomeruli (Vucinic, Cohen, & Kosmidis, 2006). Specifically, when the GABAB antagonist CGP46381 was applied to the exposed bulb, weakly activated glomeruli were more excited compared to glomeruli that were highly activated under control conditions. The implication is that highly activated glomeruli excite inhibitory PG neurons via the ET-SA-PG pathway, and that “spillover” of PG GABA activates GABAB receptors on ORN terminals. One of the more intriguing characteristics of ET neurons is that they have intrinsic bursting properties (Hayar et al., 2004), a characteristic that could account for the long lasting depolarizations in mitral cells (Carlson, Shipley, & Keller, 2000; Wachowiak & Shipley, 2006). At a more speculative level, it has been proposed that the resonant, bursting property of ET cells could facilitate a properly timed phasic sensory input, that is, an animal sniffing. However, the bursting activity of ET cells is also likely to lead to phasic inhibition of M/T neurons via ET projections onto inhibitory PG neurons. Thus, ET phasic activity could modulate M/T excitability to produce a “window of opportunity” that further enhances a given sensory input (Wachowiak & Shipley, 2006). Thus, intense activation of a glomerulus with an odorant can suppress adjacent glomerular activity that is more weakly stimulated. 
This suppression of adjacent glomeruli, together with the several mechanisms that serve to amplify activity within a glomerulus (massive ORN convergence, synchronization of intraglomerular mitral cells via gap junctions, or ET bursting), serves to heighten activity of a given glomerulus while “sharpening” glomerular output relative to more weakly activated surrounding glomeruli. This “winner take all” scenario (Chen & Shepherd, 2005; Wachowiak & Shipley, 2006) based on lateral inhibition is a powerful argument in favor of a spatial code, and one might therefore assume that lesions restricted to one part of the bulb or another would leave a hole in the olfactory map, that is, produce specific anosmias. However, this does not appear to be the case, and specific deficits following olfactory bulb lesions are hard to demonstrate. Rather, a “mass action” effect is more descriptive of lesions, in which progressively larger lesions increasingly blunt olfactory discrimination in general (Johnson & Leon, 2007). In a recent review, however, Johnson and Leon (2007) argue that lesion studies have not yet adequately tested spatial
specificity. For example, they note the lack of an effect of olfactory bulb lesions on the discrimination of different enantiomers of carvone. These lesions were restricted to the dorsal bulb and no problems with discrimination were detected (Slotnick & Bisulco, 2003). Based on 2-DG mapping studies, however, different enantiomers of carvone produced a more differentiated pattern of glomerular activation in the posterior ventromedial glomerular layer compared to the highly similar glomerular patterns in the dorsal aspect of the bulb, that is, the locations where lesions were made. Thus, Johnson and Leon predict that lesions made in the posterior ventromedial layer might well specifically impair carvone discrimination. In short, despite the overwhelming anatomical and physiological evidence for lateral inhibition in the olfactory bulb, evidence that this inhibition actually increases or sharpens behavioral discriminability awaits further investigation.

Temporal Processing

Much as recognition of a high degree of spatial organization in the olfactory system (e.g., glomerular structure) dates back to the earliest anatomical studies of Cajal, observations of temporal patterning date back to early recordings from the olfactory bulb where oscillating field potentials were observed at the respiratory rhythm (Adrian, 1950). A relationship between the respiratory rhythm and olfactory neuron responsiveness is clearly evident in M/T cells (Macrides & Chorover, 1972; Walsh, 1956), but can also be observed across the olfactory epithelium (Chaput, 2000). Within the olfactory bulb, M/T neurons show clear respiratory phase locking (Macrides & Chorover, 1972; Walsh, 1956) and a variety of patterns of excitation and inhibition have been described (reviewed in Buonviso, Amat, & Litaudon, 2006). Some cells show simple action potential burst patterns of excitation or inhibition, while others show more complex mixtures of these responses. 
Some of this activity clearly relates to the peripheral input. In vivo whole cell recording from M/T neurons revealed bursts of action potentials riding on rhythmic excitatory postsynaptic potentials (EPSPs) locked to the respiratory rhythm in response to odorants (Cang & Isaacson, 2003). In those cells that showed odorant related suppression of action potential bursts, subthreshold EPSPs were still present, suggesting that glomerular inhibitory circuits were suppressing afferent excitation (Figure 13.15). In animals that do not sniff, other forms of phasic modulation are present, for example, fish “cough” and insects “flick” (Dethier, 1987). Thus, much as a glomerulus represents a spatial unit of olfactory processing, so too might a sniff represent a dynamic unit (Kepecs et al., 2006).

Figure 13.15 Whole-cell recordings from mitral or tufted cells in the olfactory bulb show bursts of activity in phase with respiration in response to odorant stimulation. [Panel labels: (A) and (B), stimulus AA 10%; scale bars 10 mV, 1 s.] Note: Respiratory rhythm is shown below each cell response. (A) Response of one cell to amyl acetate produces bursts of activity in phase with respiration. (B) A cell that was inhibited by amyl acetate stimulation shows subthreshold, respiratory-timed responses, suggesting that inhibition is of central, presumably glomerular, origin. From figure 2 in “In Vivo Whole-Cell Recording of Odor-Evoked Synaptic Transmission in the Rat Olfactory Bulb,” by J. Cang and J. S. Isaacson, 2003, Journal of Neuroscience, 23, p. 4110. Reprinted with permission.

Although no one questions the fundamental phasic relationship between olfactory neuron activity
and respiration, the origin and functional significance of this relationship is still open to question. Superimposed on the question of the origin and significance of the relationship between respiration and olfactory responses, are experiments demonstrating that odor discrimination takes place within a sniff cycle (Abraham et al., 2004; Rinberg, Koulakov, & Gelperin, 2006; Uchida & Mainen, 2003), thus putting some constraints on a temporal code. With regard to mechanisms, a recent study (Grosmaitre, Santarelli, Tan, Luo, & Ma, 2007) suggests that approximately 50% of olfactory sensory neurons have a mechanical sensitivity that could underlie respiratory entrainment. Patch recordings revealed a cAMP-dependent sensitivity that tracked the strength of odorless puffs delivered to the cell. A knock-out mouse lacking a cyclic-nucleotide-gated channel lacked this sensitivity and failed to show respiratory entrainment in the main olfactory bulb. Although peripheral input clearly plays an important role in olfactory bulb entrainment with respiration, it is likely that centrifugal influences also play a role because dissociation of the bulb from the rest of the brain reduces olfactory bulb respiratory synchronization (M. Chaput, 1983; Potter & Chorover, 1976). Further contributions to respiratory synchronization come from intrinsic circuitry within the bulb as well as intrinsic membrane properties of glomerular neurons since rhythmic activity can be recorded in slice preparations, devoid of both phasic peripheral input and centrifugal influences. Electrical stimulation of olfactory afferents elicited 2 Hz oscillations in M/T cells (Schoppa & Westbrook, 2001) and ET cells burst in the theta range due to intrinsic membrane properties (Hayar et al., 2004). Synchronization via excitatory intraglomerular processing is suggested by the tight cross-correlations obtained
between M/T neurons belonging to the same glomerular unit (Buonviso, Chaput, & Berthommier, 1992). The functional significance of respiratory time-locked activity in the theta range (4 to 9 cycles/sec) has been the subject of much speculation (A. T. Schaefer & Margrie, 2007; J. W. Scott, 2006). Does it simply reflect the active pursuit of odorants that occurs when an animal modulates its respiration rhythm to engage in sniffing, or does synchronized activity reflect an underlying dynamic principle of temporal coding? One of the constraints on a theory of temporal coding is evidence that animals can make odorant discriminations within one sniff cycle (Abraham et al., 2004; Rinberg et al., 2006; Uchida & Mainen, 2003) reviewed in (Kepecs et al., 2006; Uchida, Kepecs, & Mainen, 2006). Estimates from these studies suggest that odor discrimination can take place in less than 100 msec, although difficult tasks might take somewhat longer. But even within a single sniff cycle, temporal properties of the spike train can potentially provide information. Response latency is potentially one such property. Increasing odorant concentration shortens the latency of the first spike in a mitral cell burst and increases the number of action potentials within the burst; however, the instantaneous firing rate remains relatively constant at 40 Hz (Cang & Isaacson, 2003; Margrie & Schaefer, 2003). Because latency is a reflection of stimulus strength, it might be argued that a given odorant will produce a variety of latencies in different glomeruli depending on the affinity of the ORN to that odorant. The shortest latencies might occur with an oscillating postsynaptic membrane, such that a stronger input reaches threshold earlier (Hopfield, 1995). Thus, a spatio-temporal pattern is produced when strongly activated glomeruli also have shorter latencies (A. T. Schaefer & Margrie, 2007). 
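The threshold-crossing argument above can be made concrete with a small numerical sketch. The Python fragment below is an illustration only, not a model taken from Hopfield (1995) or from the recordings cited here. It assumes a sinusoidal subthreshold oscillation at 6 Hz (inside the 4 to 9 cycles/sec theta range), an arbitrary threshold of 1.0, and a constant input called `drive`; the function name `first_spike_latency` and every parameter value are hypothetical.

```python
import math


def first_spike_latency(drive, threshold=1.0, amplitude=0.5, freq_hz=6.0, dt=1e-4):
    """Return the time (s) within one oscillation cycle at which
    drive + oscillation first reaches threshold, or None if it never does.

    All values are illustrative assumptions, not measured quantities.
    """
    period = 1.0 / freq_hz
    steps = int(round(period / dt))
    for i in range(steps + 1):
        t = i * dt
        # The oscillation starts at its trough (t = 0) and peaks at mid-cycle.
        v = drive - amplitude * math.cos(2.0 * math.pi * freq_hz * t)
        if v >= threshold:
            return t
    return None


if __name__ == "__main__":
    # Weak, intermediate, and strong hypothetical inputs.
    for drive in (0.4, 0.6, 0.8, 1.0):
        print(drive, first_spike_latency(drive))
```

Under these assumed values, a drive of 1.0 crosses threshold earlier in the cycle than a drive of 0.6, and a drive of 0.4 never reaches threshold within the cycle. Stronger inputs therefore map onto shorter latencies, which is the relationship the latency-code argument relies on.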
A potential advantage of a latency code is that it can be “read” earlier than the time required for assessing the cumulative action potentials emitted over a sniff cycle. Stronger stimuli that result in shorter latencies can thus ensure rapid discrimination. More subtle discrimination might require the inclusion of more weakly activated glomeruli recruited later in the cycle.
Although low-frequency phasic activity in the olfactory bulb can contribute to information processing to the extent that threshold influences latency, other studies point to higher-frequency oscillations as a source or reflection of temporal coding. Superimposed on theta-range activity are higher-frequency single-cell responses in the beta range (20 to 35 Hz) and in an even higher gamma range (35 to 100 Hz). These higher frequencies can also be observed in local field potential recordings, where gamma activity can be seen riding on theta activity (Barrie, Freeman, & Lenhart, 1996). Gamma activity might originate from mitral cells that fire at a preferred 40 Hz frequency (A. T. Schaefer & Margrie, 2007). Although temporal coding has been more extensively investigated in the olfactory system of invertebrates (Laurent et al., 2001), several studies provide indirect evidence for a functional role for beta and gamma activity in vertebrate olfactory learning and memory. Local field potentials recorded in the olfactory bulb of rats learning an olfactory discrimination showed enhanced beta but relatively little gamma activity when rats had mastered the task (Martin, Gervais, Hugues, Messaoudi, & Ravel, 2004). A somewhat opposite conclusion was reached in an experiment that compared easy (coarse) versus difficult (fine) olfactory discriminations (Beshel, Kopell, & Kay, 2007). Here, the more difficult discrimination was associated with enhanced olfactory bulb gamma activity, but changes in beta activity were not observed (Figure 13.16). These studies are correlational, and the investigators concede that the oscillatory activity may not be necessary to the task; for example, gamma activity was not always observed in the early trials of a block. Several studies have tried to observe changes in discrimination ability when oscillatory activity is disrupted. For example, when oscillatory activity in the antennal lobe of the honeybee was blocked with perfusion of GABAA antagonists, discrimination was impaired (Stopfer, Bhagavan, Smith, & Laurent, 1997). In a GABAA receptor beta 3 subunit knockout mouse, gamma oscillatory activity in the olfactory bulb was enhanced, but the change in performance on correlated behavioral discrimination tasks appeared somewhat ambiguous (Nusser, Kay, Laurent, Homanics, & Mody, 2001).
[Figure 13.16, panels (A) and (B): traces labeled OB raw, OB, PC raw, PC, and Odor; horizontal axis: trial time, 0 to 6 s.]
Figure 13.16 Olfactory bulb oscillatory activity in the gamma range increases in power during a difficult odor discrimination task (B) compared to a simple discrimination task (A). Note: Arrow indicates increase in gamma activity just after the odorant stimulus-evoked potential (top trace in A and B). From figure 2 in “Olfactory Bulb Gamma Oscillations Are Enhanced with Task Demands,” by J. Beshel et al., 2007, Journal of Neuroscience, 27, p. 8360. Reprinted with permission.
Vomeronasal System and Pheromones
The vomeronasal epithelium lies at the dead end of a tube in the nasal septum. The difficulty of stimuli reaching these receptors is overcome by a sympathetically mediated rhythmic vasoconstriction that pumps chemicals into the lumen (Meredith, 1994); hence the conventional view that vomeronasal stimulation requires “contact,” as occurs during conspecific exploration for potential mates. However, there is evidence that vomeronasal receptors can respond to “traditional,” volatile olfactory stimuli and that the main olfactory system is also involved in pheromone detection (Brennan & Zufall, 2006; Mombaerts, 2004a; Zufall & Leinders-Zufall, 2007). The more recent formulation is that there are multiple olfactory systems with overlapping functions that operate in parallel (Breer, Fleischer, & Strotmann, 2006; Spehr, Spehr, et al., 2006). Two superfamilies of GPCRs are segregated within the vomeronasal epithelium (Figure 13.12B). One class (V1rs) is located in the superficial (apical) zone of the epithelium, and a second class (V2rs) is located in the deep (basal) zone. The receptor systems differ in structure and receptivity to ligands. More recent work suggests that the V1r family consists of 191 functioning genes (out of 308 total; X. Zhang, Zhang, & Firestein, 2007) that are primarily responsive to small organic compounds (Leinders-Zufall et al., 2000). Of the 280 V2r genes, 120 appear functional (Young & Trask, 2007). Both families are GPCRs, but V2rs are unique among olfactory receptors in that they possess a very long extracellular N-terminus. Axons from VNO neurons expressing these receptors project to the accessory olfactory bulb, located caudal to the main olfactory bulb. Unlike neurons in the main olfactory epithelium, VNO neurons expressing the same receptor project to multiple glomeruli in the accessory olfactory bulb (Belluscio, Koentges, Axel, & Dulac, 1999; Rodriguez, Feinstein, & Mombaerts, 1999; Figure 13.12B).
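As a quick arithmetic check on the gene counts quoted above, the functional fraction of each receptor family can be computed directly. The snippet is a sketch in Python; the helper name `functional_fraction` is hypothetical, and only the four counts come from the text.

```python
def functional_fraction(functional, total):
    """Return the share of genes in a family that appear functional."""
    if total <= 0:
        raise ValueError("total must be positive")
    return functional / total


# Counts quoted in the text (X. Zhang et al., 2007; Young & Trask, 2007).
v1r_fraction = functional_fraction(191, 308)
v2r_fraction = functional_fraction(120, 280)

print(f"V1r: {v1r_fraction:.1%} functional")
print(f"V2r: {v2r_fraction:.1%} functional")
```

By these counts, roughly 62% of the V1 receptor genes and 43% of the V2 receptor genes appear functional.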
One of the more intriguing aspects of V2rs is that they respond to MHC class 1 peptides (Leinders-Zufall et al., 2004). These nonvolatile compounds that are associated with immune function are found in body fluids such as urine and milk and provide a unique, genetic-based
[Figure 13.17, panels B (H-2b) and C (H-2bm8): angle (degrees, 0 to 360) plotted against rostro-caudal distance (µm, 0 to 2,000).]
Figure 13.17 (Figure C.22 in color section) Quantitative analysis of c-fos expression in the olfactory bulb of a female mouse in response to urine from male mice with a modified MHC gene. Note: The wild-type mouse with the H-2b gene and its spontaneous mutant H-2bm8 produce class I glycoproteins that impart variation in body scent. From figure 3 in “Olfactory Fingerprints for Major Histocompatibility Complex-Determined Body Odors II: Relationship among Odor Maps, Genetics, Odor Composition, and Behavior,” by M. L. Schaefer, K. Yamazaki, K. Osada, D. Restrepo, and G. K. Beauchamp, 2002, Journal of Neuroscience, 22, p. 9516. Reprinted with permission.
identification of an individual. Among other functions, receptors sensitive to MHC peptides mediate the Bruce effect in mice, a condition in which pregnancy is terminated when a pregnant female is exposed to the urine of a nonparent male (Bruce, 1959). Thus, when the urine of the mating male is adulterated with foreign MHC peptides, the previously benign urine becomes effective in terminating pregnancy (Leinders-Zufall et al., 2004). MHC-sensitive receptors are not limited to the vomeronasal system and are evident in the main olfactory system as well (Spehr, Kelliher, et al., 2006; Figure 13.17). However, lesions of the VNO alone are effective in suppressing the Bruce effect (Kelliher et al., 2006). MHC peptides are likely candidates to mediate other olfactory-based social and reproductive functions as well. For example, mice are more likely to mate with conspecifics with dissimilar MHCs, and dams are more likely to retrieve pups with similar MHCs (reviewed in Brennan & Zufall, 2006). Pheromone detection in general, and responsiveness to MHC peptides in particular, is not restricted to the vomeronasal system, and numerous studies now confirm a role of the MOB in pheromone identification (reviewed in Shepherd, 2006; see also Brennan & Zufall, 2006; Spehr, Kelliher, et al., 2006). Mating preference, for example, may utilize MHC receptors in the MOB. Mutant mice lacking CNga2, a membrane channel associated with cAMP receptors in the main olfactory bulb but not found in the vomeronasal organ, showed deficits in MHC stimulus-evoked field potentials in the olfactory bulb as well as deficits in mating behavior (Mandiyan, Coats, & Shah, 2005; Spehr, Kelliher, et al., 2006). Anatomical tracing studies further suggest that the MOB has specific hypothalamic projections that initiate classical pheromone-induced behavior. Luteinizing hormone-releasing hormone (LHRH)
is essential for female reproduction and mating and is secreted by specific populations of hypothalamic neurons. Using the transsynaptic tracing properties of viruses, Yoon, Enquist, and Dulac (2005) engineered a pseudorabies virus that was expressed only by LHRH neurons. They determined that LHRH cells received second- or third-order projections from the main olfactory bulb but no projections from the accessory olfactory bulb. The detection and functional effects of pheromones would seem to epitomize labeled-line coding in that a specific receptor/ligand interaction leads to activation or release of a single, often stereotyped, behavior. Although both the VNO and the main olfactory systems contribute to pheromone detection, evidence is accumulating that they process olfactory signals differently. Unlike sensory neurons in the main olfactory epithelium, vomeronasal sensory neurons appear highly specific. Using confocal imaging and patch recording, Leinders-Zufall et al. (2000) demonstrated that each of six putative pheromones from male or female urine elicited responses from a very small subset of VNO sensory cells and at very low concentrations. Individual cells were responsive to only one stimulus and coded the intensity of the stimulus over a range of concentrations, but higher concentrations did not recruit new cells. Nor did these cells respond to control stimuli in which the stimulating peptide was structurally altered. The location of these cells in the dorsal epithelium suggested that they were V1 receptors. Similarly, in response to two different MHC peptides, calcium imaging of V2 receptors in the ventral epithelium showed narrow tuning properties and a high degree of specificity (Leinders-Zufall et al., 2004). Only about 1.6% of the cells were responsive to the MHC peptides, and of these, only 0.4% responded to both. The main olfactory epithelium responds to MHC peptides as well, but with less specificity (Spehr, Kelliher, et al., 2006). 
Here, sensory responses had a threshold about two orders of magnitude higher than their vomeronasal counterparts and also responded at higher concentrations to the control peptides. Chronic unit recording also suggests that pheromone detection is processed differently in the AOB compared to the MOB (Luo, Fee, & Katz, 2003). The response properties of single cells in the AOB of mature mice were tested with stimuli originating from a lightly anesthetized female or male mouse of the same or different strain placed in the testing chamber, or with volatile olfactory stimuli presented on cotton swabs. Unlike neurons in the MOB, AOB neurons did not respond to the volatile stimuli, and instead, required physical contact with the mouse “stimulus.” Responses in the AOB were highly specific with the majority of neurons responding to only a single male/female X strain/nonstrain pair. AOB responses had very long latencies compared to volatile stimulus-evoked
responses in the MOB and may reflect the trade-off of speed for a highly specific “labeled line” system. The complementary and overlapping sensitivity of the MOB and AOB to pheromones extends to more central connections. Although the immediate projections of the MOB and AOB show little overlap, secondary projections converge in the medial amygdala, which forms an important “nexus” for integration of these signals with projections to the hypothalamus and other structures mediating neuroendocrine and motivated behavior (Brennan & Zufall, 2006).
Olfactory Cortex
Output from the olfactory bulb is carried by projections from M/T axons that reach a number of rostral telencephalic structures collectively referred to as the olfactory cortex (Neville & Haberly, 2004; Price, 1973; Shipley et al., 2004). The list is long and includes the anterior olfactory cortex, tenia tecta, dorsal peduncular cortex, piriform cortex, olfactory tubercle, cortical amygdala, agranular insula, and entorhinal cortex. These structures have been parsed based on their anterior, medial, or lateral location (Shipley et al., 2004), whether they are of cortical or striatal origin (Wilson, Kadohisa, & Fletcher, 2006), or whether they have the architectonic structure characteristic of paleocortex (Neville & Haberly, 2004). Although the agranular insula and entorhinal cortex represent a transitional form of the cortex between the three-layered paleocortex and the six-layered neocortex (Neville & Haberly, 2004), unambiguous olfactory neocortical representation in the orbitofrontal cortex originates from secondary olfactory cortex projections and from (tertiary) dorsal medial thalamic projections that are the target of cortical amygdala and piriform efferents. Piriform cortex (PC) is the largest of the olfactory cortex structures and is the best characterized. 
Mitral and tufted cell axons leave the olfactory bulb and form the lateral olfactory tract (LOT), which serves as a boundary between several PC subdivisions. Piriform cortex coincident with the tract is anterior PC (aPC); piriform cortex posterior to the tract is posterior PC (pPC). The LOT further demarcates a dorsal anterior PC, dorsal to the LOT, from a ventral anterior PC, ventral to the tract (Neville & Haberly, 2004). There is a high degree of segregation between the pyramidal cell dendritic targets of olfactory bulb efferents and intracortical (association) projections. The former synapse on distal dendrites in layer Ia; the latter on more proximal sites in layer Ib. This pattern of innervation may form a fundamental substrate for olfactory learning and memory. Both types of synapses support NMDA-mediated long-term potentiation; that is, a short theta frequency burst of stimulation applied to either class of afferent fibers increases the magnitude of the postsynaptic response (Jung, Larson, &
Lynch, 1990; Kanter & Haberly, 1990), and both rodent (e.g., Staubli, Fraser, Faraday, & Lynch, 1987) and human lesion studies (e.g., Dade, Zatorre, & Jones-Gotman, 2002) support a role for the piriform cortex in olfactory learning. Nor has it gone unnoticed that the theta frequency for eliciting long-term potentiation (LTP) corresponds roughly to the range of olfactory sniffing. As in the olfactory bulb, PC neurons produce bursts of olfactory-induced activity in phase with respiration (Neville & Haberly, 2004; Wilson et al., 2006). Further controlling the induction of some forms of LTP is the removal of inhibition. Long-term potentiation involving simultaneous activation of both OB afferent and association fiber inputs requires the blockade of GABAA receptors (Kanter & Haberly, 1993). Local GABAergic neurons within PC may mediate this action, and several classes of GABAergic interneurons in PC provide a substrate for both feedforward and feedback inhibition of pyramidal cell responses (reviewed in Neville & Haberly, 2004; Suzuki & Bekkers, 2007). Local inhibitory interneurons could also play a role in hyperpolarizations observed between respiratory bursts (Wilson et al., 2006) as well as beta oscillatory activity (Neville & Haberly, 2003). The high degree of molecular specificity conferred by the convergence of olfactory receptor cells onto discrete olfactory bulb glomeruli does not appear to be maintained in the olfactory cortex. Functional mapping studies of PC using 2-DG and Fos expression indicate that individual odors evoke activity over large areas (Cattarelli, Astic, & Kauer, 1988; Illig & Haberly, 2003). Within the dorsal aPC, there is a rostral-to-caudal gradient such that higher concentrations of olfactory stimuli recruit successively more caudal regions (Sugai, Miyazawa, Fukuda, Yoshimura, & Onoda, 2005). 
Studies are beginning to show more explicitly how the olfactory cortex processes sensory input in the service of higher cognitive function such as olfactory learning and memory. Several studies indicate that M/T neurons from different OB glomeruli converge in the olfactory cortex. Lei, Mooney, and Katz (2006) directly compared single unit responses from M/T neurons and anterior olfactory nucleus neurons to the same battery of olfactory stimuli consisting of binary olfactory mixtures and the mixture components. Overall, cortical neurons appeared more broadly tuned than OB neurons. Although OB neurons responded to single mixtures about as often as their cortical counterparts, individual cortical neurons responded to a variety of mixtures compared to OB neurons and were more responsive to the individual components of a mixture. Thus, many more OB neurons were responsive to just one of the components of a mixture compared to cortical neurons, and many cortical neurons responded to four or even five of the individual
mixture components, an uncommon response profile for OB neurons. In addition, some olfactory cortical neurons responded to mixtures of stimuli whose components were previously shown to activate distinctly different OB glomeruli. For example, some olfactory cortical neurons responded to a mixture of 6, 20 hexamone and heptanal. These were stimuli that had been previously shown to activate different glomeruli, based on 2-DG studies by Leon and colleagues (discussed in Lei et al., 2006). Mixture suppression and facilitation were also common in the cortical neurons (Lei et al., 2006). Another study in PC using natural food odors reached a similar conclusion (Yoshida & Mori, 2007). A large proportion of PC neurons responded to more than one of a number of core odorants that have been used to characterize foods. These core odorants would be expected to activate different patterns of OB glomeruli as a result of having different chemical structures (e.g., sulfides, esters). Thus, multiply responsive PC neurons imply a pattern of convergence. A large proportion of anterior PC neurons with nonlinear mixture effects further suggests cortical processing. Neither of these studies can differentiate between convergence mediated by OB afferents onto the distal dendrites of pyramidal neurons and convergence via intracortical pathways. Multisite recording from anterior PC, however, indicates both types of convergence (Rennaker, Chen, Ruyle, Sloan, & Wilson, 2007). In particular, 15% of the neurons showed a cross-correlogram indicative of direct cell-to-cell interaction, that is, intracortical convergence. Further evidence of convergence also comes from an analysis of olfactory adaptation (Kadohisa & Wilson, 2006a). Mitral and tufted cells adapt rapidly to continuous olfactory stimulation. 
Recording from anterior PC, however, showed that a novel olfactory stimulus applied once the adaptation had occurred reactivated the neuron, presumably via OB efferents acting indirectly through intracortical collaterals. Thus, the aPC effectively filters out background odors to enhance olfactory “figure-ground” relationships. The differential pattern of PC innervation, in which OB efferents dominate in aPC and intracortical association fibers dominate in pPC, suggests differential roles with respect to associative processing. Neurons in pPC become more broadly tuned as a function of odor experience, whereas neurons in aPC become more narrowly tuned (Kadohisa & Wilson, 2006b). The narrow tuning of aPC neurons is also associated with an experience-dependent lowering of response correlations between a mixture and its components, suggesting that aPC has synthesized the mixture and given it its own “identity,” that is, one uncorrelated with the component odors. In contrast, as neurons become more broadly tuned in pPC, individual neuron responses between a mixture and its binary components
become more highly correlated (i.e., a mixture and its components are more similar and presumably now share a common percept, for example, quality). Similar conclusions were reached in the human piriform cortex using fMRI (Gottfried, Winston, & Dolan, 2006). Perhaps the most striking difference between aPC and pPC comes from studies comparing neural response profiles during associative learning (Calu, Roesch, Stalnaker, & Schoenbaum, 2007; Roesch, Stalnaker, & Schoenbaum, 2007). Neurons in aPC and pPC were recorded as rats were conditioned to discriminate between two odors, one associated with a positive (sucrose) reinforcer and the other associated with a negative (quinine hydrochloride, QHCl) reinforcer. In general, neurons in both structures showed a high degree of associative activity, that is, (population) odorant responses became larger following the conditioning procedure. When the odor-conditioning pairs were reversed, however, such that an odor previously paired with sucrose was now paired with quinine, neurons in aPC continued to respond as they had during the prior conditioning, but neurons in the pPC altered their response characteristics, and neurons previously responsive only to one odor became responsive to the other when it was positively reinforced.
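The notion of a response correlation between a mixture and one of its components, used above to contrast aPC and pPC, can be illustrated with a Pearson correlation across a small set of neurons. The sketch below is in Python and is not the analysis used by Kadohisa and Wilson (2006b): the firing rates are invented for illustration, and the names `pearson`, `component`, `mixture_similar`, and `mixture_distinct` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented firing rates (spikes/s) for five neurons; not data from any study.
component = [2.0, 8.0, 5.0, 1.0, 9.0]          # response to one component odor
mixture_similar = [3.0, 7.0, 6.0, 2.0, 10.0]   # mixture pattern resembling the component
mixture_distinct = [7.0, 7.0, 3.0, 4.0, 4.0]   # mixture pattern unlike the component

r_similar = pearson(component, mixture_similar)
r_distinct = pearson(component, mixture_distinct)
print(f"similar pattern:  r = {r_similar:.2f}")
print(f"distinct pattern: r = {r_distinct:.2f}")
```

In this toy example the first mixture pattern is strongly correlated with the component response, the situation described for pPC after experience, whereas the second is uncorrelated with it, the situation described for aPC.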
ORBITAL FRONTAL CORTEX AND FLAVOR
Food in the mouth stimulates gustatory, olfactory (via a retronasal pathway), and somatosensory afferents that fuse into the perception of flavor. This fusion is primarily associated with specific regions of the orbital frontal cortex (OFC), although evidence for convergence has been described at lower levels in rodents. For example, a few neurons in the insular cortex of rats responded to both olfactory and gustatory stimulation (Yamamoto et al., 1989). But by and large, the neural substrate for flavor, and its close conceptual linkage with experience-dependent phenomena, has been best characterized in lateral orbital cortex, based on neural recordings from nonhuman primates and functional imaging studies in humans. The path for olfactory signals to the orbitofrontal cortex is both direct and indirect. Because some PC neurons project to OFC, the first neocortical representation of odorants is only two synapses removed from an odorant in a best-case scenario. A second, indirect pathway runs from the endopiriform nucleus (olfactory cortex) to the mediodorsal thalamus and then to the OFC; it has the virtue of dignifying the olfactory system with a thalamic relay, as found in all other major sensory systems. Gustatory projections reach the orbital cortex from the insula, although the pathway may involve more synapses than originally proposed (discussed in Pritchard & Norgren, 2004).
[Figure 13.18 plot: response in ventral insula/OFC (traces labeled frOP and vINS) against peri-stimulus time (sec); conditions: vanilla/sweet, vanilla/salty, sweet, vanilla, salty.]
Figure 13.18 (Figure C.23 in color section) Functional magnetic resonance imaging of human insular and orbitofrontal cortex shows enhanced activity when taste and smell are congruent (e.g., vanilla/sweet) compared to incongruent stimuli (e.g., vanilla/salty). Note: Color coding of graphs does not correspond to enhanced activity in the images. From figures 4 and 5 in “Experience-Dependent Neural Integration of Taste and Smell in the Human Brain,” by D. M. Small et al., 2004, Journal of Neurophysiology, 92, pp. 1896 & 1899. Reprinted with permission.
Sensory-responsive cells in OFC of monkeys include those responsive to olfaction, vision, and taste, as well as all binary combinations (Rolls & Baylis, 1994). Some of these neurons appear to have response profiles consistent with the concept of flavor, for example, responses to both a sweet taste and fruit odor. Somewhat more recently, cortical neurons responsive to fat have been described, but most of this responsiveness appears to be due to textural rather than chemosensory properties. Nevertheless, it is noteworthy that some of these neurons also responded to congruent olfactory stimuli, for example, the odor of cream (Rolls, Critchley, Browning, Hernadi, & Lenard, 1999). Functional MRI studies in humans support the existence of chemosensory convergence in OFC and have identified specific areas of the rostral or caudal OFC where these interactions take place. Figure 13.18 illustrates the important finding that this convergence resulted in synergistic activity only when a particular taste-olfactory combination was one that would be expected to occur in food; for example, sweet and vanilla, but not when the components of the combination were incongruent, for example, salty and vanilla (Small et al., 2004). A strong case can be made that the sensory convergence (pairing) of olfactory and gustatory signals that produces flavor depends on experience (Small et al., 2007). There is ample evidence from single cell studies in both rodent and
primate, as well as functional imaging studies in humans, that shows experience-dependent changes in OFC associated with satiety, habituation, classical conditioning, and reversal learning. Thus, over the course of a meal and the onset of satiety, food becomes less preferred. This lack of preference is associated with a specific loss of sensitivity in monkey OFC neurons that were initially responsive to the taste of a particular food. Thus, a neuron responsive to a sweet beverage loses its responsiveness to that beverage as it is consumed to satiety but retains gustatory responsiveness to other stimuli, for example, salt (Rolls et al., 1989). Odorant-responsive neurons in OFC show similar satiety effects (Critchley & Rolls, 1996a), as do fMRI images in human OFC (O’Doherty et al., 2000). Yet other studies in both human and nonhuman primates also show experience-dependent changes in chemosensory, particularly olfactory, responses in OFC. For example, olfactory response profiles of single OFC units change when an odor, initially paired with a positively reinforcing taste stimulus, is subsequently paired with a negatively reinforcing taste stimulus (Critchley & Rolls, 1996b). Thus, these neurons reflect the meaning or hedonic valence of the olfactory signal rather than its chemical structure. Even without associative pairing, fMRI studies in humans demonstrated that experience (exposure) alone was sufficient to induce changes in OFC odorant-induced activity correlated with perceptual changes (Gottfried, 2007). Exposure to a (target) odorant for 3.5 minutes increased the subject’s ability to differentiate the odor from a closely related odorant chosen to be similar in either quality (e.g., minty or floral) or chemical group (alcohol or ketone). Thus, OFC activity in response to the related compound increased after exposure to the target stimulus in parallel with the change in discriminability. 
Odorant compounds unrelated to the target stimulus showed no such change in activity. In a second experiment, Gottfried demonstrated that pairing an aversive shock with one of two initially indistinguishable odorants increased their perceptual discriminability as well as OFC activity associated with the CS (Gottfried, 2007). Outputs of the OFC to subcortical structures involved with behavioral choice (e.g., striatum) and food intake (lateral hypothalamus) confer on these multimodal chemosensory neurons an important role in adaptive, flexible behavior (Krettek & Price, 1977; McDonald, 1991; Ongur, An, & Price, 1998). Although the simple decision to ingest sweets and avoid bitters is made at the brain stem level in both humans and other animal species, the selection of many foods, particularly what appears in humans as an “acquired taste,” probably reflects an acquired flavor, for which we can thank our frontal cortex.
REFERENCES Abraham, N. M., Spors, H., Carleton, A., Margrie, T. W., Kuner, T., & Schaefer, A. T. (2004). Maintaining accuracy at the expense of speed: Stimulus similarity defines odor discrimination time in mice. Neuron, 44, 865–876.
Bezencon, C., le Coutre, J., & Damak, S. (2007). Taste-signaling proteins are coexpressed in solitary intestinal epithelial cells. Chemical Senses, 32, 41–49.
Accolla, R., Bathellier, B., Petersen, C. C., & Carleton, A. (2007). Differential spatial representation of taste modalities in the rat gustatory cortex. Journal of Neuroscience, 27, 1396–1404.
Bo, X., Alavi, A., Xiang, Z., Oglesby, I., Ford, A., & Burnstock, G. (1999). Localization of ATP-gated P2X2 and P2X3 receptor immunoreactive nerves in rat taste buds. NeuroReport, 10, 1107–1111.
Accolla, R., & Carleton, A. (2008). Internal body state influences topographical plasticity of sensory representations in the rat gustatory cortex. Proceedings of the National Academy of Sciences, USA, 105, 4010–4015.
Boughter, J. J., & Bachmanov, A. A. (2008). Genetics and evolution of taste. In S. Firestein & G. K. Beauchamp (Eds.), The senses: A comprehensive reference (Vol. 4, pp. 371–390). San Diego, CA: Academic Press.
Ache, B., & Young, J. M. (2008). Phylogeny of chemical sensitivity. In A. Basbaum, A. Kaneko, G. Shepherd, & G. Westheimer (Eds.), The senses: A comprehensive reference (Vol. 4, pp. 1–26). San Diego, CA: Academic Press.
Bradley, R. (2008). Neurotransmitters in the taste pathway. In S. Firestein & G. K. Beauchamp (Eds.), The senses: A comprehensive reference (pp. 261–270). San Diego, CA: Academic Press.
Adler, E., Hoon, M. A., Mueller, K. L., Chandrashekar, J., Ryba, N. J., & Zuker, C. S. (2000). A novel family of mammalian taste receptors. Cell, 100, 693–702. Adrian, E. D. (1950). The electrical activity of the mammalian olfactory bulb. Electroencephalography and Clinical Neurophysiology, 2, 377–388. Aroniadou-Anderjaska, V., Zhou, F. M., Priest, C. A., Ennis, M., & Shipley, M. T. (2000). Tonic and synaptically evoked presynaptic inhibition of sensory input to the rat olfactory bulb via GABA(B) heteroreceptors. Journal of Neurophysiology, 84, 1194–1203. Aungst, J. L., Heyward, P. M., Puche, A. C., Karnup, S. V., Hayar, A., Szabo, G., et al. (2003, December 11). Centre-surround inhibition among olfactory bulb glomeruli. Nature, 426, 623–629. Baird, J. P., Travers, S. P., & Travers, J. B. (2001). Integration of gastric distension and gustatory responses in the parabrachial nucleus. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 281, R1581–R1593. Barrie, J. M., Freeman, W. J., & Lenhart, M. D. (1996). Spatiotemporal analysis of prepyriform, visual, auditory, and somesthetic surface EEGs in trained rabbits. Journal of Neurophysiology, 76, 520–539. Bealer, S. L., & Smith, D. V. (1975). Multiple sensitivity to chemical stimuli in single human taste papillae. Physiology and Behavior, 14, 795–799. Beckman, M. E., & Whitehead, M. C. (1991). Intramedullary connections of the rostral nucleus of the solitary tract in the hamster. Brain Research, 557(1–2), 265–279. Beckstead, R. M., Morse, J. R., & Norgren, R. (1980). The nucleus of the solitary tract in the monkey: Projections to the thalamus and brain stem nuclei. Journal of Comparative Neurology, 190, 259–282. Beckstead, R. M., & Norgren, R. (1979). An autoradiographic examination of the central distribution of the trigeminal, facial, glossopharyngeal, and vagal nerves in the monkey. Journal of Comparative Neurology, 184, 455–472. Behrens, M., Foerster, S., Staehler, F., Raguse, J. 
D., & Meyerhof, W. (2007). Gustatory expression pattern of the human TAS2R bitter receptor gene family reveals a heterogenous population of bitter responsive taste receptor cells. Journal of Neuroscience, 27, 12630–12640. Belluscio, L., Koentges, G., Axel, R., & Dulac, C. (1999). A map of pheromone receptor activation in the mammalian brain. Cell, 97, 209–220. Benjamin, R., & Pfaffmann, C. (1953). Cortical localization of taste in the albino rat. Journal of Neurophysiology, 18, 56–64. Bernard, J. F., Alden, M., & Besson, J. M. (1993). The organization of the efferent projections from the pontine parabrachial area to the amygdaloid complex: A phaseolus vulgaris leucoagglutinin (PHA-L) study in the rat. Journal of Comparative Neurology, 329, 201–229.
Beshel, J., Kopell, N., & Kay, L. M. (2007). Olfactory bulb gamma oscillations are enhanced with task demands. Journal of Neuroscience, 27, 8358–8365.
Breer, H., Fleischer, J., & Strotmann, J. (2006). The sense of smell: Multiple olfactory subsystems. Cellular and Molecular Life Sciences, 63, 1465–1475. Brennan, P. A., & Zufall, F. (2006, November 16). Pheromonal communication in vertebrates. Nature, 444, 308–315. Breza, J. M., Curtis, K. S., & Contreras, R. J. (2007). Monosodium glutamate but not linoleic acid differentially activates gustatory neurons in the rat geniculate ganglion. Chemical Senses, 32, 833–846. Bruce, H. M. (1959, July 11). An exteroceptive block to pregnancy in the mouse. Nature, 184, 105. Buck, L., & Axel, R. (1991). A novel multigene family may encode odorant receptors: A molecular basis for odor recognition. Cell, 65, 175–187. Bufe, B., Breslin, P. A., Kuhn, C., Reed, D. R., Tharp, C. D., Slack, J. P., et al. (2005). The molecular basis of individual differences in phenylthiocarbamide and propylthiouracil bitterness perception. Current Biology, 15, 322–327. Bufe, B., Hofmann, T., Krautwurst, D., Raguse, J. D., & Meyerhof, W. (2002). The human TAS2R16 receptor mediates bitter taste in response to beta-glucopyranosides. Nature Genetics, 32, 397–401. Buonviso, N., Amat, C., & Litaudon, P. (2006). Respiratory modulation of olfactory neurons in the rodent brain. Chemical Senses, 31, 145–154. Buonviso, N., Chaput, M. A., & Berthommier, F. (1992). Temporal pattern analyses in pairs of neighboring mitral cells. Journal of Neurophysiology, 68, 417–424. Burton, M. J., Rolls, E. T., & Mora, F. (1976). Effects of hunger on the responses of neurons in the lateral hypothalamus to the sight and taste of food. Experimental Neurology, 51, 668–677. Calu, D. J., Roesch, M. R., Stalnaker, T. A., & Schoenbaum, G. (2007). Associative encoding in posterior piriform cortex during odor discrimination and reversal learning. Cerebral Cortex, 17, 1342–1349. Cang, J., & Isaacson, J. S. (2003). In vivo whole-cell recording of odor-evoked synaptic transmission in the rat olfactory bulb. 
Journal of Neuroscience, 23, 4108–4116. Carlson, G. C., Shipley, M. T., & Keller, A. (2000). Long-lasting depolarizations in mitral cells of the rat olfactory bulb. Journal of Neuroscience, 20, 2011–2021. Caterina, M. J., & Julius, D. (2001). The vanilloid receptor: A molecular gateway to the pain pathway. Annual Review of Neuroscience, 24, 487–517. Cattarelli, M., Astic, L., & Kauer, J. S. (1988). Metabolic mapping of 2-deoxyglucose uptake in the rat piriform cortex using computerized image processing. Brain Research, 442, 180–184. Chang, F. C., & Scott, T. R. (1984). Conditioned taste aversions modify neural responses in the rat nucleus tractus solitarius. Journal of Neuroscience, 4, 1850–1862.
Chaput, M. A. (1983). Effects of olfactory peduncle sectioning on the single unit responses of olfactory bulb neurons to odor presentation in awake rabbits. Chemical Senses, 8, 161–177. Chaput, M. A. (2000). EOG responses in anesthetized freely breathing rats. Chemical Senses, 25, 695–701. Chaudhari, N., Landin, A. M., & Roper, S. D. (2000). A metabotropic glutamate receptor variant functions as a taste receptor. Nature Neuroscience, 3, 113–119. Chen, W. R., & Shepherd, G. M. (2005). The olfactory glomerulus: A cortical module with specific functions. Journal of Neurocytology, 34(3/5), 353–360. Cho, Y. K., Li, C. S., & Smith, D. V. (2003). Descending influences from the lateral hypothalamus and amygdala converge onto medullary taste neurons. Chemical Senses, 28, 155–171.
Ennis, M., Zhou, F. M., Ciombor, K. J., Aroniadou-Anderjaska, V., Hayar, A., Borrelli, E., et al. (2001). Dopamine D2 receptor-mediated presynaptic inhibition of olfactory nerve terminals. Journal of Neurophysiology, 86, 2986–2997. Finger, T. E., Danilova, V., Barrows, J., Bartel, D. L., Vigers, A. J., Stone, L., et al. (2005, December 2). ATP signaling is crucial for communication from taste buds to gustatory nerves. Science, 310, 1495–1499. Flaherty, C. F., Turovsky, J., & Krauss, K. L. (1994). Relative hedonic value modulates anticipatory contrast. Physiology and Behavior, 55, 1047–1054. Fontanini, A., & Katz, D. B. (2006). State-dependent modulation of time-varying gustatory responses. Journal of Neurophysiology, 96, 3183–3193. Fox, A. (1932). The relationship between chemical constitution and taste. Proceedings of the National Academy of Sciences, USA, 18, 115–120.
Contreras, R. J. (1977). Changes in gustatory nerve discharges with sodium deficiency: A single unit analysis. Brain Research, 121, 373–378.
Frank, M. E. (1973). An analysis of hamster afferent taste nerve response functions. Journal of General Physiology, 61, 588–618.
Critchley, H. D., & Rolls, E. T. (1996a). Hunger and satiety modify the responses of olfactory and visual neurons in the primate orbitofrontal cortex. Journal of Neurophysiology, 75, 1673–1686.
Frank, M. E. (1991). Taste-responsive neurons of the glossopharyngeal nerve of the rat. Journal of Neurophysiology, 65, 1452–1463.
Critchley, H. D., & Rolls, E. T. (1996b). Olfactory neuronal responses in the primate orbitofrontal cortex: Analysis in an olfactory discrimination task. Journal of Neurophysiology, 75, 1659–1672.
Frank, M. E., Contreras, R. J., & Hettinger, T. P. (1983). Nerve fibers sensitive to ionic taste stimuli in chorda tympani of the rat. Journal of Neurophysiology, 50, 941–960.
Dade, L. A., Zatorre, R. J., & Jones-Gotman, M. (2002). Olfactory learning: Convergent findings from lesion and brain imaging studies in humans. Brain, 125(Pt. 1), 86–101.
Fukuwatari, T., Kawada, T., Tsuruta, M., Hiraoka, T., Iwanaga, T., Sugimoto, E., et al. (1997). Expression of the putative membrane fatty acid transporter (FAT) in taste buds of the circumvallate papillae in rats. FEBS Letters, 414, 461–464.
Damak, S., Rong, M., Yasumatsu, K., Kokrashvili, Z., Varadarajan, V., Zou, S., et al. (2003). Detection of sweet and umami taste in the absence of taste receptor T1r3. Science, 301, 850–853.
Fulwiler, C. E., & Saper, C. B. (1984). Subnuclear organization of the efferent connections of the parabrachial nucleus in the rat. Brain Research, 319, 229–259.
Davis, B. J., & Jang, T. (1988). A golgi analysis of the gustatory zone of the nucleus of the solitary tract in the adult hamster. Journal of Comparative Neurology, 278, 388–396.
Gaillard, D., Laugerette, F., Darcel, N., El-Yassimi, A., Passilly-Degrace, P., Hichami, A., et al. (2008). The gustatory pathway is involved in CD36-mediated orosensory perception of long-chain fatty acids in the mouse. FASEB Journal, 22, 1458–1468.
de Araujo, I. E., Gutierrez, R., Oliveira-Maia, A. J., Pereira, A., Jr., Nicolelis, M. A., & Simon, S. A. (2006). Neural ensemble coding of satiety states. Neuron, 51, 483–494. Delay, E. R., Hernandez, N. P., Bromley, K., & Margolskee, R. F. (2006). Sucrose and monosodium glutamate taste thresholds and discrimination ability of T1R3 knockout mice. Chemical Senses, 31, 351–357. DeSimone, J. A., Lyall, V., Heck, G. L., Phan, T. H., Alam, R. I., Feldman, G. M., et al. (2001). A novel pharmacological probe links the amiloride-insensitive NaCl, KCl, and NH(4)Cl chorda tympani taste responses. Journal of Neurophysiology, 86, 2638–2641. Dethier, V. G. (1987). Sniff, flick, and pulse, an appreciation of interruption. Proceedings of the American Philosophical Society, 131, 159–174. Dickman, J., & Smith, D. (1989). Topographic distribution of taste responsiveness in the hamster medulla. Chemical Senses, 14, 231–247. Di Lorenzo, P. M. (1988). Taste responses in the parabrachial pons of decerebrate rats. Journal of Neurophysiology, 59, 1871–1887. Di Lorenzo, P. M., & Monroe, S. (1992). Corticofugal input to taste-responsive units in the parabrachial pons. Brain Research Bulletin, 29, 925–930. Di Lorenzo, P. M., & Monroe, S. (1995). Corticofugal influence on taste responses in the nucleus of the solitary tract in the rat. Journal of Neurophysiology, 74, 258–272. Di Lorenzo, P. M., & Victor, J. D. (2003). Taste response variability and temporal coding in the nucleus of the solitary tract of the rat. Journal of Neurophysiology, 90, 1418–1431. Dulac, C., & Axel, R. (1995). A novel family of genes encoding putative pheromone receptors in mammals. Cell, 83, 195–206. Dyer, J., Salmon, K. S., Zibrik, L., & Shirazi-Beechey, S. P. (2005). Expression of sweet taste receptors of the T1R family in the intestinal tract and enteroendocrine cells. Biochemical Society Transactions, 33(Pt. 1), 302–305.
Geran, L. C., & Travers, S. P. (2006). Single neurons in the nucleus of the solitary tract respond selectively to bitter taste stimuli. Journal of Neurophysiology, 96, 2513–2527. Gilbertson, T. A., Boughter, J. D., Jr., Zhang, H., & Smith, D. V. (2001). Distribution of gustatory sensitivities in rat taste cells: Whole-cell responses to apical chemical stimulation. Journal of Neuroscience, 21, 4931–4941. Gilbertson, T. A., Fontenot, D. T., Liu, L., Zhang, H., & Monroe, W. T. (1997). Fatty acid modulation of K channels in taste receptor cells: Gustatory cues for dietary fat. American Journal of Physiology, 272, C1203–C1210. Giza, B. K., Deems, R. O., Vanderweele, D. A., & Scott, T. R. (1993). Pancreatic glucagon suppresses gustatory responsiveness to glucose. Journal of General Physiology, 265, R1231–R1237. Giza, B. K., & Scott, T. R. (1983). Blood glucose selectively affects taste-evoked activity in rat nucleus tractus solitarius. Physiology and Behavior, 31, 643–650. Giza, B. K., & Scott, T. R. (1987). Intravenous insulin infusions in rats decrease gustatory-evoked responses to sugars. Journal of General Physiology, 252, R994–R1002. Glenn, J. F., & Erickson, R. P. (1976). Gastric modulation of gustatory afferent activity. Physiology and Behavior, 16, 561–568. Glusman, G., Yanai, I., Rubin, I., & Lancet, D. (2001). The complete human olfactory subgenome. Genome Research, 11, 685–702. Gottfried, J. A. (2007). What can an orbitofrontal cortex-endowed animal do with smells? Annals of the New York Academy of Sciences, 1121, 102–120. Gottfried, J. A., Winston, J. S., & Dolan, R. J. (2006). Dissociable codes of odor quality and odorant structure in human piriform cortex. Neuron, 49, 467–479.
Grabauskas, G., & Bradley, R. M. (1996). Synaptic interactions due to convergent input from gustatory afferent fibers in the rostral nucleus of the solitary tract. Journal of Neurophysiology, 76, 2919–2927.
Hayar, A., Karnup, S., Ennis, M., & Shipley, M. T. (2004). External tufted cells: A major excitatory element that coordinates glomerular activity. Journal of Neuroscience, 24, 6676–6685.
Grill, H. J., & Norgren, R. (1978a, July 21). Chronically decerebrate rats demonstrate satiation but not bait shyness. Science, 201, 267–269.
Hayar, A., Karnup, S., Shipley, M. T., & Ennis, M. (2004). Olfactory bulb glomeruli: External tufted cells intrinsically burst at theta frequency and are entrained by patterned olfactory input. Journal of Neuroscience, 24, 1190–1199.
Grill, H. J., & Norgren, R. (1978b). The taste reactivity test: Pt. I. Mimetic responses to gustatory stimuli in neurologically normal rats. Brain Research, 143, 263–279. Grill, H. J., & Norgren, R. (1978c). The taste reactivity test: Pt. II. Mimetic responses to gustatory stimuli in chronic thalamic and chronic decerebrate rats. Brain Research, 143, 281–297. Grill, H. J., Schulkin, J., & Flynn, F. W. (1986). Sodium homeostasis in chronic decerebrate rats. Behavioral Neuroscience, 100, 536–543. Grill, H. J., & Smith, G. P. (1988). Cholecystokinin decreases sucrose intake in chronic decerebrate rats. Journal of General Physiology, 254, R853–R856. Grosmaitre, X., Santarelli, L. C., Tan, J., Luo, M., & Ma, M. (2007). Dual functions of mammalian olfactory sensory neurons as odor detectors and mechanical sensors. Journal of Neuroscience, 10, 348–354. Grossman, S. E., Fontanini, A., Wieskopf, J. S., & Katz, D. B. (2008). Learning-related plasticity of temporal coding in simultaneously recorded amygdala-cortical ensembles. Journal of Neuroscience, 28, 2864–2873. Hajnal, A., & Norgren, R. (2005). Taste pathways that mediate accumbens dopamine release by sapid sucrose. Physiology and Behavior, 84, 363–369.
Hermann, G. E., & Rogers, R. C. (1985). Convergence of vagal and gustatory afferent input within the parabrachial nucleus of the rat. Journal of the Autonomic Nervous System, 13(1), 1–17. Herness, S. (2000). Coding in taste receptor cells: The early years of intracellular recordings. Physiology and Behavior, 69(1/2), 17–27. Herness, S., Zhao, F. L., Lu, S. G., Kaya, N., & Shen, T. (2002). Expression and physiological actions of cholecystokinin in rat taste receptor cells. Journal of Neuroscience, 22, 10018–10029. Herrada, G., & Dulac, C. (1997). A novel family of putative pheromone receptors in mammals with a topographically organized and sexually dimorphic distribution. Cell, 90, 763–773. Hildebrand, J. G., & Shepherd, G. M. (1997). Mechanisms of olfactory discrimination: Converging evidence for common principles across phyla. Annual Review of Neuroscience, 20, 595–631. Hopfield, J. J. (1995, July 6). Pattern recognition computation using action potential timing for stimulus representation. Nature, 376, 33–36.
Hajnal, A., Takenouchi, K., & Norgren, R. (1999). Effect of intraduodenal lipid on parabrachial gustatory coding in awake rats. Journal of Neuroscience, 19, 7182–7190.
Huang, A. L., Chen, X., Hoon, M. A., Chandrashekar, J., Guo, W., Trankner, D., et al. (2006, August 24). The cells and logic for mammalian sour taste detection. Nature, 442, 934–938.
Halpern, B. P. (1965, October 23). Chemotopic organization in the bulbar gustatory relay of the rat. Nature, 208, 393–395.
Huang, Y. J., Maruyama, Y., Dvoryanchikov, G., Pereira, E., Chaudhari, N., & Roper, S. D. (2007). The role of pannexin 1 hemichannels in ATP release and cell-cell communication in mouse taste buds. Proceedings of the National Academy of Sciences, USA, 104, 6436–6441.
Halsell, C. B., & Frank, M. E. (1992). Organization of taste-evoked activity in the hamster parabrachial nucleus. Brain Research, 572(1/2), 286–290. Halsell, C. B., & Travers, S. P. (1997). Anterior and posterior oral cavity responsive neurons are differentially distributed among parabrachial subnuclei in rat. Journal of Neurophysiology, 78, 920–938. Halsell, C. B., Travers, S. P., & Travers, J. B. (1996). Ascending and descending projections from the rostral nucleus of the solitary tract originate from separate neuronal populations. Neuroscience, 72, 185–197. Hamilton, R. B., & Norgren, R. (1984). Central projections of gustatory nerves in the rat. Journal of Comparative Neurology, 222, 560–577. Hanamori, T., Kunitake, T., Kato, K., & Kannan, H. (1997). Convergence of afferent inputs from the chorda tympani, lingual-tonsillar and pharyngeal branches of the glossopharyngeal nerve, and superior laryngeal nerve on the neurons in the insular cortex in rats. Brain Research, 763, 267–270. Hanamori, T., Miller, I. J., Jr., & Smith, D. V. (1988). Gustatory responsiveness of fibers in the hamster glossopharyngeal nerve. Journal of Neurophysiology, 60, 478–498. Hao, S., Sternini, C., & Raybould, H. E. (2008). Role of CCK1 and Y2 receptors in activation of hindbrain neurons induced by intragastric administration of bitter taste receptor ligands. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 294(1), R33–R38. Harrer, M. I., & Travers, S. P. (1996). Topographic organization of fos-like immunoreactivity in the rostral nucleus of the solitary tract evoked by gustatory stimulation with sucrose and quinine. Brain Research, 711(1/2), 125–137. Hayama, T., Hashimoto, K., & Ogawa, H. (1994). Anatomical location of a taste-related region in the thalamic reticular nucleus in rats. Neurosciences Research, 18, 291–299.
Heck, G. L., Mierson, S., & DeSimone, J. A. (1984, January 27). Salt taste transduction occurs through an amiloride-sensitive sodium transport pathway. Science, 223, 403–405.
Hunt, S. P., Pini, A., & Evan, G. (1987, August 13). Induction of c-fos-like protein in spinal cord neurons following sensory stimulation. Nature, 328, 632–634. Illig, K. R., & Haberly, L. B. (2003). Odor-evoked activity is spatially distributed in piriform cortex. Journal of Comparative Neurology, 457, 361–373. Ishimaru, Y., Inada, H., Kubota, M., Zhuang, H., Tominaga, M., & Matsunami, H. (2006). Transient receptor potential family members PKD1L3 and PKD2L1 form a candidate sour taste receptor. Proceedings of the National Academy of Sciences, USA, 103, 12569–12574. Jacobs, K. M., Mark, G. P., & Scott, T. R. (1988). Taste responses in the nucleus tractus solitarius of sodium-deprived rats. Journal of Physiology, 406, 393–410. Johnson, B. A., & Leon, M. (2007). Chemotopic odorant coding in a mammalian olfactory system. Journal of Comparative Neurology, 503, 1–34. Jones, L. M., Fontanini, A., Sadacca, B. F., Miller, P., & Katz, D. B. (2007). Natural stimuli evoke dynamic sequences of states in sensory cortical ensembles. Proceedings of the National Academy of Sciences, USA, 104, 18772–18777. Joshi, D., Volkl, M., Shepherd, G. M., & Laska, M. (2006). Olfactory sensitivity for enantiomers and their racemic mixtures: A comparative study in CD-1 mice and spider monkeys. Chemical Senses, 31, 655–664. Jung, M. W., Larson, J., & Lynch, G. (1990). Long-term potentiation of monosynaptic EPSPs in rat piriform cortex in vitro. Synapse, 6, 279–283. Kadohisa, M., & Wilson, D. A. (2006a). Olfactory cortical adaptation facilitates detection of odors against background. Journal of Neurophysiology, 95, 1888–1896.
Kadohisa, M., & Wilson, D. A. (2006b). Separate encoding of identity and similarity of complex familiar odors in piriform cortex. Proceedings of the National Academy of Sciences, USA, 103, 15206–15211.
Laska, M., Joshi, D., & Shepherd, G. M. (2006). Olfactory sensitivity for aliphatic aldehydes in CD-1 mice. Behavioural Brain Research, 167, 349–354.
Kanter, E. D., & Haberly, L. B. (1990). NMDA-dependent induction of long-term potentiation in afferent and association fiber systems of piriform cortex in vitro. Brain Research, 525, 175–179.
Laugerette, F., Passilly-Degrace, P., Patris, B., Niot, I., Febbraio, M., Montmayeur, J. P., et al. (2005). CD36 involvement in orosensory detection of dietary lipids, spontaneous fat preference, and digestive secretions. Journal of Clinical Investigation, 115, 3177–3184.
Kanter, E. D., & Haberly, L. B. (1993). Associative long-term potentiation in piriform cortex slices requires GABAA blockade. Journal of Neuroscience, 13, 2477–2482. Karimnamazi, H., Travers, S. P., & Travers, J. B. (2002). Oral and gastric input to the parabrachial nucleus of the rat. Brain Research, 957, 193–206. Kataoka, S., Yang, R., Ishimaru, Y., Matsunami, H., Sevigny, J., Kinnamon, J. C., et al. (2008). The candidate sour taste receptor, PKD2L1, is expressed by Type III taste cells in the mouse. Chemical Senses, 33, 243–254. Katz, D. B., Simon, S. A., & Nicolelis, M. A. (2001). Dynamic and multimodal responses of gustatory cortical neurons in awake rats. Journal of Neuroscience, 21, 4478–4489. Kaya, N., Shen, T., Lu, S. G., Zhao, F. L., & Herness, S. (2004). A paracrine signaling role for serotonin in rat taste buds: Expression and localization of serotonin receptor subtypes. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 286, R649–R658. Kelliher, K. R., Spehr, M., Li, X. H., Zufall, F., & Leinders-Zufall, T. (2006). Pheromonal recognition memory induced by TRPC2-independent vomeronasal sensing. European Journal of Neuroscience, 23, 3385–3390. Kepecs, A., Uchida, N., & Mainen, Z. F. (2006). The sniff as a unit of olfactory processing. Chemical Senses, 31, 167–179. Kim, U. K., Jorgenson, E., Coon, H., Leppert, M., Risch, N., & Drayna, D. (2003, February 21). Positional cloning of the human quantitative trait locus underlying taste sensitivity to phenylthiocarbamide. Science, 299, 1221–1225. King, M. S., & Bradley, R. M. (1994). Relationship between structure and function of neurons in the rat rostral nucleus tractus solitarii. Journal of Comparative Neurology, 344, 50–64. Kinnamon, J., & Yang, R. (2008). Ultrastructure of taste buds. In S. Firestein & G. K. Beauchamp (Eds.), The senses: A comprehensive reference (Vol. 4, pp. 000–000). San Diego, CA: Academic Press. Kinnamon, S. C., & Margolskee, R. F. (2008). 
Taste transduction. In S. Firestein & G. K. Beauchamp (Eds.), The senses: A comprehensive reference (Vol. 4, pp. 219–236). San Diego, CA: Academic Press. Kitagawa, J., Shingai, T., Kajii, Y., Takahashi, Y., Taguchi, Y., & Matsumoto, S. (2007). Leptin modulates the response to oleic acid in the pharynx. Neuroscience Letters, 423, 109–112. Kosaka, K., & Kosaka, T. (2005). Synaptic organization of the glomerulus in the main olfactory bulb: Compartments of the glomerulus and heterogeneity of the periglomerular cells. Anatomical Science International, 80(2), 80–90. Kosar, E., Grill, H. J., & Norgren, R. (1986a). Gustatory cortex in the rat: Pt. I. Physiological properties and cytoarchitecture. Brain Research, 379, 329–341. Kosar, E., Grill, H. J., & Norgren, R. (1986b). Gustatory cortex in the rat: Pt. II. Thalamocortical projections. Brain Research, 379, 342–352. Krettek, J. E., & Price, J. L. (1977). Projections from the amygdaloid complex and adjacent olfactory structures to the entorhinal cortex and to the subiculum in the rat and cat. Journal of Comparative Neurology, 172, 723–752. Lasiter, P. S., & Kachele, D. L. (1988). Organization of GABA and GABA-transaminase containing neurons in the gustatory zone of the nucleus of the solitary tract. Brain Research Bulletin, 21, 623–636.
Laurent, G., Stopfer, M., Friedrich, R. W., Rabinovich, M. I., Volkovskii, A., & Abarbanel, H. D. (2001). Odor encoding as an active, dynamical process: Experiments, computation, and theory. Annual Review of Neuroscience, 24, 263–297. Lei, H., Mooney, R., & Katz, L. C. (2006). Synaptic integration of olfactory information in mouse anterior olfactory nucleus. Journal of Neuroscience, 26, 12023–12032. Leinders-Zufall, T., Brennan, P., Widmayer, P., Chandramani, S. P., Maul-Pavicic, A., Jager, M., et al. (2004, November 5). MHC class I peptides as chemosensory signals in the vomeronasal organ. Science, 306, 1033–1037. Leinders-Zufall, T., Lane, A. P., Puche, A. C., Ma, W., Novotny, M. V., Shipley, M. T., et al. (2000, June 15). Ultrasensitive pheromone detection by mammalian vomeronasal neurons. Nature, 405, 792–796. Lemon, C. H., & Smith, D. V. (2005). Neural representation of bitter taste in the nucleus of the solitary tract. Journal of Neurophysiology, 94, 3719–3729. Li, C. S., Cho, Y. K., & Smith, D. V. (2005). Modulation of parabrachial taste neurons by electrical and chemical stimulation of the lateral hypothalamus and amygdala. Journal of Neurophysiology, 93, 1183–1196. Li, C. S., & Smith, D. V. (1997). Glutamate receptor antagonists block gustatory afferent input to the nucleus of the solitary tract. Journal of Neurophysiology, 77, 1514–1525. Li, X., Li, W., Wang, H., Cao, J., Maehashi, K., Huang, L., et al. (2005). Pseudogenization of a sweet-receptor gene accounts for cats’ indifference toward sugar. PLoS Genetics, 1(1), 27–35. Li, X., Staszewski, L., Xu, H., Durick, K., Zoller, M., & Adler, E. (2002). Human receptors for sweet and umami taste. Proceedings of the National Academy of Sciences, USA, 99, 4692–4696. Liman, E. R., Corey, D. P., & Dulac, C. (1999). TRP2: A candidate transduction channel for mammalian pheromone sensory signaling. Proceedings of the National Academy of Sciences, USA, 96, 5791–5796. Lison, M., Blondheim, S. H., & Melmed, R. N. 
(1980). A polymorphism of the ability to smell urinary metabolites of asparagus. British Medical Journal, 281, 1676–1678. LopezJimenez, N. D., Cavenagh, M. M., Sainz, E., Cruz-Ithier, M. A., Battey, J. F., & Sullivan, S. L. (2006). Two members of the TRPP family of ion channels, Pkd1l3 and Pkd2l1, are co-expressed in a subset of taste receptor cells. Journal of Neurochemistry, 98, 68–77. Lundy, R. F., Jr., & Contreras, R. J. (1999). Gustatory neuron types in rat geniculate ganglion. Journal of Neurophysiology, 82, 2970–2988. Lundy, R. F., Jr., & Norgren, R. (2004a). Activity in the hypothalamus, amygdala, and cortex generates bilateral and convergent modulation of pontine gustatory neurons. Journal of Neurophysiology, 91, 1143–1157. Lundy, R. F., Jr., & Norgren, R. (2004b). Gustatory system (3rd ed.). San Diego, CA: Elsevier Academic Press. Luo, M., Fee, M. S., & Katz, L. C. (2003, February 21). Encoding pheromonal signals in the accessory olfactory bulb of behaving mice. Science, 299, 1196–1201. Lyall, V., Heck, G. L., Vinnikova, A. K., Ghosh, S., Phan, T. H., Alam, R. I., et al. (2004). The mammalian amiloride-insensitive nonspecific salt taste receptor is a vanilloid receptor-1 variant. Journal of Physiology, 558(Pt. 1), 147–159.
Lyall, V., Pasley, H., Phan, T. H., Mummalaneni, S., Heck, G. L., Vinnikova, A. K., et al. (2006). Intracellular pH modulates taste receptor cell volume and the phasic part of the chorda tympani response to acids. Journal of General Physiology, 127, 15–34. Macrides, F., & Chorover, S. L. (1972, January 7). Olfactory bulb units: Activity correlated with inhalation cycles and odor quality. Science, 175, 84–87.
Mombaerts, P. (2004c). Odorant receptor gene choice in olfactory sensory neurons: The one receptor-one neuron hypothesis revisited. Current Opinion in Neurobiology, 14, 31–36.
Malnic, B. (2007). Searching for the ligands of odorant receptors. Molecular Neurobiology, 35(2), 175–181.
Mombaerts, P., Wang, F., Dulac, C., Chao, S. K., Nemes, A., Mendelsohn, M., et al. (1996). Visualizing an olfactory sensory map. Cell, 87, 675–686.
Mandiyan, V. S., Coats, J. K., & Shah, N. M. (2005). Deficits in sexual and aggressive behaviors in Cnga2 mutant mice. Nature Neuroscience, 8, 1660–1662.
Mueller, K. L., Hoon, M. A., Erlenbach, I., Chandrashekar, J., Zuker, C. S., & Ryba, N. J. (2005, March 10). The receptors and coding logic for bitter taste. Nature, 434, 225–229.
Margolskee, R. F., Dyer, J., Kokrashvili, Z., Salmon, K. S., Ilegems, E., Daly, K., et al. (2007). T1R3 and gustducin in gut sense sugars to regulate expression of Na-glucose cotransporter 1. Proceedings of the National Academy of Sciences, USA, 104, 15075–15080.
Murphy, G. J., Darcy, D. P., & Isaacson, J. S. (2005). Intraglomerular inhibition: Signaling mechanisms of an olfactory microcircuit. Nature Neuroscience, 8, 354–364.
Margrie, T. W., & Schaefer, A. T. (2003). Theta oscillation coupled spike latencies yield computational vigour in a mammalian sensory system. Journal of Physiology, 546(Pt. 2), 363–374.
Nagai, T., Mistretta, C. M., & Bradley, R. M. (1988). Developmental decrease in size of peripheral receptive fields of single chorda tympani nerve fibers and relation to increasing NaCl taste sensitivity. Journal of Neuroscience, 8, 64–72.
Mark, G. P., Scott, T. R., Chang, F. C., & Grill, H. J. (1988). Taste responses in the nucleus tractus solitarius of the chronic decerebrate rat. Brain Research, 443(1–2), 137–148.
Nakamura, K., & Norgren, R. (1991). Gustatory responses of neurons in the nucleus of the solitary tract of behaving rats. Journal of Neurophysiology, 66, 1232–1248.
Martin, C., Gervais, R., Hugues, E., Messaoudi, B., & Ravel, N. (2004). Learning modulation of odor-induced oscillatory responses in the rat olfactory bulb: A correlate of odor recognition? Journal of Neuroscience, 24, 389–397.
Nelson, G., Chandrashekar, J., Hoon, M. A., Feng, L., Zhao, G., Ryba, N. J., et al. (2002, March 14). An amino-acid taste receptor. Nature, 416, 199–202.
Matsunami, H., & Buck, L. B. (1997). A multigene family encoding a diverse array of putative pheromone receptors in mammals. Cell, 90, 775–784. Max, M., & Meyerhof, W. (2008). Taste receptors. In S. Firestein & G. K. Beauchamp (Eds.), The senses: A comprehensive reference (Vol. 4, pp. 197–217). San Diego, CA: Academic Press. May, O. L., Erisir, A., & Hill, D. L. (2007). Ultrastructure of primary afferent terminals and synapses in the rat nucleus of the solitary tract: Comparison among the greater superficial petrosal, chorda tympani, and glossopharyngeal nerves. Journal of Comparative Neurology, 502, 1066–1078. May, O. L., & Hill, D. L. (2006). Gustatory terminal field organization and developmental plasticity in the nucleus of the solitary tract revealed through triple-fluorescence labeling. Journal of Comparative Neurology, 497, 658–669. McCaughey, S. A., & Scott, T. R. (2000). Rapid induction of sodium appetite modifies taste-evoked activity in the rat nucleus of the solitary tract. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 279, R1121–R1131. McDonald, A. J. (1991). Topographical organization of amygdaloid projections to the caudatoputamen, nucleus accumbens, and related striatal-like areas of the rat brain. Neuroscience, 44, 15–33. Menashe, I., Abaffy, T., Hasin, Y., Goshen, S., Yahalom, V., Luetje, C. W., et al. (2007). Genetic elucidation of human hyperosmia to isovaleric acid. PLoS Biology, 5(11), e284. Meredith, M. (1994). Chronic recording of vomeronasal pump activation in awake behaving hamsters. Physiology and Behavior, 56, 345–354. Miller, F. R., & Sherrington, C. S. (1916). Some observations on the buccopharyngeal stage of reflex deglutition in the cat. Quarterly Journal of Experimental Physiology, 9, 147–186.
Nelson, G., Hoon, M. A., Chandrashekar, J., Zhang, Y., Ryba, N. J., & Zuker, C. S. (2001). Mammalian sweet taste receptors. Cell, 106, 381–390. Neville, K. R., & Haberly, L. B. (2003). Beta and gamma oscillations in the olfactory system of the urethane-anesthetized rat. Journal of Neurophysiology, 90, 3921–3930. Neville, K. R., & Haberly, L. B. (2004). Olfactory cortex. In G. M. Shepherd (Ed.), The synaptic organization of the brain (5th ed., pp. 415–454). New York: Oxford University Press. Niimura, Y., & Nei, M. (2003). Evolution of olfactory receptor genes in the human genome. Proceedings of the National Academy of Sciences, USA, 100, 12235–12240. Niimura, Y., & Nei, M. (2005). Evolutionary dynamics of olfactory receptor genes in fishes and tetrapods. Proceedings of the National Academy of Sciences, USA, 102, 6039–6044. Ninomiya, Y., Sako, N., & Funakoshi, M. (1989). Strain differences in amiloride inhibition of NaCl responses in mice, mus musculus. Journal of Comparative Physiology: A, 166, 1–5. Ninomiya, Y., Sako, N., & Imai, Y. (1995). Enhanced gustatory neural responses to sugars in the diabetic db/db mouse. Journal of General Physiology, 269, R930–R937. Nishijo, H., & Norgren, R. (1990). Responses from parabrachial gustatory neurons in behaving rats. Journal of Neurophysiology, 63, 707–724. Nishijo, H., Uwano, T., Tamura, R., & Ono, T. (1998). Gustatory and multimodal neuronal responses in the amygdala during licking and discrimination of sensory stimuli in awake rats. Journal of Neurophysiology, 79, 21–36. Norgren, R. (1974). Gustatory afferents to ventral forebrain. Brain Research, 81, 285–295. Norgren, R., & Leonard, C. M. (1971, September 17). Taste pathways in rat brainstem. Science, 173, 1136–1139.
Mitchell, S. C. (2001). Food idiosyncrasies: Beetroot and asparagus. Drug Metabolism and Disposition, 29, 539–543.
Norgren, R., & Leonard, C. M. (1973). Ascending central gustatory pathways. Journal of Comparative Neurology, 150, 217–237.
Mitchell, S. C., Waring, R. H., Land, D., & Thorpe, W. V. (1987). Odorous urine following asparagus ingestion in man. Experientia, 43, 382–383.
Nusser, Z., Kay, L. M., Laurent, G., Homanics, G. E., & Mody, I. (2001). Disruption of GABA(A) receptors on GABAergic interneurons leads to increased oscillatory power in the olfactory bulb network. Journal of Neurophysiology, 86, 2823–2833.
Mombaerts, P. (2004a). Genes and ligands for odorant, vomeronasal and taste receptors. Nature Reviews: Neuroscience, 5, 263–278.
Mombaerts, P. (2004b). Love at first smell: The 2004 Nobel Prize in physiology or medicine. New England Journal of Medicine, 351, 2579–2580.
O’Doherty, J., Rolls, E. T., Francis, S., Bowtell, R., McGlone, F., Kobal, G., et al. (2000). Sensory-specific satiety-related olfactory activation of the human orbitofrontal cortex. NeuroReport, 11, 399–403. Ogawa, H., Hasegawa, K., & Murayama, N. (1992). Difference in taste quality coding between two cortical taste areas, granular and dysgranular insular areas, in rats. Experimental Brain Research, 91, 415–424. Ogawa, H., Hayama, T., & Ito, S. (1982). Convergence of input from tongue and palate to the parabrachial nucleus neurons of rats. Neuroscience Letters, 28, 9–14. Ogawa, H., Ito, S., Murayama, N., & Hasegawa, K. (1990). Taste area in granular and dysgranular insular cortices in the rat identified by stimulation of the entire oral cavity. Neurosciences Research, 9, 196–201. Ongur, D., An, X., & Price, J. L. (1998). Prefrontal cortical projections to the hypothalamus in macaque monkeys. Journal of Comparative Neurology, 401, 480–505. Pecina, S., & Berridge, K. C. (2000). Opioid site in nucleus accumbens shell mediates eating and hedonic “liking” for food: Map based on microinjection fos plumes. Brain Research, 863(1–2), 71–86. Pfaffmann, C. (1941). Gustatory afferent impulses. Journal of Cellular and Comparative Physiology, 17, 243–258. Pittman, D., Crawley, M. E., Corbin, C. H., & Smith, K. R. (2007). Chorda tympani nerve transection impairs the gustatory detection of free fatty acids in male and female rats. Brain Research, 1151, 74–83. Plata-Salaman, C. R., Smith-Swintosky, V. L., & Scott, T. R. (1996). Gustatory neural coding in the monkey cortex: Mixtures. Journal of Neurophysiology, 75, 2369–2379. Potter, H., & Chorover, S. L. (1976). Response plasticity in hamster olfactory bulb: Peripheral and central processes. Brain Research, 116, 417–429. Price, J. L. (1973). An autoradiographic study of complementary laminar patterns of termination of afferent fibers to the olfactory cortex. Journal of Comparative Neurology, 150, 87–108. Pritchard, T. 
C., Hamilton, R. B., Morse, J. R., & Norgren, R. (1986). Projections of thalamic gustatory and lingual areas in the monkey, Macaca fascicularis. Journal of Comparative Neurology, 244, 213–228.
Rinberg, D., Koulakov, A., & Gelperin, A. (2006). Speed-accuracy tradeoff in olfaction. Neuron, 51, 351–358. Rodriguez, I., Feinstein, P., & Mombaerts, P. (1999). Variable patterns of axonal projections of sensory neurons in the mouse vomeronasal system. Cell, 97, 199–208. Roesch, M. R., Stalnaker, T. A., & Schoenbaum, G. (2007). Associative encoding in anterior piriform cortex versus orbitofrontal cortex during odor discrimination and reversal learning. Cerebral Cortex, 17, 643–652. Roitman, M. F., Wheeler, R. A., & Carelli, R. M. (2005). Nucleus accumbens neurons are innately tuned for rewarding and aversive taste stimuli, encode their predictors, and are linked to motor output. Neuron, 45, 587–597. Rolls, E. T., & Baylis, L. L. (1994). Gustatory, olfactory, and visual convergence within the primate orbitofrontal cortex. Journal of Neuroscience, 14, 5437–5452. Rolls, E. T., Critchley, H. D., Browning, A. S., Hernadi, I., & Lenard, L. (1999). Responses to the sensory properties of fat of neurons in the primate orbitofrontal cortex. Journal of Neuroscience, 19, 1532–1540. Rolls, E. T., Scott, T. R., Sienkiewicz, Z. J., & Yaxley, S. (1988). The responsiveness of neurones in the frontal opercular gustatory cortex of the macaque monkey is independent of hunger. Journal of Physiology, 397, 1–12. Rolls, E. T., Sienkiewicz, Z. J., & Yaxley, S. (1989). Hunger modulates the responses to gustatory stimuli of single neurons in the caudolateral orbitofrontal cortex of the macaque monkey. European Journal of Medicine, 1(1), 53–60. Roper, S. D. (2006). Cell communication in taste buds. Cellular and Molecular Life Sciences, 63, 1494–1500. Rozengurt, E. (2006). Taste receptors in the gastrointestinal tract: Pt. I. Bitter taste receptors and alpha-gustducin in the mammalian gut. American Journal of Physiology: Gastrointestinal and Liver Physiology, 291, G171–G177. Rozengurt, E., & Sternini, C. (2007). Taste receptor signaling in the mammalian gut. 
Current Opinion in Pharmacology, 7, 557–562.
Pritchard, T. C., Hamilton, R. B., & Norgren, R. (2000). Projections of the parabrachial nucleus in the old world monkey. Experimental Neurology, 165, 101–117.
Saggu, S., & Lundy, R. F. (2008). Forebrain neurons that project to the gustatory parabrachial nucleus in rat lack glutamic acid decarboxylase. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 294, R52–R57.
Pritchard, T. C., & Norgren, R. (2004). Gustatory system. In G. Paxinos & J. Mai (Eds.), The human nervous system (pp. 000–000). Boston: Elsevier.
Sainz, E., Cavenagh, M. M., Gutierrez, J., Battey, J. F., Northup, J. K., & Sullivan, S. L. (2007). Functional characterization of human bitter taste receptors. Biochemical Journal, 403, 537–543.
Pronin, A. N., Xu, H., Tang, H., Zhang, L., Li, Q., & Li, X. (2007). Specific alleles of bitter receptor genes influence human sensitivity to the bitterness of aloin and saccharin. Current Biology, 17, 1403–1408.
Sandell, M. A., & Breslin, P. A. (2006). Variability in a taste-receptor gene determines whether we taste toxins in food. Current Biology, 16, R792–R794.
Quignon, P., Kirkness, E., Cadieu, E., Touleimat, N., Guyon, R., Renier, C., et al. (2003). Comparison of the canine and human olfactory receptor gene repertoires. Genome Biology, 4, R80.
Schaefer, A. T., & Margrie, T. W. (2007). Spatiotemporal representations in the olfactory system. Trends in Neurosciences, 30, 92–100.
Rennaker, R. L., Chen, C. F., Ruyle, A. M., Sloan, A. M., & Wilson, D. A. (2007). Spatial and temporal distribution of odorant-evoked activity in the piriform cortex. Journal of Neuroscience, 27, 1534–1542.
Schaefer, M. L., Yamazaki, K., Osada, K., Restrepo, D., & Beauchamp, G. K. (2002). Olfactory fingerprints for major histocompatibility complex-determined body odors: Pt. II. Relationship among odor maps, genetics, odor composition, and behavior. Journal of Neuroscience, 22, 9513–9521.
Ressler, K. J., Sullivan, S. L., & Buck, L. B. (1994). Information coding in the olfactory system: Evidence for a stereotyped and highly organized epitope map in the olfactory bulb. Cell, 79, 1245–1255.
Schoppa, N. E., & Westbrook, G. L. (2001). Glomerulus-specific synchronization of mitral cells in the olfactory bulb. Neuron, 31, 639–651.
Reynolds, S. M., & Zahm, D. S. (2005). Specificity in the projections of prefrontal and insular cortex to ventral striatopallidum and the extended amygdala. Journal of Neuroscience, 25, 11757–11767.
Scott, G., Westberg, K. G., Vrentzos, N., Kolta, A., & Lund, J. P. (2003). Effect of lidocaine and NMDA injections into the medial pontobulbar reticular formation on mastication evoked by cortical stimulation in anaesthetized rabbits. European Journal of Medicine, 17, 2156–2162.
Richter, C. (1956). Salt appetite of mammals: Its dependence on instinct and metabolism. In L’Instinct dans le comportement des animaux et de l’homme (pp. 577–629). Paris: Masson et cie editeurs.
Scott, J. W. (2006). Sniffing and spatiotemporal coding in olfaction. Chemical Senses, 31, 119–130.
Richter, T. A., Caicedo, A., & Roper, S. D. (2003). Sour taste stimuli evoke Ca2+ and pH responses in mouse taste cells. Journal of Physiology, 547, 475–483.
Scott, T. R., Jr., & Erickson, R. P. (1971). Synaptic processing of taste-quality information in thalamus of the rat. Journal of Neurophysiology, 34, 868–883.
Scott, T. R., Karadi, Z., Oomura, Y., Nishino, H., Plata-Salaman, C. R., Lenard, L., et al. (1993). Gustatory neural coding in the amygdala of the alert macaque monkey. Journal of Neurophysiology, 69, 1810–1820. Scott, T. R., Yaxley, S., Sienkiewicz, Z. J., & Rolls, E. T. (1986a). Gustatory responses in the frontal opercular cortex of the alert cynomolgus monkey. Journal of Neurophysiology, 56, 876–890. Scott, T. R., Yaxley, S., Sienkiewicz, Z. J., & Rolls, E. T. (1986b). Gustatory responses in the nucleus tractus solitarius of the alert cynomolgus monkey. Journal of Neurophysiology, 55, 182–200. Seeley, R. J., Grill, H. J., & Kaplan, J. M. (1994). Neurological dissociation of gastrointestinal and metabolic contributions to meal size control. Behavioral Neuroscience, 108, 347–352. Shepherd, G. M. (2006, January 12). Behaviour: Smells, brains and hormones. Nature, 439, 149–151. Shepherd, G. M., Chen, W. R., & Greer, C. A. (2004). Olfactory bulb. In G. M. Shepherd (Ed.), The synaptic organization of the brain (5th ed., pp. 165–216). New York: Oxford University Press. Shi, C. J., & Cassell, M. D. (1998). Cascade projections from somatosensory cortex to the rat basolateral amygdala via the parietal insular cortex. Journal of Comparative Neurology, 399, 469–491. Shigemura, N., Ohta, R., Kusakabe, Y., Miura, H., Hino, A., Koyano, K., et al. (2004). Leptin modulates behavioral responses to sweet substances by influencing peripheral taste structures. Endocrinology, 145, 839–847. Shimura, T., Komori, M., & Yamamoto, T. (1997). Acute sodium deficiency reduces gustatory responsiveness to NaCl in the parabrachial nucleus of rats. Neuroscience Letters, 236, 33–36. Shimura, T., Tanaka, H., & Yamamoto, T. (1997). Salient responsiveness of parabrachial neurons to the conditioned stimulus after the acquisition of taste aversion learning in rats. Neuroscience, 81, 239–247. Shimura, T., Tokita, K., & Yamamoto, T. (2002).
Parabrachial unit activities after the acquisition of conditioned taste aversion to a non-preferred HCl solution in rats. Chemical Senses, 27, 153–158. Shipley, M. T., Ennis, M., & Puche, A. C. (2004). Olfactory system. In G. Paxinos (Ed.), The rat nervous system (3rd ed., pp. 923–964). San Diego, CA: Elsevier Academic Press. Simon, S. A., de Araujo, I. E., Gutierrez, R., & Nicolelis, M. A. (2006). The neural mechanisms of gustation: A distributed processing code. Nature Reviews: Neuroscience, 7, 890–901. Slotnick, B., & Bisulco, S. (2003). Detection and discrimination of carvone enantiomers in rats with olfactory bulb lesions. Neuroscience, 121, 451–457. Small, D. M., Bender, G., Veldhuizen, M. G., Rudenga, K., Nachtigal, D., & Felsted, J. (2007). The role of the human orbitofrontal cortex in taste and flavor processing. Annals of the New York Academy of Sciences, 1121, 136–151. Small, D. M., Voss, J., Mak, Y. E., Simmons, K. B., Parrish, T., & Gitelman, D. (2004). Experience-dependent neural integration of taste and smell in the human brain. Journal of Neurophysiology, 92, 1892–1903. Small, D. M., Zatorre, R. J., Dagher, A., Evans, A. C., & Jones-Gotman, M. (2001). Changes in brain activity related to eating chocolate: From pleasure to aversion. Brain, 124, 1720–1733. Smith, D. V., & Ossebaard, C. A. (1995). Amiloride suppression of the taste intensity of sodium chloride: Evidence from direct magnitude scaling. Physiology and Behavior, 57, 773–777. Smith, D. V., Van Buskirk, R. L., Travers, J. B., & Bieber, S. L. (1983). Coding of taste stimuli by hamster brain stem neurons. Journal of Neurophysiology, 50, 541–558. Smith, D. V., Ye, M. K., & Li, C. S. (2005). Medullary taste responses are modulated by the bed nucleus of the stria terminalis. Chemical Senses, 30, 421–434. Sollars, S. I., & Hill, D. L. (2005). In vivo recordings from rat geniculate ganglia: Taste response properties of individual greater superficial
petrosal and chorda tympani neurones. Journal of Physiology, 564, 877–893. Spector, A. C. (2003). The functional organization of the gustatory system: Lessons from behavior. Progress in Psychobiology and Physiological Psychology, 18, 101–161. Spector, A. C., & Travers, S. P. (2005). The representation of taste quality in the mammalian nervous system. Behavioral and Cognitive Neuroscience Reviews, 4, 143–191. Spehr, M., Kelliher, K. R., Li, X. H., Boehm, T., Leinders-Zufall, T., & Zufall, F. (2006). Essential role of the main olfactory system in social recognition of major histocompatibility complex peptide ligands. Journal of Neuroscience, 26, 1961–1970. Spehr, M., Spehr, J., Ukhanov, K., Kelliher, K. R., Leinders-Zufall, T., & Zufall, F. (2006). Parallel processing of social signals by the mammalian main and accessory olfactory systems. Cellular and Molecular Life Sciences, 63, 1476–1484. St. John, S. J., & Spector, A. C. (1998). Behavioral discrimination between quinine and KCl is dependent on input from the seventh cranial nerve: Implications for the functional roles of the gustatory nerves in rats. Journal of Neuroscience, 18, 4353–4362. Stapleton, J. R., Lavine, M. L., Wolpert, R. L., Nicolelis, M. A., & Simon, S. A. (2006). Rapid taste responses in the gustatory cortex during licking. Journal of Neuroscience, 26, 4126–4138. Staubli, U., Fraser, D., Faraday, R., & Lynch, G. (1987). Olfaction and the “data” memory system in rats. Behavioral Neuroscience, 101, 757–765. Steiner, J. E. (1973). The gustofacial response: Observation on normal and anencephalic newborn infants. Symposium on Oral Sensation and Perception, 4, 254–278. Steiner, J. E., & Glaser, D. (1984). Differential behavioral responses to taste stimuli in non-human primates. Journal of Human Evolution, 13, 709–723. Stewart, W. B., Kauer, J. S., & Shepherd, G. M. (1979). Functional organization of rat olfactory bulb analysed by the 2-deoxyglucose method. Journal of Comparative Neurology, 185, 715–734.
Stopfer, M., Bhagavan, S., Smith, B. H., & Laurent, G. (1997, November 6). Impaired odour discrimination on desynchronization of odour-encoding neural assemblies. Nature, 390, 70–74. Stratford, J. M., Curtis, K. S., & Contreras, R. J. (2006). Chorda tympani nerve transection alters linoleic acid taste discrimination by male and female rats. Physiology and Behavior, 89, 311–319. Sugai, T., Miyazawa, T., Fukuda, M., Yoshimura, H., & Onoda, N. (2005). Odor-concentration coding in the guinea-pig piriform cortex. Neuroscience, 130, 769–781. Sullivan, S. L., Adamson, M. C., Ressler, K. J., Kozak, C. A., & Buck, L. B. (1996). The chromosomal distribution of mouse odorant receptor genes. Proceedings of the National Academy of Sciences, USA, 93, 884–888. Suzuki, N., & Bekkers, J. M. (2007). Inhibitory interneurons in the piriform cortex. Clinical and Experimental Pharmacology and Physiology, 34, 1064–1069. Taha, S. A., & Fields, H. L. (2005). Encoding of palatability and appetitive behaviors by distinct neuronal populations in the nucleus accumbens. Journal of Neuroscience, 25, 1193–1202. Tamura, R., & Norgren, R. (1997). Repeated sodium depletion affects gustatory neural responses in the nucleus of the solitary tract of rats. Journal of General Physiology, 273, R1381–R1391. Tamura, R., & Norgren, R. (2003). Intracranial renin alters gustatory neural responses in the nucleus of the solitary tract of rats. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 284, R1108–R1118. Tindell, A. J., Smith, K. S., Pecina, S., Berridge, K. C., & Aldridge, J. W. (2006). Ventral pallidum firing codes hedonic reward: When a bad taste turns good. Journal of Neurophysiology, 96, 2399–2409.
Tokita, K., Karadi, Z., Shimura, T., & Yamamoto, T. (2004). Centrifugal inputs modulate taste aversion learning associated parabrachial neuronal activities. Journal of Neurophysiology, 92, 265–279. Tokita, K., Shimura, T., Nakamura, S., Inoue, T., & Yamamoto, T. (2007). Involvement of forebrain in parabrachial neuronal activation induced by aversively conditioned taste stimuli in the rat. Brain Research, 1141, 188–196. Tomchik, S. M., Berg, S., Kim, J. W., Chaudhari, N., & Roper, S. D. (2007). Breadth of tuning and taste coding in mammalian taste buds. Journal of Neuroscience, 27, 10840–10848. Topolovec, J. C., Gati, J. S., Menon, R. S., Shoemaker, J. K., & Cechetto, D. F. (2004). Human cardiovascular and gustatory brainstem sites observed by functional magnetic resonance imaging. Journal of Comparative Neurology, 471, 446–461. Travers, J. B. (1988). Efferent projections from the anterior nucleus of the solitary tract of the hamster. Brain Research, 457, 1–11. Travers, J. B., Grill, H. J., & Norgren, R. (1987). The effects of glossopharyngeal and chorda tympani nerve cuts on the ingestion and rejection of sapid stimuli: An electromyographic analysis in the rat. Behavioural Brain Research, 25, 233–246. Travers, S. P. (2002). Quinine and citric acid elicit distinctive fos-like immunoreactivity in the rat nucleus of the solitary tract. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 282, R1798–R1810. Travers, S. P., & Norgren, R. (1995). Organization of orosensory responses in the nucleus of the solitary tract of rat. Journal of Neurophysiology, 73, 2144–2162. Travers, S. P., Pfaffmann, C., & Norgren, R. (1986). Convergence of lingual and palatal gustatory neural activity in the nucleus of the solitary tract. Brain Research, 365, 305–320. Uchida, N., Kepecs, A., & Mainen, Z. F. (2006). Seeing at a glance, smelling in a whiff: Rapid forms of perceptual decision making. Nature Reviews: Neuroscience, 7, 485–491. 
Uchida, N., & Mainen, Z. F. (2003). Speed and accuracy of olfactory discrimination in the rat. Journal of Neuroscience, 6, 1224–1229. Vandenbeuch, A., Clapp, T. R., & Kinnamon, S. C. (2008). Amiloride-sensitive channels in type I fungiform taste cells in mouse. BMC Neuroscience, 9, 1. van der Klaauw, N. J., & Smith, D. V. (1995). Taste quality profiles for fifteen organic and inorganic salts. Physiology and Behavior, 58, 295–306. Vassar, R., Chao, S. K., Sitcheran, R., Nunez, J. M., Vosshall, L. B., & Axel, R. (1994). Topographic organization of sensory projections to the olfactory bulb. Cell, 79, 981–991. Venugopal, S., Travers, J. B., & Terman, D. H. (2007). A computational model for motor pattern switching between taste-induced ingestion and rejection oromotor behaviors. Journal of Computational Neuroscience, 22, 223–238. Voshart, K., & van der Kooy, D. (1981). The organization of the efferent projections of the parabrachial nucleus of the forebrain in the rat: A retrograde fluorescent double-labeling study. Brain Research, 212, 271–286. Vucinic, D., Cohen, L. B., & Kosmidis, E. K. (2006). Interglomerular center-surround inhibition shapes odorant-evoked input to the mouse olfactory bulb in vivo. Journal of Neurophysiology, 95, 1881–1887. Wachowiak, M., & Shipley, M. T. (2006). Coding and synaptic processing of sensory information in the glomerular layer of the olfactory bulb. Seminars in Cell and Developmental Biology, 17, 411–423. Walsh, R. R. (1956). Single cell spike activity in the olfactory bulb. Journal of General Physiology, 186, 255–257. Wang, L., & Bradley, R. M. (1995). In vitro study of afferent synaptic transmission in the rostral gustatory zone of the rat nucleus of the solitary tract. Brain Research, 702(1–2), 188–198.
Wheeler, R. A., Twining, R. C., Jones, J. L., Slater, J. M., Grigson, P. S., & Carelli, R. M. (2008). Behavioral and electrophysiological indices of negative affect predict cocaine self-administration. Neuron, 57, 774–785. Whissell-Buechy, D., & Amoore, J. E. (1973, March 23). Odour-blindness to musk: Simple recessive inheritance. Nature, 242, 271–273. Whitehead, M. C. (1986). Anatomy of the gustatory system in the hamster: Synaptology of facial afferent terminals in the solitary nucleus. Journal of Comparative Neurology, 244, 72–85. Whitehead, M. C. (1988). Neuronal architecture of the nucleus of the solitary tract in the hamster. Journal of Comparative Neurology, 276, 547–572. Whitehead, M. C. (1990). Subdivisions and neuron types of the nucleus of the solitary tract that project to the parabrachial nucleus in the hamster. Journal of Comparative Neurology, 301, 554–574. Whitehead, M. C. (1993). Distribution of synapses on identified cell types in a gustatory subdivision of the nucleus of the solitary tract. Journal of Comparative Neurology, 332, 326–340. Whitehead, M. C., Bergula, A., & Holliday, K. (2000). Forebrain projections to the rostral nucleus of the solitary tract in the hamster. Journal of Comparative Neurology, 422, 429–447. Whitehead, M. C., & Finger, T. (2008). Gustatory pathways in fish and mammals. In S. Firestein & G. K. Beauchamp (Eds.), Handbook of the senses: A comprehensive reference (Vol. 4, pp. 237–258). Whitehead, M. C., & Frank, M. E. (1983). Anatomy of the gustatory system in the hamster: Central projections of the chorda tympani and the lingual nerve. Journal of Comparative Neurology, 220, 378–395. Wilson, D. A., Kadohisa, M., & Fletcher, M. L. (2006). Cortical contributions to olfaction: Plasticity and perception. Seminars in Cell and Developmental Biology, 17, 462–470. Wu, S. V., Rozengurt, N., Yang, M., Young, S. H., Sinnett-Smith, J., & Rozengurt, E. (2002). 
Expression of bitter taste receptors of the T2R family in the gastrointestinal tract and enteroendocrine STC1 cells. Proceedings of the National Academy of Sciences, USA, 99, 2392–2397. Yamamoto, T., Matsuo, R., & Kawamura, Y. (1980). Localization of cortical gustatory area in rats and its role in taste discrimination. Journal of Neurophysiology, 44, 440–455. Yamamoto, T., Matsuo, R., Kiyomitsu, Y., & Kitamura, R. (1989). Taste responses of cortical neurons in freely ingesting rats. Journal of Neurophysiology, 61, 1244–1258. Yamamoto, T., & Sawa, K. (2000). Comparison of c-fos-like immunoreactivity in the brainstem following intraoral and intragastric infusions of chemical solutions in rats. Brain Research, 866(1/2), 144–151. Yamamoto, T., Shimura, T., Sakai, N., & Ozaki, N. (1994). Representation of hedonics and quality of taste stimuli in the parabrachial nucleus of the rat. Physiology and Behavior, 56, 1197–1202. Yamamoto, T., Yuyama, N., Kato, T., & Kawamura, Y. (1985). Gustatory responses of cortical neurons in rats: Pt. II. Information processing of taste quality. Journal of Neurophysiology, 53, 1356–1369. Yan, J., & Scott, T. R. (1996). The effect of satiety on responses of gustatory neurons in the amygdala of alert cynomolgus macaques. Brain Research, 740(1–2), 193–200. Yaxley, S., Rolls, E. T., Sienkiewicz, Z. J., & Scott, T. R. (1985). Satiety does not affect gustatory activity in the nucleus of the solitary tract of the alert monkey. Brain Research, 347, 85–93. Yokoi, M., Mori, K., & Nakanishi, S. (1995). Refinement of odor molecule tuning by dendrodendritic synaptic inhibition in the olfactory bulb. Proceedings of the National Academy of Sciences, USA, 92, 3371–3375.
Yoon, H., Enquist, L. W., & Dulac, C. (2005). Olfactory inputs to hypothalamic neurons controlling reproduction and fertility. Cell, 123, 669–682.
Zhang, X., Zhang, X., & Firestein, S. (2007). Comparative genomics of odorant and pheromone receptor genes in rodents. Genomics, 89, 441–450.
Yoshida, I., & Mori, K. (2007). Odorant category profile selectivity of olfactory cortex neurons. Journal of Neuroscience, 27, 9105–9114.
Zhang, Y., Hoon, M. A., Chandrashekar, J., Mueller, K. L., Cook, B., Wu, D., et al. (2003). Coding of sweet, bitter, and umami tastes: Different receptor cells sharing similar signaling pathways. Cell, 112, 293–301.
Young, J. M., Shykind, B. M., Lane, R. P., Tonnes-Priddy, L., Ross, J. A., Walker, M., et al. (2003). Odorant receptor expressed sequence tags demonstrate olfactory expression of over 400 genes, extensive alternate splicing and unequal expression levels. Genome Biology, 4, R71. Young, J. M., & Trask, B. J. (2007). V2R gene families degenerated in primates, dog and cow, but expanded in opossum. Trends in Genetics, 23, 212–215. Zahm, D. S. (2000). An integrative neuroanatomical perspective on some subcortical substrates of adaptive responding with emphasis on the nucleus accumbens. Neuroscience and Biobehavioral Reviews, 24, 85–105. Zaidi, F. N., & Whitehead, M. C. (2006). Discrete innervation of murine taste buds by peripheral taste neurons. Journal of Neuroscience, 26, 8243–8253.
Zhao, F. L., Shen, T., Kaya, N., Lu, S. G., Cao, Y., & Herness, S. (2005). Expression, physiological action, and coexpression patterns of neuropeptide Y in rat taste-bud cells. Proceedings of the National Academy of Sciences, USA, 102, 11100–11105. Zhao, G. Q., Zhang, Y., Hoon, M. A., Chandrashekar, J., Erlenbach, I., Ryba, N. J., et al. (2003). The receptors for mammalian sweet and umami taste. Cell, 115, 255–266. Zozulya, S., Echeverri, F., & Nguyen, T. (2001). The human olfactory receptor repertoire. Genome Biology, 2, RESEARCH0018, Epub June 1, 2001. Zufall, F., & Leinders-Zufall, T. (2007). Mammalian pheromone sensing. Current Opinion in Neurobiology, 17, 483–489.
Chapter 14
Somatosensory Processes STEVEN S. HSIAO AND PRAMODSINGH H. THAKUR
A fundamental question across all sensory systems is how neural mechanisms give rise to the perceptions that we experience as we interact with and move about in our environment. The question is particularly interesting and challenging in the somatosensory system, which dynamically and seamlessly integrates a wide range of sensory inputs that guide motor outputs. The sensory inputs include the appreciation of exteroceptive stimuli, or perceptions that are produced by environmental stimuli, which include the perception of temperature, pain, itch, and tactile inputs such as the form and texture of objects. The sensory inputs also include the appreciation of interoceptive stimuli or perceptions, which include the perception of body position, body movement, and body force. Anatomical and neurophysiologic evidence suggests that these sensory inputs are initially segregated: each modality has a unique set of peripheral receptors whose signals are initially processed along separate ascending and cortical pathways. Perceptually, however, these inputs are processed in parallel, which results in a single unified percept of the environment.
An example of the close interplay between the different modalities of touch and motor actions is the act of making a snowball. While the task seems simple, in reality it is quite complex and involves recognizing the local shape, texture, and temperature of the snow at each location where the fingers contact the snow and then recognizing how the global and local shape of the snow changes as the snow is compressed into a ball. The recognition process begins with the activation of arrays of peripheral receptors embedded in the skin, muscles, tendons, and joints, which provide an initial peripheral representation of the sensory input. This initial representation is in the form of spatial and temporal patterns of action potentials that travel through the spinal cord and thalamus to various processing stages in cortex, where the information is transformed and integrated into a central representation that is matched against stored memories. The sensory information is then used to guide the motor outputs. The entire process is extremely rapid. Blind subjects can read Braille at approximately 100 words per minute, which translates to a recognition time of about 100 ms for a single Braille character (Foulke, 1991). This is extremely fast considering the slow rate at which action potentials are generated and transmitted. One hundred milliseconds includes the time for the impulses to be encoded in the peripheral responses, conveyed to the cortex (which alone takes about 25 ms), transformed at several processing stages, and matched against all stored memories. This suggests that the mechanisms underlying tactile pattern perception involve relatively few serial processing stages and that the cortical processing of sensory information uses parallel processing mechanisms.
In this chapter, we first briefly discuss the psychophysics of tactile perception, concentrating on the perception of tactile shape and texture from the hand. We then discuss our present understanding of how stimuli that activate the mechanoreceptive afferents are represented in the periphery and in the responses of neurons in primary and secondary somatosensory cortical areas, briefly discuss how these representations are affected by higher cognitive functions such as selective attention, and conclude by speculating on how sensory inputs are integrated to form coherent representations of the size and shape of objects.
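As a rough check on the Braille figure attributed to Foulke (1991), the conversion from reading rate to time per character can be sketched as follows. The value of about six Braille cells per word (five letters plus a space) is an assumption made here for illustration and is not stated in the chapter; the 25 ms conduction time is the figure quoted in the text.

```python
def ms_per_character(words_per_minute: float, chars_per_word: float) -> float:
    """Average time available per character, in milliseconds."""
    chars_per_minute = words_per_minute * chars_per_word
    return 60_000.0 / chars_per_minute


# Assumption (not from the chapter): about 6 Braille cells per word,
# that is, 5 letters plus a space.
per_char_ms = ms_per_character(words_per_minute=100, chars_per_word=6)

# Time left for central processing once the roughly 25 ms needed to
# reach cortex (quoted in the text) has been subtracted.
remaining_ms = per_char_ms - 25.0

print(per_char_ms, remaining_ms)
```

Under these assumptions the result is on the order of 100 ms per character, which is consistent with the figure in the text; a different assumed word length shifts the estimate proportionally.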
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
PSYCHOPHYSICS
Tactile Perception
Our ability to perceive the world through our hands is rich and diverse and in many ways analogous to the visual and auditory perceptual experience. The richness of touch is eloquently described by Helen Keller: “My world is built of touch-sensations, devoid of physical color and sound; but not without color and sound it breathes and throbs with life. Every object is associated in my mind with tactual qualities which, combined in countless ways, give me a sense of power, of beauty, or of incongruity: for with my hand I can feel the comic as well as the beautiful in the outward appearance of things” (Keller, 1919, p. 7).
Two kinds of psychophysical studies have typically been performed. The first examine objective perceptions: attributes of the stimuli that can be quantified, so that responses can be scored as correct or incorrect. One example is a tactile letter recognition task, in which subjects scan their fingers over an embossed letter and state what letter they feel. Here the experimenter can objectively judge each response to be right or wrong. The second examine subjective perceptions, for which there are no right or wrong answers. An example of a subjective task is one in which subjects feel a textured surface and give a magnitude estimate of its roughness. The task is subjective because the experimenter cannot judge the subjects’ responses as right or wrong.
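To make the idea of an objectively scored task concrete, the sketch below scores a letter recognition session by comparing presented and reported letters, returning the percent correct and a tally of confusions. The function name and the trial data are hypothetical and serve only as an illustration; they are not taken from any study cited here.

```python
from collections import Counter


def score_letter_task(presented, reported):
    """Return (percent_correct, confusions) for paired lists of trials.

    confusions counts each (presented, reported) pair that was wrong.
    """
    if len(presented) != len(reported):
        raise ValueError("presented and reported must have the same length")
    if not presented:
        raise ValueError("at least one trial is required")
    pairs = list(zip(presented, reported))
    n_correct = sum(p == r for p, r in pairs)
    confusions = Counter((p, r) for p, r in pairs if p != r)
    return 100.0 * n_correct / len(pairs), confusions


# Hypothetical trials, for illustration only.
presented = ["A", "B", "C", "O", "Q", "B"]
reported = ["A", "B", "C", "Q", "Q", "D"]

percent_correct, confusions = score_letter_task(presented, reported)
print(f"{percent_correct:.1f}% correct")
for (shown, said), count in confusions.items():
    print(f"{shown} reported as {said}: {count}")
```

A subjective task such as roughness magnitude estimation has no counterpart to the `p == r` comparison, which is exactly the distinction drawn in the text.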
307
of confusions that are made in vision and touch, when the letters are scaled in height to the receptor densities in the two systems, are similar. The performance is independent of whether the letters are indented or scanned across the finger pad or whether the letters are actively or passively presented to the skin (Phillips, Johnson, & Browne, 1983; Vega-Bermudez, Johnson, & Hsiao, 1991). Recent studies (Bensmaia, Denchev, Dammann, Craig, & Hsiao, 2008), however, suggest that although perception may be similar, the acuity for distinguishing complex shapes may be different. In those studies, subjects were asked to discriminate the relative orientations of bars and edges presented to the distal finger pad. They find that while human observers are visually able to discriminate the orientation of bars that differ by fractions of a degree, the comparable threshold in touch is closer to 20 degrees. This tactile orientation discrimination threshold is independent of whether the bars are indented or scanned across the finger pad. These results demonstrate that although the mechanisms of 2D form process are similar between the two systems, they are not identical.
Psychophysics of Two-Dimensional Spatial Form
[Figure 14.1 plot. Vertical axes: Percent Correct (letter identification) and Percent Correct (gap and grating tasks); chance level marked.]
A number of studies have shown that we have a high capacity to discriminate two-dimensional (2D) patterns that are scanned or indented into the distal finger pads. In touch, the threshold for perception of 2D patterns is determined by the density of the receptors innervating the skin. In the finger pad, this translates to a spatial acuity of about 1 mm (Figure 14.1). Furthermore, studies in which subjects are presented with complex 2D patterns suggest that the mechanisms of form processing in vision and touch are similar (see Hsiao, 1998, for a review). The similarity in form processing between the two systems is exemplified by studies where subjects are asked to discriminate embossed letters of the alphabet. Those studies show that the patterns
[Figure 14.1 plot. Horizontal axis: Element Width (mm), 0.0 to 2.5.]
Figure 14.1 Tactile spatial acuity is about 1.0 mm on the distal finger pad measured using three psychophysical tasks: gap detection, grating orientation, and letter recognition.
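A spatial acuity figure such as "about 1 mm" is usually read off a psychometric function as the element width at which performance crosses a criterion level. The sketch below does this by linear interpolation. The data points are hypothetical (they are not the values plotted in Figure 14.1), and the 75% criterion is an assumption that suits two-alternative tasks such as grating orientation; letter recognition has a much lower chance level and would need a different criterion.

```python
def threshold_by_interpolation(widths_mm, percent_correct, criterion=75.0):
    """Element width at which performance first reaches `criterion`.

    Assumes widths are sorted in ascending order. Returns None if the
    criterion is never crossed.
    """
    points = list(zip(widths_mm, percent_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            # Linear interpolation between the two bracketing points.
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical grating orientation data (chance = 50%).
widths = [0.5, 0.75, 1.0, 1.25, 1.5]
grating = [52.0, 60.0, 75.0, 90.0, 97.0]

print(threshold_by_interpolation(widths, grating))
```

Fitting a smooth psychometric function would be more robust than interpolating between neighboring points; interpolation is used here only to keep the example short.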
Psychophysics of Three-Dimensional Size and Shape
While the mechanisms underlying 2D form processing are similar between vision and touch, the mechanisms underlying three-dimensional (3D) form are different. The ability to recognize 3D shapes improves as the number of digits used to contact objects increases (Davidson, 1972; Kappers & Koenderink, 1996), demonstrating that 3D form processing involves the integration of tactile inputs across fingers. Further, subjects rapidly recognize common objects without visual input with an accuracy greater than 96% (Klatzky, Lederman, & Metzger, 1985). Three-dimensional object recognition typically involves dynamic exploratory movements where subjects enclose the object in their hands or systematically move their fingers around the object (Lederman & Klatzky, 1987). The results from a variety of studies suggest that the mechanisms differ for discriminating small and large objects. The discrimination of small shapes depends on cutaneous inputs and the discrimination of large shapes depends on the integration of cutaneous inputs with inputs from proprioceptors that signal hand conformation. Davidson (1972) showed that subjects are less successful at identifying 3D curved surfaces when tracing them with a single finger than when they contact surfaces simultaneously with multiple fingers, demonstrating that simultaneous input from the different digits is important for object recognition. The importance of proprioceptive input to object recognition is illustrated by the Aristotle illusion. In this illusion, a continuous edge feels like two separate edges (with different orientations) when touched with the fingers crossed
Somatosensory Processes
Psychophysics of Texture
Psychophysical studies show that the perception of texture is multidimensional in nature with the main components captured along three dimensions. The first dimension is determined by the degree that surfaces vary in height and is related to the percepts of rough and smooth. The second dimension is determined by the degree that surfaces conform to pressures normal to the surface and is related to the percepts of hard and soft. The third dimension is determined by surface friction and is related to the percepts of sticky and slippery. These dimensions capture over 90% of the variance in multidimensional scaling studies (Hollins, Faldowski, Rao, & Young, 1993). Studies where surfaces are scanned with the bare finger and with probes that are held in the hand show that the three textural dimensions are similar but not identical in the two scanning modes, which suggests that the neural mechanisms underlying texture perception with a bare finger and probe are different (Yoshioka, Bensmaia, Craig, & Hsiao, 2007).
[Figure 14.2 panel labels. (A) Postural changes along a synergy, Synergy 1 (S1) through Synergy 9 (S9). (B) Actual posture and postures constructed from Synergy 1; Synergies 1+6; Synergies 1+2+6; Synergies 1 to 6 (front and side views).]
(Benedetti, 1985; Craig, 2003). These results demonstrate that proprioceptive inputs modify the way that cutaneous inputs are interpreted and provide us with a working hypothesis that the coding of 3D shape depends on the interpretation of local cutaneous features in the context of hand conformation and movements. If this hypothesis is correct, then there should be specific hand conformations that are used when grasping shapes. The human hand is a complex structure with multiple joints and more than 20 degrees of freedom. However, the movements of the different joints are not independent and tend to be correlated. A recent study (Thakur, Bastian, & Hsiao, 2008) revealed that during haptic manipulation of everyday objects, subjects evoke hand postures along a limited set of nine patterns of hand motions, referred to as synergies (Figure 14.2). These synergies are consistent across various subjects and across different sets of objects explored. We believe that these synergies provide a reduced basis set over which proprioceptive information from the hand is processed in the cortex. Moreover, these synergies are also used during simpler tasks such as reach-to-grasp, which may provide a simple experimental paradigm to parametrically sample the space of proprioception used during haptic manipulation. The perception of object size also requires the integration of cutaneous and proprioceptive inputs. Berryman, Yau, and Hsiao (2006) showed that the perceived size of an object is independent of contact force or contact area and is based on an integration of information about the compliance of the objects that is derived from the cutaneous inputs with information about the spread between the fingers that is derived from proprioceptive inputs.
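The idea that correlated joint movements can be summarized by a small set of synergies can be sketched with principal component analysis. This is a toy illustration only: PCA is used here as a stand-in technique (this chapter does not describe the method used in the cited study), the four-joint posture data are invented, and all function names are hypothetical.

```python
def covariance(rows):
    """Sample covariance matrix of joint angles (one posture per row)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(matrix, iterations=200):
    """Power iteration: largest eigenvalue and its unit eigenvector.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(matrix)
    v = [float(i + 1) for i in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(a * b for a, b in zip(v, mv)), v


def extract_synergies(rows, count):
    """Return (fraction of variance, direction) for the first `count` components."""
    cov = covariance(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    result = []
    for _ in range(count):
        eigenvalue, v = leading_eigenpair(cov)
        result.append((eigenvalue / total, v))
        # Deflate: remove the component just found before searching again.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return result


# Hypothetical joint angles (degrees) for four joints in four postures.
postures = [
    [29.0, 28.0, 29.0, 28.0],
    [29.0, 30.0, 29.0, 30.0],
    [30.0, 31.0, 30.0, 31.0],
    [32.0, 31.0, 32.0, 31.0],
]

for fraction, direction in extract_synergies(postures, 2):
    print(round(fraction, 3), [round(x, 3) for x in direction])
```

In this invented data set the first component moves all four joints together and accounts for most of the variance, which is the sense in which a few synergies can serve as a reduced basis for hand posture.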
Figure 14.2 A: Postures along the nine synergies. Each row illustrates the postural changes as the hand moves along a particular synergy. B: Examples of how synergies are combined to form a pinch grip between digits 1 and 2. Note: Column 1 is the actual pinch movement. Columns 2, 3, and 4 are the contributions of synergy 1 alone, synergies 1 and 6, and synergies 1, 2, and 6. Column 5 is the sum of the first 6 synergies. From “Multi-Digit Movement Synergies of the Human Hand in an Unconstrained Haptic Exploration Task,” by P. H. Thakur, A. J. Bastian, and S. S. Hsiao, 2008, Journal of Neuroscience, Volume 28, p. 1275. Reprinted with permission.
PERIPHERAL RECEPTORS
The peripheral afferent system consists of 13 types of afferents traditionally classified based on the class of information they carry (Table 14.1). These include four mechanoreceptive afferents, four proprioceptive afferents,
Table 14.1 Peripheral receptor types.

Receptor | Fiber Group | Receptors Respond To | Function

Cutaneous, low-threshold mechanoreceptors
Merkel (SAI) | Aβ | Steady deformation and motion (10x greater) | Form perception (e.g., Braille); texture perception (roughness, hardness)
Ruffini (SAII) | Aβ | Skin stretch | Perception of skin stretch, hand shape, force acting on an object in the hand
Meissner (RA1) | Aβ | Skin movement (glabrous skin only) | Perception of local movement, detection of slip, grip control
Pacinian (RAII) | Aβ | High frequency vibration | Perception of distant events through objects held or grasped in the hand

Proprioceptors
Muscle spindle (Ia) | I | Muscle length and velocity | Joint angle
Golgi tendon organ (Ib) | I | Muscle force | Muscle force? (not confirmed)
Muscle spindle (II) | II | Muscle length | Joint angle? (not confirmed)
Joint | II | Joint angle, movement? | Most studies show little or no effect on perception

Thermoreceptors
Cold | Aδ | Drop in skin temperature | Thermal sense: temperature of object relative to skin temperature
Warm | C | Warmth | Thermal sense: object warmth

Nociceptors
Small Myelinated | Aδ | Noxious stimuli | Sharp, pricking pain
Unmyelinated | C | Noxious stimuli | Dull, burning pain

Itch receptors | C | Pruritic stimuli | Itch
two thermoreceptive afferents (for cooling and warming, respectively), two nociceptive or pain afferents, and one afferent related to itch. The receptors underlying temperature, pain, and itch send their outputs to the cortex via small axons, approximately 1 to 2 microns in diameter, that are either unmyelinated (warm, burning pain, and itch) or myelinated (cold, pricking pain). The neural mechanisms underlying inputs from these afferents are complex and will not be discussed further in this chapter.
Peripheral Receptors Underlying Mechanoreception
The four mechanoreceptive afferents are further classified based on their response to a sustained indentation. Two types of mechanoreceptors respond with a brief burst followed by a sustained discharge that decays gradually over time and are named slowly adapting types I and II (SAI and SAII). The other two types respond with a transient burst of impulses at the onset and offset of the stimulus that decays rapidly to zero; hence they are rapidly adapting and are called RA and Pacinian (PC). Extensive neurophysiological studies in humans and nonhuman primates of these afferents (except the SAII afferents, which do not exist in nonhuman primates) have
shown that there are minimal interspecies differences in their response properties. SAI afferents branch repeatedly toward their end, lose their myelin, and end in several closely packed dermal ridges at the base of the epidermis where the branch endings are enclosed by epidermal Merkel cells (Iggo & Andres, 1982). Although the Merkel cells contact the afferent ending via glutamatergic synapses, it is unclear what role they play in the transduction of mechanical stimuli. Action potentials in the afferents appear to arise due to mechanosensitive channels at the tips of the axonal endings (Diamond, Mills, & Mearow, 1988; Ogawa, 1996). SAI afferents have high innervation density and small receptive field (RF) sizes (about 100 afferents per cm² at the fingertip for both man and monkey, diameters of about 2.5 mm; see Johnson, 2001, for a review). As a result they transmit a high-resolution spatial image of the stimulus indenting the skin (Figure 14.3) and are able to resolve spatial details of stimuli down to about 1.0 mm (Phillips & Johnson, 1981a). Indenting the RF of an SAI afferent reveals a suppressive region surrounding the excitatory core of the RF (Vega-Bermudez & Johnson, 1999). Surround suppression provides feature selectivity to edges in the neuronal responses. A given stimulus elicits differential responses depending on the relative strengths of excitation and suppression encountered in different
Figure 14.3 Spatial event plots of peripheral SA1 (Top), RA (Middle), and PC (Bottom) neurons for embossed letters 6.0 mm high. Note: SA1 afferents show high spatial acuity, RA show poorer acuity and PCs are unable to discriminate between the different letters. From Phillips et al. (1988). Reprinted with permission.
directions. The suppression also makes the neuron insensitive to uniform indentation. Unlike the sensory cortical neurons, where surround inhibition is thought to arise due to lateral inhibition, surround inhibition in SAI afferents is caused by skin mechanical effects, which render the neuron sensitive to a specific component of strain or a closely related variable (Dandekar, Raju, & Srinivasan, 2003; Phillips & Johnson, 1981b; Sripati, Bensmaia, & Johnson, 2006). RA afferents also branch repeatedly as they approach the skin and end in broadly stacked discs in the Meissner's corpuscles. The Meissner's corpuscles occur in the dermal pockets between the sweat glands and the adhesive ridges, and are thus located as close to the epidermis as possible (Guinard, Usson, Guillermet, & Saxod, 2000). Their proximity to the surface, together with their extremely high density, makes them very sensitive to minute deformations of the skin. The effective operating range of indentations for an RA afferent is 4 to 400 microns, whereas the equivalent range for SAI afferents is 15 to 1,500 microns (Blake, Johnson, & Hsiao, 1997; Johansson & Vallbo, 1979). Although extremely sensitive, RAs resolve the spatial details of stimuli poorly as compared to SAI afferents (Figure 14.3), partly due to larger receptive field sizes owing to the greater degree of convergence and divergence between the Meissner corpuscles and the afferent fiber to which they project, which results in these afferents having relatively uniform receptive field profiles. In addition, the filtering provided by the stacks of corpuscles renders the RAs insensitive to sustained deformations. Thus, RA responses are best suited to signal responses related to small changes in inputs, such as those encountered in low amplitude, low frequency vibrations (flutter), or very fine slips of objects held in the hand.
The second class of rapidly adapting mechanoreceptors end in the deeper layers of dermis in bodies composed of concentric layers of fluid-filled sacs known as Pacinian corpuscles (Bell, Bolanowski, & Holmes, 1994). The multiple concentric shells act as a high-pass filter, making
PC afferents insensitive to static indentation but extremely sensitive to high frequency vibrations (Talbot, Darian-Smith, Kornhuber, & Mountcastle, 1968). These afferents can detect indentations as small as 1 nm applied directly to the corpuscle or 10 nm applied to the skin (Brisben, Hsiao, & Johnson, 1999). Because of their extreme sensitivity, the receptive field boundaries for PC afferents are hard to define and they sometimes encompass the entire hand or arm. Because of their large receptive fields and extreme sensitivity to high frequencies, they are effective at transmitting vibrations through hand-held objects and play a dominant role in transmitting information regarding surfaces explored through hand-held probes. The SAII afferents have larger receptive field sizes than SAI afferents. Both SAI and SAII respond to forces orthogonal and parallel to the skin surface. However, between the two, SAI are more sensitive to orthogonal forces, while SAII are more sensitive to forces parallel to the skin surface (Macefield, Hager-Ross, & Johansson, 1996). This makes SAII afferents insensitive to simple indentations and more sensitive to skin stretch as compared to SAI. Due to their larger receptive field sizes and sensitivity to skin stretch, SAII afferents are thought to transmit a dynamic neural image of hand conformation. A striking peculiarity of SAII afferents is their absence in some mammals. Studies have documented the presence of SAII afferents in cats and humans, but they are absent in nonhuman primates and mice. In cats, SAII afferent axons terminate in Ruffini endings. However, in humans, Ruffini endings have been reported only in the bed of the fingernails and as such there is at present uncertainty as to the receptor ending for these afferent fibers. Evidence from studies that combine psychophysics and neurophysiology suggests that each of these afferent fibers plays a different role in perception. SAI afferents are the spatial system.
They are the only afferents with a spatial acuity that is sufficient to support the psychophysical studies described previously. Further, in a series of studies (Figure 14.4), Hsiao and Johnson showed that only these afferents can account for subjective estimates of the roughness of surfaces when scanned with the bare finger. These studies show that roughness is coded by a central mechanism that computes the spatial variation in firing rates among the population of SAI afferents separated by about 2.0 mm. Studies by LaMotte and his colleagues suggest that SAI afferents are also responsible for the perception of surface hardness and softness (Srinivasan & LaMotte, 1995). The working hypothesis is that hardness is related to the overall spatial pattern of activation of the SAI afferents. The peripheral neural coding of stickiness has not been investigated. The SAI system is analogous to the parvocellular system in the visual system (Hsiao, 1998).
[Figure 14.4 plot residue. Panel labels as extracted: (A) Connor et al. (1990); (B) Blake et al. (1997); (C) Connor and Johnson (1992); Yoshioka et al. (2001). Vertical axes: Perceived roughness and SA1 spatial variation (ips). Horizontal axes: Dot spacing (mm), Dot diameter (mm), Horizontal spacing (mm), Vertical spacing (mm), and Groove width (mm). Reported correlations: r = 0.98, 0.97, 0.98, and 0.97.]
Note: Results from four studies showing the relationship between psychophysical measurements, in which subjects gave subjective magnitude estimates of the roughness of surfaces, and neurophysiological recordings from peripheral SAI afferents of the monkey. The four studies are labeled A–D. At the bottom of each graph is a picture of the stimulus pattern.
Plotted in each graph are the normalized psychophysical estimates and the mean spatial variation in firing rates of SAI afferents separated by about 2.0 mm. The correlations for all four studies are .97 or greater. a Based on Blake, Hsiao, and Johnson (1997). b Based on Connor and Johnson (1992). c Based on Connor, Hsiao, Phillips, and Johnson (1990). d Based on Yoshioka, Gibb, Dorsch, Hsiao, and Johnson (2001).
The RA afferents are the motion system. The brief transient onset and offset responses to transient stimuli make them particularly sensitive to low frequency vibrations (called flutter) and the detection of moving stimuli (Gardner & Palmer, 1990; Talbot et al., 1968). In addition, studies in humans suggest that these afferents play an important role in signaling slip on the
skin, which is particularly important for adjusting grip force when grasping objects (Westling & Johansson, 1987). As stated earlier, the PC system is the vibration system; it plays an important role in tool use and in signaling texture information obtained through tools. The SAII afferents signal local skin stretch and
Figure 14.4 Peripheral neural coding of roughness.
may be important for signaling joint angle (Edin & Abbs, 1991).
Peripheral Receptors Underlying Proprioception
The four types of proprioceptive afferents are classified based on their targets in the periphery. Two of them, namely the group Ia afferents and group II afferents, terminate in the primary and secondary muscle spindle receptors. The other two terminate in the Golgi tendon organs and receptors located in the joints. Muscle spindles are located in the fleshy part of the muscle and innervate 3 to 10 intrafusal muscle fibers. They are positioned in parallel to the extrafusal muscle fibers, such that the ends of the intrafusal muscle fibers make lateral connection with the perimysium of the muscle fascicle. Since the length of muscle spindles is constant across different muscles, a change in muscle length induces a stretch in the spindle receptors that is proportional to the relative change in the length or velocity of the muscle fascicle irrespective of the length or size of the muscle (Proske, Wise, & Gregory, 2000). Thus, spindle afferents signal the relative change in the length and velocity of the muscle during movements. Unlike spindle afferents, which are placed in parallel with the muscles, Golgi tendon receptors are arranged in series with the muscle fascicles. Over 90% of the tendon receptors are located at the musculotendinous junctions, while the remainder are located in the tendons (Barker, Emonet-Denand, Laporte, Proske, & Stacey, 1973). Due to their serial location, tendon receptors are stretched in response to muscle fascicle contraction, and the tendon afferents are sensitive to the force generated by the contracting muscle fascicles. Together with the spindle afferents, the tendon afferents form a complementary signaling system. Muscle contraction causes an increase in force that results in the activation of the tendon afferents; simultaneously, the spindle afferents fall silent due to the decrease in the muscle length.
Similarly, when the muscle relaxes, spindle afferents are activated due to the lengthening of the muscle, while the tendon afferents reduce their firing due to the reduction in the muscular force. Joint afferents include both small and large diameter afferents that innervate all intra-articular structures such as ligaments, discs, and menisci. The slowly adapting type of joint afferents terminate in Golgi organs or Ruffini endings, while the rapidly adapting type terminate in Pacinian or smaller Paciniform endings. As compared to the cutaneous mechanoreceptors, joint receptors are less sensitive to light cutaneous stimuli applied to the skin but respond well to compressive forces. Golgi afferents respond to compressive forces normal to the capsular surface, while the Ruffini afferents respond to planar forces (Grigg & Hoffman, 1982).
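The complementary behavior of spindle and tendon afferents described above can be sketched with a toy rate model. Everything numeric here is an assumption made for illustration (baseline rate, gains, lengths, and forces), and the linear, rectified form is a simplification rather than a published model. The spindle term depends on the relative change in muscle length, and the tendon term depends on muscle force.

```python
def spindle_rate(length_mm, rest_length_mm, baseline_ips=20.0, gain_ips=200.0):
    """Spindle afferent: driven by the relative change in muscle length,
    so the signal does not depend on the absolute size of the muscle."""
    relative_change = (length_mm - rest_length_mm) / rest_length_mm
    return max(0.0, baseline_ips + gain_ips * relative_change)


def tendon_rate(force_n, gain_ips_per_n=5.0):
    """Tendon afferent: in series with the muscle, so driven by force."""
    return max(0.0, gain_ips_per_n * force_n)


# Hypothetical contraction: the muscle shortens by 10% and generates force.
print(spindle_rate(90.0, 100.0), tendon_rate(10.0))

# Hypothetical relaxation: the muscle lengthens by 5% and force drops.
print(spindle_rate(105.0, 100.0), tendon_rate(1.0))

# Same relative shortening in a muscle half the size gives the same signal.
print(spindle_rate(45.0, 50.0))
```

During the simulated contraction the tendon afferent is active while the spindle afferent is silent, and the pattern reverses during relaxation.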
ASCENDING PATHWAYS
The information from these peripheral afferent fibers ascends along two main pathways to the cortex. One pathway is the spinothalamic tract, which carries information from the small diameter fibers and conveys information from the nociceptive, thermoreceptive, and itch afferents. The other pathway is the dorsal-column medial lemniscal pathway, which conveys information from the large diameter afferents, namely the mechanoreceptive and proprioceptive afferents. Neurophysiological studies suggest that intermediate neurons along this pathway function mainly as relay stations and that there is little convergence or divergence of information. Those studies show that there is a tight correspondence between the firing of neurons in the dorsal column nuclei (DCN) and their peripheral afferent counterparts (Coleman, Zhang, & Rowe, 2003; Gynther, Vickery, & Rowe, 1995) with afferent fibers able to directly generate spikes in DCN neurons. Similarly, studies of neurons in the ventroposterior lateral nucleus of the thalamus (VPL) show that these neurons also have small receptive fields, which suggests that there is little convergence of afferent input in VPL as well (Wang, Merzenich, Sameshima, & Jenkins, 1995). Thus, the first processing station of tactile information appears to be in areas 3a, 3b, 1, and 2, which are the four areas that comprise primary somatosensory (SI) cortex (Figures 14.5, 14.6, and 14.7).
[Figure 14.5 diagram labels: Dorsal Horn (Noci-, Thermo-, Mechano-); DCN (Mechano-, Proprio-); Thalamus (VPI, VPL, VPS); SI (areas 3a, 3b, 1, 2; Proprio-, Mechano-); area 5; SII (SIIa, SIIc, SIIp); 7b; Ri; Insula.]
Figure 14.5 Block diagram of the anatomical pathways of the somatosensory cortex.
[Figure 14.6 panel labels, panels (A) and (B): postcentral gyrus (first somatosensory cortex); intralaminar and posterior groups of thalamic nuclei; ventral posterolateral nucleus of thalamus; second somatosensory cortex; periaqueductal gray matter; spinal lemniscus; medial lemniscus; gracile nucleus; cuneate nucleus; internal arcuate fibers; pontine reticular formation; decussation of the medial lemnisci; medullary reticular formation; cuneate fasciculus; gracile fasciculus; dorsal part of lateral funiculus; spinothalamic tract; ventral white commissure; cervical, thoracic, and lumbosacral levels.]
Figure 14.6 Ascending pathways of the dorsal-column medial lemniscal pathway (A) that carries information about mechanoreceptive and proprioceptive inputs and the spinothalamic pathway (B) that carries information from pain, temperature, and itch afferents.
CORTICAL PROCESSING OF TACTILE INFORMATION
Primary Somatosensory Cortex
The neural responses of the SI cortex, which is the main target of thalamic VPL neurons, are in the initial stages of being understood. Neurons in 3a receive their inputs from the shell region surrounding VPL and respond primarily to deep input, which suggests that 3a is responsible for processing proprioceptive information, particularly joint position and joint velocity (Gardner, 1988). This area may play an important role in sensory prosthetic devices where electrical stimulation is used to signal limb and hand position. Perhaps the most extensively studied region of SI is area
Note: From “Somatic Sensation” (p. 775), in Fundamental Neuroscience, M. J. Zigmond, F. E. Bloom, S. C. Landis, J. L. Roberts, and L. R. Squire (Eds.), 1999, San Diego, CA: Academic Press. Reprinted with permission.
3b, which receives its primary input from the core region of the thalamic neurons in VPL. Area 3b is an important processing stage for tactile information and is responsible for extracting fundamental features of the cutaneous inputs. This is demonstrated in ablation studies in which animals are unable to perform tactile tasks that require cutaneous inputs, including texture and form discrimination tasks (Randolph & Semmes, 1974). The responses of neurons in 3b suggest that it may function like neurons in V1 cortex. As shown in Figure 14.8, neurons in 3b typically have receptive fields composed of a central excitatory region flanked by one or more inhibitory regions which, like simple cells in the visual system, provide these neurons with feature selectivity to spatial stimuli. As shown, the excitatory and
[Figure 14.7 labels: central sulcus; postcentral gyrus; intraparietal sulcus; posterior parietal lobule; lateral sulcus; SI; SII; areas 1, 2, 3a, 3b, 4, 5, and 7; thalamus (ventral lateral, ventral posterior lateral); cortex; cutaneous input; deep input.]
Figure 14.7 A: Lateral view of the brain showing the locations of primary (SI) and secondary (SII) cortex. B: Cross section of the postcentral gyrus showing the locations of the four areas that make up the SI cortex. C: Projection pattern from the ventrolateral complex of the thalamus to SI cortex.
Figure 14.8 Linear receptive fields of neurons in area 3b of SI cortex.
[Panels (A) to (I), area 3b; cell counts as extracted: n = 89 (36%), n = 39 (16%), n = 35 (14%), n = 38 (15%), n = 22 (9%), n = 2 (1%), n = 2 (1%), n = 8 (3%), n = 12 (5%).]
Note: Black represents excitatory regions, white inhibitory regions. From “Structure of Receptive Fields in Area 3b of Primary Somatosensory Cortex in the Alert Monkey,” by J. J. DiCarlo, K. O. Johnson, and S. S. Hsiao, 1998, Journal of Neuroscience, 18, pp. 2626–2645. Adapted with permission.
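A linear receptive field of the kind shown in Figure 14.8 can be sketched as a grid of weights that is multiplied point by point with the stimulus and summed. The example below is a minimal illustration under stated assumptions: the 5 x 5 grid, the weight values (an excitatory column flanked by inhibitory columns), the rectification at zero, and the function names are all invented for the example rather than taken from the cited study.

```python
def make_oriented_rf(rows=5):
    """Excitatory center column (+2) flanked by inhibitory columns (-1).
    Each row sums to zero, so uniform indentation produces no drive."""
    column_weights = [0.0, -1.0, 2.0, -1.0, 0.0]
    return [list(column_weights) for _ in range(rows)]


def linear_response(rf, stimulus):
    """Weighted sum of the stimulus over the receptive field, rectified at 0."""
    drive = sum(w * s
                for rf_row, stim_row in zip(rf, stimulus)
                for w, s in zip(rf_row, stim_row))
    return max(0.0, drive)


def bar(orientation, size=5):
    """Binary image of a one-element-wide bar through the grid center."""
    mid = size // 2
    if orientation == "vertical":
        return [[1.0 if c == mid else 0.0 for c in range(size)]
                for _ in range(size)]
    if orientation == "horizontal":
        return [[1.0 if r == mid else 0.0] * size for r in range(size)]
    raise ValueError("orientation must be 'vertical' or 'horizontal'")


rf = make_oriented_rf()
uniform = [[1.0] * 5 for _ in range(5)]

print(linear_response(rf, bar("vertical")))
print(linear_response(rf, bar("horizontal")))
print(linear_response(rf, uniform))
```

The model neuron responds to a bar aligned with its excitatory subregion and gives no response to the orthogonal bar or to uniform indentation, which is the sense in which flanking inhibition confers orientation selectivity.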
Note: From “Somatic Sensation” (p. 781), in Fundamental Neuroscience, M. J. Zigmond, F. E. Bloom, S. C. Landis, J. L. Roberts, and L. R. Squire (Eds.), 1999, San Diego, CA: Academic Press. Reprinted with permission.
inhibitory subregions of the receptive field are oriented and, as such, many neurons in 3b show selectivity to oriented bars and edges (Figure 14.9). Neurometric functions that were computed by estimating the population response to oriented bars that differ by different degrees suggest that neurons in 3b have thresholds of about 20 degrees, which is similar to human psychophysical thresholds (Bensmaia, Hsiao, Denchev, Killebrew, & Craig, 2008; Figure 14.10). These results suggest that cutaneous form processing begins with the extraction of information about orientation in area 3b of somatosensory cortex. Neurons in 3b have an additional inhibitory component that lags behind the initial response. Figure 14.11 illustrates the spatial-temporal receptive field of typical tactile neurons and demonstrates that the initial spatial response is replaced by in-field inhibition that inhibits the response after about 34 to 40 ms. This replacing inhibition is thought to play an important role in providing these neurons with velocity invariance to scanned stimuli (DiCarlo & Johnson, 1999). Neurons in area 1 have responses that in many ways are similar to those found in area 3b. For example, the responses in both areas are almost exclusively cutaneous,
[Plot residue from Figures 14.9 and 14.11. Figure 14.11 panels: (A) Total STRF area (mm2) for SA1, RA, 3b, and 1; (B) Average area (mm2) of excitation and inhibition for SA1 and RA. Figure 14.9: responses to scanned (dashed) and indented (solid) bars; horizontal axis Orientation (°).]
Figure 14.9 Orientation tuning properties of an area 3b neuron. Note: Top left shows raster plots of the response of a neuron to scanned and indented bars. Bottom left shows that the orientation tuning is similar for scanned and indented bars. From “The Representation of Stimulus Orientation in the Early Stages of Somatosensory Processing,” by S. J. Bensmaia, P. V. Denchev, J. F. Dammann III, J. C. Craig, and S. S. Hsiao, 2008, Journal of Neuroscience, 28, pp. 776–786. Adapted with permission.
[Plot residue from Figures 14.9 and 14.11. Figure 14.9: indented bars; vertical axis Firing rate (ips). Figure 14.11: Average area (mm2) of excitation and inhibition for areas 3b and 1; horizontal axis Delays (ms).]
Figure 14.11 Spatiotemporal receptive fields of peripheral SAI, RA and cortical neurons in areas 3b and 1. Note: Shows the evolution of areas of excitation and inhibition over time following stimulus onset. There is an initial spatial response followed by a period of replacing inhibition. The inhibition in the initial SAI response is due to skin mechanics. From “Spatiotemporal Receptive Fields of Peripheral Afferents and Cortical Area 3b and 1 Neurons in the Primate Somatosensory System,” by A. P. Sripati, T. Yoshioka, P. Denchev, S. S. Hsiao, and K. O. Johnson, 2006, Journal of Neuroscience, 26, pp. 2101–2114. Reprinted with permission.
Figure 14.10 Neurometric functions for a population of 3b neurons.
[Plot axes: p (0.4 to 0.9) versus Δ° (0 to 80).]
Note: Threshold (75%) discriminating two bars at different orientations is about 20 degrees. From “The Representation of Stimulus Orientation in the Early Stages of Somatosensory Processing,” by S. J. Bensmaia, P. V. Denchev, J. F. Dammann III, J. C. Craig, and S. S. Hsiao, 2008, Journal of Neuroscience, 28, pp. 776–786. Adapted with permission.
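A neurometric function of this kind can be sketched with signal detection theory. The example is a simplified stand-in, not the population analysis used in the cited study: it assumes a single response measure that changes linearly with orientation, equal-variance Gaussian noise, and a two-interval comparison, for which proportion correct is Φ(d′/√2). The slope and noise values are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def proportion_correct(delta_deg, slope_ips_per_deg, noise_sd_ips):
    """Predicted proportion correct for two bars differing by `delta_deg`.

    d' is the difference in mean response divided by the noise standard
    deviation; for a two-interval comparison, p = Phi(d' / sqrt(2)).
    """
    d_prime = abs(slope_ips_per_deg * delta_deg) / noise_sd_ips
    return STANDARD_NORMAL.cdf(d_prime / 2 ** 0.5)


def threshold_deg(slope_ips_per_deg, noise_sd_ips, criterion=0.75):
    """Orientation difference at which proportion correct reaches `criterion`."""
    z = STANDARD_NORMAL.inv_cdf(criterion)
    return z * 2 ** 0.5 * noise_sd_ips / abs(slope_ips_per_deg)


# Hypothetical tuning slope (impulses/s per degree) and response noise.
slope, noise_sd = 0.5, 10.0

for delta in (0, 20, 40, 60, 80):
    print(delta, round(proportion_correct(delta, slope, noise_sd), 3))

print(threshold_deg(slope, noise_sd))
```

With these invented parameters the function rises from chance (0.5) at zero difference toward 1.0 at large differences, and the 75% point depends only on the ratio of response noise to tuning slope.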
and neurons in area 1 have linear receptive fields consisting of excitatory and inhibitory subregions followed by replacing inhibition. Further, neurons in area 1 show similar tuning to orientation as area 3b neurons (Bensmaia, Denchev, et al., 2008). However, there are several studies suggesting that area 1 plays a different role in perception. First, animals seem to show different behavioral deficits following ablation of area 1. In contrast to area 3b, whose loss devastates the animals' tactile abilities, behavioral effects following the loss of area 1 are relatively mild and seem to mainly affect the ability of animals to detect changes in texture. Furthermore, neurons in area 1 have larger RFs that span multiple fingers and tend to have larger inhibitory areas. Perhaps most significantly, area 1 neurons have responses that are poorly explained by linear mechanisms.
These findings suggest that area 1 is at a higher stage of processing than 3b and plays a role in extracting information about more complex spatially invariant features of tactile stimuli. Area 2 is the first place where proprioceptive and mechanoreceptive inputs converge. Although there have been relatively few studies of area 2, our current understanding suggests that it plays an important role in extracting features related to 3D object recognition. Animals with area 2 ablated are unable to discriminate large shapes that require integration of cutaneous and proprioceptive inputs (Murray & Mishkin, 1984), and neurophysiological and imaging studies suggest that neurons in area 2 respond selectively to objects that differ in 3D shape (Bodegard, Geyer, Grefkes, Zilles, & Roland, 2001; Iwamura & Tanaka, 1978).
Parietal Operculum (SII cortex)
There are two main projections from SI cortex. One is directed caudally to areas 5 and 7, and the other is directed ventrally to the SII cortex, which is located in the upper bank of the parietal operculum. Several studies suggest that areas 5 and 7 may not be related directly to tactile discrimination. Murray and Mishkin (1984) demonstrated
that animals with lesions of these areas show only mild deficits in making tactile discrimination judgments, suggesting that areas 5 and 7 may not be directly involved in object shape discrimination. An alternative hypothesis is that these areas are involved in the perception of the immediate extrapersonal space and are responsible for directing attention to where the body is located in space and for guiding the body and hand to targets in that space (Gardner et al., 2007; Gardner, Ro, Babu, & Ghosh, 2007; Jeannerod, Arbib, Rizzolatti, & Sakata, 1995; Mountcastle, Lynch, Georgopoulos, Sakata, & Acuna, 1975; Sakata & Iwamura, 1978). The other projection is directed ventrally toward the second somatosensory cortex (SII). Although SII also receives a direct projection from VPL, there is strong evidence suggesting that it is also important for object discrimination and is the next processing stage beyond SI cortex. As with area 3b, animals with SII ablated are unable to discriminate the shapes and textures of objects. Neurophysiological studies of SII show that it is not a single area but, like SI cortex, is composed of several areas: at least three in monkeys and four in humans (Figure 14.12; Eickhoff, Schleicher, Zilles, & Amunts, 2006; Fitzgerald, Lane, Thakur, & Hsiao, 2004; Hinkley, Krubitzer, Nagarajan, & Disbrow, 2006). While the roles and functions of these areas in SII are unknown, evidence from studies in nonhuman primates suggests that they play different roles in perception. Two areas, namely the anterior and posterior parts of SII (SIIa and SIIp), have neurons that respond well to both cutaneous and proprioceptive inputs, which suggests that these areas are involved in 3D object perception. In contrast, the central area of SII (SIIc) contains mainly neurons that respond to cutaneous inputs, suggesting that it may be important for processing inputs related to 2D form and texture (Fitzgerald et al., 2004). Neurons in SII tend to have much larger and more elaborate receptive fields than neurons in the SI cortex. In contrast to areas 3b and 1, where 50% to 70% of the neurons show orientation-tuned responses (Bensmaia, Denchev, et al., 2008; Hsiao, Lane, & Fitzgerald, 2002), only about 20% to 30% of the neurons in SII show orientation tuning, with more tuned neurons in SIIc (Fitzgerald, Lane, Thakur, & Hsiao, 2006b). Showing that there are fewer orientation-tuned neurons, however, does not imply that SII is less important for processing object shape. The RF structures of SII neurons are particularly intriguing. Although some neurons have simple receptive fields confined to a single finger pad, most neurons have receptive fields with complex combinations of finger pads, with some pads showing orientation-tuned responses, others having untuned excitatory responses, and still others having untuned inhibitory responses (Figure 14.13; Fitzgerald, Lane, Thakur, & Hsiao, 2006a). These receptive fields are not spatially homogeneous, and we believe that they are well suited for coding features of large objects that span multiple fingers and may underlie the neural representation of 3D shape (Fitzgerald, Lane, Thakur, & Hsiao, 2006b; Haggard, 2006).

Figure 14.12 Somatotopic map of the SII cortex. Note: SII is composed of three fields that contain three complete maps of the body: an anterior field (SIIa), a central field (SIIc), and a posterior field (SIIp). The left graph shows an unfolded map of the upper bank of the lateral sulcus (UBLS). Panel labels: UBLS, LBLS, Insula; axes: Proportion versus Horsley-Clarke AP coordinate (mm). From “Receptive field properties of the macaque second somatosensory cortex: Evidence for multiple functional representations,” by P. J. Fitzgerald, J. W. Lane, P. H. Thakur, and S. S. Hsiao, 2004, Journal of Neuroscience, 24, p. 11202. Reprinted with permission.
Figure 14.13 (Figure C.24 in color section) Receptive fields of neurons in the SII cortex. Note: The receptive fields of neurons that had one or more tuned pads (left). Each set of squares represents the RF of a single neuron. Within each set are the responses from the distal, middle, and proximal pads of digits 2 through 5 (see top left neuron for an example). Red are pads that showed excitation; blue, inhibition. Pads that were orientation tuned have a bar oriented at the preferred orientation. Right graph shows the raster plot from a single SII neuron, illustrating that the tuning is similar across pads. Legend categories: untuned excitatory pad, untuned inhibitory pad, orientation-tuned pad, unresponsive pad; axis labels: Spikes/sec, Orientation. From “Receptive Field (RF) Properties of the Macaque Second Somatosensory Cortex: RF Size, Shape, and Somatotopic Organization,” by P. J. Fitzgerald, J. W. Lane, P. H. Thakur, and S. S. Hsiao, 2006a, Journal of Neuroscience, 26, p. 6490; and “Receptive Field Properties of the Macaque Second Somatosensory Cortex: Representation of Orientation across Finger Pads,” by P. J. Fitzgerald, J. W. Lane, P. H. Thakur, and S. S. Hsiao, 2006b, Journal of Neuroscience, 26, p. 6475. Reprinted with permission.
There are two additional lines of evidence supporting the hypothesis that SII is important for 2D and 3D shape processing. One comes from studies showing that, in neurons whose receptive fields contain multiple tuned pads, the orientation tuning is similar across pads, suggesting that oriented edges of objects that span multiple fingers are integrated by these neurons. The other comes from studies showing that most neurons in SII show position-invariant responses to oriented edges placed on a single finger pad (Thakur, Fitzgerald, Lane, & Hsiao, 2006). This is important because invariant responses are a hallmark of higher stages of sensory processing where the neural representations are matched to stored memories.

Processing of Touch beyond the SII Cortex

The processing of tactile information beyond SII is not well understood; however, other areas of the cortex have been shown to be important for processing information about form (Prather, Votaw, & Sathian, 2004) and vibration (Romo & Salinas, 2003). Of particular importance are studies by Romo and his colleagues, who have systematically mapped the neural pathways that underlie the decision process in animals making vibratory discriminations. Further, neuroimaging and studies using transcranial magnetic stimulation (TMS) in humans suggest that other areas of the cortex may play important roles in processing tactile form and texture. Candidate areas include the anterior parts of the intraparietal cortex (IPA), supramarginal gyrus (Bodegard et al., 2001), right posterior intraparietal sulcus (pIPS; Stilla, Deshpande, LaConte, Hu, & Sathian, 2007), and possibly visual areas (Merabet et al., 2004; Sadato et al., 1996; Zangaladze, Epstein, Grafton, & Sathian, 1999).
EFFECTS OF ATTENTION

A discussion of the mechanisms underlying tactile perception would be incomplete without reference to the perception of touch in relation to higher cognitive functions. Several studies have now shown that selective attention has a major effect on the way that tactile stimuli are perceived. Furthermore, studies in monkeys have shown that neurons in the thalamus, and in primary and especially secondary somatosensory cortex, are affected by selective attention (see Hsiao & Vega-Bermudez, 2002, for a review). In the SII cortex, about 90% of the neurons are affected by attention. Attention effects on firing rate are both additive and multiplicative (Chapman & Meftah, 2005; Sripati & Johnson, 2006). Furthermore, attention affects not only the firing rate but also the degree of synchronous firing across the population of neurons (Roy, Steinmetz, Hsiao, Johnson, & Niebur, 2007; Steinmetz et al., 2000; Figure 14.14). Studies in humans using electrocorticographic (ECoG) recordings suggest that the effects are particularly pronounced in the high gamma range (Ray, Niebur, Hsiao, Sinai, & Crone, 2008) and that neurons in the frontal lobe play a role in the cognitive but not the sensory aspects of touch. The results suggest that sensory inputs are continuously being modulated by higher cognitive functions as they are processed along the pathways leading to sensation and perception.
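The distinction between additive and multiplicative attention effects on firing rate can be made concrete with a small numerical sketch. The function name, the gain of 1.5, the offset of 5 spikes/sec, and the unattended rates below are all hypothetical values chosen for illustration; they are not taken from the cited studies.

```python
def attended_rate(unattended_rate, gain=1.0, offset=0.0):
    """Toy model (an assumption for illustration): attention scales the
    unattended rate by `gain` and shifts it by `offset` (spikes/sec).
    Rates are clipped at zero because firing rates cannot be negative."""
    return max(0.0, gain * unattended_rate + offset)


# Hypothetical unattended responses to four stimuli (spikes/sec).
unattended = [5.0, 10.0, 20.0, 40.0]

# Purely additive effect: every response rises by the same amount.
additive = [attended_rate(r, offset=5.0) for r in unattended]

# Purely multiplicative effect: the increase grows with the response.
multiplicative = [attended_rate(r, gain=1.5) for r in unattended]

for r, a, m in zip(unattended, additive, multiplicative):
    print(f"unattended={r:5.1f}  additive={a:5.1f}  multiplicative={m:5.1f}")
```

In this sketch an additive effect leaves the difference between strong and weak responses unchanged, whereas a multiplicative effect enlarges it, which is one way the two kinds of modulation could be told apart in recorded data.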
SUMMARY

Tactile perception is rich and multidimensional in nature. It is through the sense of touch that we directly experience and interact with the environment around us. For example, consider what happens when you reach down to pick up a baseball. Initially, the hand is guided by visual inputs that tell the somatosensory system where the ball is in relation to your body. This information is used to guide the arm to the correct location of the ball. As the hand reaches out to grasp the ball, the synergies between the muscles, nerves, and joints define a set of movements that shape the hand in a way that is appropriate for grasping it securely. As the fingers close around the ball, contact at each of the finger pads activates all of the mechanoreceptors and thermoreceptors in the skin. The nociceptors and the itch receptors are silent because there is no adequate stimulus. The thermoreceptors provide information about the temperature, while the SAI afferents provide an isomorphic representation of the local spatial patterns of stimulation. The SAI afferent discharge is used by cortical circuits in areas 3b, 1, and SII, which extract information about the local features of the surface at the points of contact. As the ball is lifted, microslips of the ball relative to the skin are detected by the RA afferents, whose signals act as feedback and increase grip force until it exceeds the safety margin needed to prevent slipping. The PC afferents are firing but most likely provide no useful information for the task, and as such the inputs from these afferents are suppressed in the cortex. Meanwhile, proprioceptive inputs from the skin, joints, and muscles provide information about hand conformation. These inputs from the different digits are combined, perhaps in area 3a, to form representations of the synergistic movements of the hand and digits. The cutaneous and proprioceptive inputs are then integrated by neurons in area 2 (Bodegard et al., 2001; Ostry & Romo, 2001). The processing then proceeds to the three areas that make up SII cortex, where information about local features from the different fingers is combined to form representations of the size and shape of the ball. Neurons in SII then project to multimodal areas in the parietal, occipital, and inferotemporal cortex, where the tactile representations are compared and matched against previously stored representations of objects. It is after the match is made with the stored memories that the observer perceives the shape and texture of the baseball.

Figure 14.14 (Figure C.25 in color section) Effects of attention on the responses of neurons in SII cortex. Note: Left graph shows the change in firing rate in an animal performing a tactile letter discrimination task (letter is 6.0 mm high, scanned at 20 mm/sec across the finger). Bottom solid line represents the rate when the animal's attention is distracted away from the letter and the animal is performing a visual task. Middle dashed line and top solid line are the rates evoked when the animal performed the tactile task and correctly (hit) or incorrectly (miss) identified the letter. Right graph shows the change in synchronous firing between pairs of neurons recorded simultaneously while the animal performed the visual task (right) or tactile task (left). Blue dots are when the neurons fired synchronously. Below each plot are the firing rates of the two neurons and a plot of the expected (red) and actual (blue) number of synchronous events one would expect based on the changes in firing rate. Panel labels: (A) Tactile hit, Tactile miss, Visual task; axes: Impulse rate (ips) versus Horizontal location (mm); Trial number, Spikes/s, and Coincidences/s versus Time (s). From “Effects of Selective Attention of Spatial Form Processing in Monkey Primary and Secondary Somatosensory Cortex,” by S. S. Hsiao, D. M. O’Shaughnessy, and K. O. Johnson, 1993, Journal of Neurophysiology, 70, p. 446; and “Attention Modulates Synchronized Neuronal Firing in Primate Somatosensory Cortex,” by P. N. Steinmetz et al., 2000, Nature, 404, p. 187.

REFERENCES

Barker, D., Emonet-Denand, F., Laporte, Y., Proske, U., & Stacey, M. J. (1973). Morphological identification and intrafusal distribution of the endings of static fusimotor axons in the cat. Journal of Physiology, 230, 405–427.
Bensmaia, S. J., Denchev, P. V., Dammann, J. F., III, Craig, J. C., & Hsiao, S. S. (2008). The representation of stimulus orientation in the early stages of somatosensory processing. Journal of Neuroscience, 28, 776–786. Berryman, L. J., Yau, J. M., & Hsiao, S. S. (2006). Representation of object size in the somatosensory system. Journal of Neurophysiology, 96, 27–39. Blake, D. T., Hsiao, S. S., & Johnson, K. O. (1997). Neural coding mechanisms in tactile pattern recognition: The relative contributions of slowly and rapidly adapting mechanoreceptors to perceived roughness. Journal of Neuroscience, 17, 7480–7489. Blake, D. T., Johnson, K. O., & Hsiao, S. S. (1997). Monkey cutaneous SAI & RA responses to raised and depressed scanned patterns: Effects of width, height, orientation, and a raised surround. Journal of Neurophysiology, 78, 2503–2517. Bodegard, A., Geyer, S., Grefkes, C., Zilles, K., & Roland, P. E. (2001). Hierarchical processing of tactile shape in the human brain. Neuron, 31, 317–328.
Bell, J., Bolanowski, S. J., & Holmes, M. H. (1994). The structure and function of pacinian corpuscles: A review. Progress in Neurobiology, 42, 79–128.
Brisben, A. J., Hsiao, S. S., & Johnson, K. O. (1999). Detection of vibration transmitted through an object grasped in the hand. Journal of Neurophysiology, 81, 1548–1558.
Benedetti, F. (1985). Processing of tactile spatial information with crossed fingers. Journal of Experimental Psychology: Human Perception and Performance, 11, 517–525.
Chapman, C. E., & Meftah, E. (2005). Independent controls of attentional influences in primary and secondary somatosensory cortex. Journal of Neurophysiology, 94, 4094–4107.
Bensmaia, S., Hsiao, S. S., Denchev, P., Killebrew, J. H., & Craig, J. C. (2008). The tactile perception of stimulus orientation. Somatosensory and Motor Research, 25, 49–59.
Coleman, G. T., Zhang, H. Q., & Rowe, M. J. (2003). Transmission security for single kinesthetic afferent fibers of joint origin and their target cuneate neurons in the cat. Journal of Neuroscience, 23, 2980–2992.
Connor, C. E., Hsiao, S. S., Phillips, J. R., & Johnson, K. O. (1990). Tactile roughness: Neural codes that account for psychophysical magnitude estimates. Journal of Neuroscience, 10, 3823–3836.
Connor, C. E., & Johnson, K. O. (1992). Neural coding of tactile texture: Comparisons of spatial and temporal mechanisms for roughness perception. Journal of Neuroscience, 12, 3414–3426.
Craig, J. C. (2003). The effect of hand position and pattern motion on temporal order judgments. Perception and Psychophysics, 65, 779–788.
Dandekar, K., Raju, B. I., & Srinivasan, M. A. (2003). 3-D finite-element models of human and monkey fingertips to investigate the mechanics of tactile sense. Journal of Biomechanical Engineering, 125, 682–691.
Davidson, P. W. (1972). Haptic judgments of curvature by blind and sighted humans. Journal of Experimental Psychology, 93, 43–55.
Diamond, J., Mills, L. R., & Mearow, K. M. (1988). Evidence that the merkel cell is not the transducer in the mechanosensory merkel cell-neurite complex. Progress in Brain Research, 74, 51–56.
DiCarlo, J. J., & Johnson, K. O. (1999). Velocity invariance of receptive field structure in somatosensory cortical area 3b of the alert monkey. Journal of Neuroscience, 19, 401–419.
DiCarlo, J. J., Johnson, K. O., & Hsiao, S. S. (1998). Structure of receptive fields in area 3b of primary somatosensory cortex in the alert monkey. Journal of Neuroscience, 18, 2626–2645.
Edin, B. B., & Abbs, J. H. (1991). Finger movement responses of cutaneous mechanoreceptors in the dorsal skin of the human hand. Journal of Neurophysiology, 65, 657–670.
Eickhoff, S. B., Schleicher, A., Zilles, K., & Amunts, K. (2006). The human parietal operculum: Pt. I. Cytoarchitectonic mapping of subdivisions. Cerebral Cortex, 16, 254–267.
Fitzgerald, P. J., Lane, J. W., Thakur, P. H., & Hsiao, S. S. (2004). Receptive field properties of the macaque second somatosensory cortex: Evidence for multiple functional representations. Journal of Neuroscience, 24, 11193–11204.
Fitzgerald, P. J., Lane, J. W., Thakur, P. H., & Hsiao, S. S. (2006a). Receptive field (RF) properties of the macaque second somatosensory cortex: RF size, shape, and somatotopic organization. Journal of Neuroscience, 26, 6485–6495.
Fitzgerald, P. J., Lane, J. W., Thakur, P. H., & Hsiao, S. S. (2006b). Receptive field properties of the macaque second somatosensory cortex: Representation of orientation on different finger pads. Journal of Neuroscience, 26, 6473–6484.
Gynther, B. D., Vickery, R. M., & Rowe, M. J. (1995). Transmission characteristics for the 1:1 linkage between slowly adapting type II fibers and their cuneate target neurons in cat. Experimental Brain Research, 105, 67–75. Haggard, P. (2006). Sensory neuroscience: From skin to object in the somatosensory cortex. Current Biology, 16, R884–R886. Hendry, S. H. C., Hsiao, S. S., & Bushnell, M. C. (1999). Somatic sensation. In M. J. Zigmond, F. E. Bloom, S. C. Landis, J. L. Roberts, & L. R. Squire (Eds.), Fundamental neuroscience (pp. 761–789). San Diego, CA: Academic Press. Hinkley, L. B., Krubitzer, L., Nagarajan, S., & Disbrow, E. A. (2006). Sensorimotor integration in S2, PV, and the parietal rostroventral areas of the human Sylvian fissure. Journal of Neurophysiology, 97, 1288–1297. Hollins, M., Faldowski, R., Rao, S., & Young, F. (1993). Perceptual dimensions of tactile surface texture: A multidimensional-scaling analysis. Perception and Psychophysics, 54, 697–705. Hsiao, S. S. (1998). Similarities between touch and vision. In J. W. Morley (Ed.), Neural aspects of tactile sensation (pp. 131–165). Amsterdam: Elsevier. Hsiao, S. S., Lane, J. W., & Fitzgerald, P. (2002). Representation of orientation in the somatosensory system. Behavioural Brain Research, 135, 93–103. Hsiao, S. S., O’Shaughnessy, D. M., & Johnson, K. O. (1993). Effects of selective attention of spatial form processing in monkey primary and secondary somatosensory cortex. Journal of Neurophysiology, 70, 444–447. Hsiao, S. S., & Vega-Bermudez, F. (2002). Attention in the somatosensory system. In N. J. Nelson (Ed.), The somatosensory system: Deciphering the brain’s own body image (pp. 197–217). Boca Raton: CRC Press. Iggo, A., & Andres, K. H. (1982). Morphology of cutaneous receptors. Annual Review of Neuroscience, 5, 1–31. Iwamura, Y., & Tanaka, M. (1978). Postcentral neurons in hand region of area 2: Their possible role in the form discrimination of tactile objects. Brain Research, 150, 662–666. 
Jeannerod, M., Arbib, M. A., Rizzolatti, G., & Sakata, H. (1995). Grasping objects: The cortical mechanisms of visuomotor transformation. [Review]. Trends in Neurosciences, 18, 314–320.
Foulke, E. (1991). Braille. In M. A. Heller & W. Schiff (Eds.), The psychology of touch (pp. 219–233). Hillsdale, NJ: Erlbaum.
Johansson, R. S., & Vallbo, Å. B. (1979). Detection of tactile stimuli: Thresholds of afferent units related to psychophysical thresholds in the human hand. Journal of Physiology, 297, 405–422.
Gardner, E. P. (1988). Somatosensory cortical mechanisms of feature detection in tactile and kinesthetic discrimination. Canadian Journal of Physiology and Pharmacology, 66, 439–454.
Johnson, K. O. (2001). The roles and functions of cutaneous mechanoreceptors. Current Opinion in Neurobiology, 11, 455–461.
Gardner, E. P., Babu, K. S., Reitzen, S. D., Ghosh, S., Brown, A. S., Chen, J., et al. (2007). Neurophysiology of prehension: Pt. I. Posterior parietal cortex and object-oriented hand behaviors. Journal of Neurophysiology, 97, 387–406. Gardner, E. P., & Palmer, C. I. (1990). Simulation of motion on the skin: Pt. III. Mechanisms used by rapidly adapting cutaneous mechanoreceptors in the primate hand for spatiotemporal resolution and two-point discrimination. Journal of Neurophysiology, 63, 841–859. Gardner, E. P., Ro, J. Y., Babu, K. S., & Ghosh, S. (2007). Neurophysiology of prehension: Pt. II. Response diversity in primary somatosensory (S-I) and motor (M-I) cortex. Journal of Neurophysiology, 97, 1656–1670. Grigg, P., & Hoffman, A. H. (1982). Properties of ruffini afferents revealed by stress analysis of isolated sections of cat knee capsule. Journal of Neurophysiology, 47, 41–54. Guinard, D., Usson, Y., Guillermet, C., & Saxod, R. (2000). PS-100 and NF 70–200 double immunolabeling for human digital skin meissner corpuscle 3D imaging. Journal of Histochemistry and Cytochemistry, 48, 295–302.
Kappers, A. M. L., & Koenderink, J. J. (1996). Haptic unilateral and bilateral discrimination of curved surfaces. Perception, 25, 739–749.
Keller, H. (1919). The world I live in. New York: Century.
Klatzky, R. L., Lederman, S. J., & Metzger, V. A. (1985). Identifying objects by touch: An “expert system.” Perception and Psychophysics, 37, 299–302.
Lederman, S. J., & Klatzky, R. L. (1987). Hand movements: A window into haptic object recognition. Cognitive Psychology, 19, 342–368.
Macefield, V. G., Hager-Ross, C., & Johansson, R. S. (1996). Control of grip force during restraint of an object held between finger and thumb: Responses of cutaneous afferents from the digits. Experimental Brain Research, 108, 155–171.
Merabet, L., Thut, G., Murray, B., Andrews, J., Hsiao, S., & Pascual-Leone, A. (2004). Feeling by sight or seeing by touch? Neuron, 42, 173–179.
Mountcastle, V. B., Lynch, J. C., Georgopoulos, A. P., Sakata, H., & Acuna, C. (1975). Posterior parietal association cortex of the monkey: Command functions for operations within extrapersonal space. Journal of Neurophysiology, 38, 871–908.
Murray, E. A., & Mishkin, M. (1984). Relative contributions of SII and area 5 to tactile discrimination in monkeys. Behavioural Brain Research, 11, 67–85.
Sripati, A. P., Bensmaia, S. J., & Johnson, K. O. (2006). A continuum mechanical model of mechanoreceptive afferent responses to indented spatial patterns. Journal of Neurophysiology, 95, 3852–3864.
Ogawa, H. (1996). The merkel cell as a possible mechanoreceptor cell. Progress in Neurobiology, 49, 317–334.
Sripati, A. P., & Johnson, K. O. (2006). Dynamic gain changes during attentional modulation. Neural Computation, 18, 1847–1867.
Ostry, D. J., & Romo, R. (2001). Tactile shape processing. Neuron, 31, 173–174.
Sripati, A. P., Yoshioka, T., Denchev, P., Hsiao, S. S., & Johnson, K. O. (2006). Spatiotemporal receptive fields of peripheral afferents and cortical area 3b and 1 neurons in the primate somatosensory system. Journal of Neuroscience, 26, 2101–2114.
Phillips, J. R., Johnson, K. O., & Hsiao, S. S. (1988). Spatial pattern representation and transformation in monkey somatosensory cortex. Proceedings of the National Academy of Sciences, 85, 1317–1321.
Phillips, J. R., & Johnson, K. O. (1981a). Tactile spatial resolution: Pt. II. Neural representation of bars, edges, and gratings in monkey primary afferents. Journal of Neurophysiology, 46, 1192–1203.
Phillips, J. R., & Johnson, K. O. (1981b). Tactile spatial resolution: Pt. III. A continuum mechanics model of skin predicting mechanoreceptor responses to bars, edges, and gratings. Journal of Neurophysiology, 46, 1204–1225.
Phillips, J. R., Johnson, K. O., & Browne, H. M. (1983). A comparison of visual and two modes of tactual letter resolution. Perception and Psychophysics, 34, 243–249.
Prather, S. C., Votaw, J. R., & Sathian, K. (2004). Task-specific recruitment of dorsal and ventral visual areas during tactile perception. Neuropsychologia, 42, 1079–1087.
Proske, U., Wise, A. K., & Gregory, J. E. (2000). The role of muscle receptors in the detection of movements. Progress in Neurobiology, 60, 85–96.
Randolph, M., & Semmes, J. (1974). Behavioral consequences of selective ablations in the postcentral gyrus of Macaca mulatta. Brain Research, 70, 55–70.
Ray, S., Niebur, E., Hsiao, S. S., Sinai, A., & Crone, N. E. (2008). High-frequency gamma activity (80–150 Hz) is increased in human cortex during selective attention. Clinical Neurophysiology, 119, 116–133.
Stilla, R., Deshpande, G., LaConte, S., Hu, X., & Sathian, K. (2007). Posteromedial parietal cortical activity and inputs predict tactile spatial acuity. Journal of Neuroscience, 27, 11091–11102.
Talbot, W. H., Darian-Smith, I., Kornhuber, H. H., & Mountcastle, V. B. (1968). The sense of flutter-vibration: Comparison of the human capacity with response patterns of mechanoreceptive afferents from the monkey hand. Journal of Neurophysiology, 31, 301–334.
Thakur, P. H., Bastian, A. J., & Hsiao, S. S. (2008). Multi-digit movement synergies of the human hand in an unconstrained haptic exploration task. Journal of Neuroscience, 28, 1271–1281.
Thakur, P. H., Fitzgerald, P. J., Lane, J. W., & Hsiao, S. S. (2006). Receptive field properties of the macaque second somatosensory cortex: Nonlinear mechanisms underlying the representation of orientation within a finger pad. Journal of Neuroscience, 26, 13567–13575.
Vega-Bermudez, F., & Johnson, K. O. (1999). Surround suppression in the responses of primate SA1 and RA mechanoreceptive afferents mapped with a probe array. Journal of Neurophysiology, 81, 2711–2719.
Vega-Bermudez, F., Johnson, K. O., & Hsiao, S. S. (1991). Human tactile pattern recognition: Active versus passive touch, velocity effects, and patterns of confusion. Journal of Neurophysiology, 65, 531–546.
Romo, R., & Salinas, E. (2003). Flutter discrimination: Neural codes, perception, memory and decision making. Nature Reviews: Neuroscience, 4, 203–218.
Wang, X., Merzenich, M. M., Sameshima, K., & Jenkins, W. M. (1995, November 2). Remodelling of hand representation in adult cortex determined by timing of tactile stimulation [see comments]. Nature, 378, 71–75.
Roy, A., Steinmetz, P. N., Hsiao, S. S., Johnson, K. O., & Niebur, E. (2007). Synchrony: A neural correlate of somatosensory attention. Journal of Neurophysiology, 98, 1645–1661.
Westling, G., & Johansson, R. S. (1987). Responses in glabrous skin mechanoreceptors during precision grip in humans. Experimental Brain Research, 66, 128–140.
Sadato, N., Pascual-Leone, A., Grafman, J., Ibanez, V., Delber, M. P., Dold, G. R., et al. (1996, April 11). Activation of the primary visual cortex by Braille reading in blind subjects. Nature, 380, 526–528.
Yoshioka, T., Bensmaia, S. J., Craig, J. C., & Hsiao, S. S. (2007). Texture perception through direct and indirect touch: An analysis of perceptual space for tactile textures in two modes of exploration. Somatosensory and Motor Research, 24, 53–70.
Sakata, H., & Iwamura, Y. (1978). Cortical processing of tactile information in the first somatosensory and parietal association areas in the monkey. In G. Gordon (Ed.), Active touch: The mechanism of recognition of objects by manipulation: A multi-disciplinary approach (pp. 55–72). Oxford: Pergamon Press.
Srinivasan, M. A., & LaMotte, R. H. (1995). Tactual discrimination of softness. Journal of Neurophysiology, 73, 88–101.
Steinmetz, P. N., Roy, A., Fitzgerald, P. J., Hsiao, S. S., Johnson, K. O., & Niebur, E. (2000, March 9). Attention modulates synchronized neuronal firing in primate somatosensory cortex. Nature, 404, 187–190.
Yoshioka, T., Gibb, B., Dorsch, A. K., Hsiao, S. S., & Johnson, K. O. (2001). Neural coding mechanisms underlying perceived roughness of finely textured surfaces. Journal of Neuroscience, 21, 6905–6916. Zangaladze, A., Epstein, C. M., Grafton, S. T., & Sathian, K. (1999, October 21). Involvement of visual cortex in tactile discrimination of orientation. Nature, 401, 587–590.
Chapter 15
Personal and Extrapersonal Spatial Perception
GIUSEPPE VALLAR AND ANGELO MARAVITA
fundamental classes of representations of objects in space, including the subjects’ own body. In egocentric coordinate frames, the position of objects is coded with reference to the whole body of the subject, or of body parts (e.g., the arm, the hand), giving rise to representations, which may be head-centered (in the visual domain, resulting from the combination of the retinotopic map with information about eye position), trunk-centered (based also on information about the position of the head, and about posture), arm-centered, and so forth (Lacquaniti, 1997). In allocentric coordinate frames, objects are primarily coded with reference to their spatial and configurational properties, such as the relationships between their component parts, and among different objects present in the environment. Egocentric representations may be used for the organization of goal-directed movements, such as reaching a target or avoiding a harmful stimulus (Figure 15.1). Allocentric representations, encoding the configurational properties of objects and the relationships among them, may be useful for their identification and for navigation in space. In ecological conditions, objects are typically perceived from a variety of egocentric (observer-based) perspectives, suggesting a close interaction between these two types of frames of reference (Vallar, 2003). The neuropsychological findings from brain-damaged patients provide, through selective patterns of impairment, definite evidence for a fractionation of the internal map of space into a number of discrete, though interrelated, components. In humans, there is a well-established hemispheric asymmetry that attaches to the right hemisphere a main role for spatial processing, with the left hemisphere being mainly concerned with language (Milner, 1971). 
As for spatial cognition, this is definitely suggested by the neuropsychological evidence that spatial impairments, such as unilateral neglect, are most frequently associated with right brain damage (Bisiach & Vallar, 2000). The hemispheric asymmetry in spatial processing has been characterized as a “left hemisphere deficit rather than as a right hemisphere specialization”: during evolution, language and other
MULTIPLE REPRESENTATIONS AND FRAMES OF SPATIAL REFERENCE Human beings, as well as animals, live in a complex environment. They continuously receive and process signals concerning objects in space and the spatial position of their body through different sensory modalities (visual, auditory, somatosensory, and vestibular). They continuously move and are able to keep track of the position of their body and of the location of objects in the space around them. These complex skills, essential to survival, comprise the perceptual processing of different sensory inputs from a continuously changing environment and the programming and execution of motor acts. These include pointing to and reaching for objects through grasping and locomotion (Vallar, 2003). The subjective, phenomenal, experience of space is largely unitary (Rizzolatti, Fadiga, Fogassi, & Gallese, 1997). However, when the experience of the world, namely of the space around us, is considered with reference to a person, who perceives objects and makes movements, the space may be conceived as the medium whereby the position of things, including the body, becomes possible (MerleauPonty, 1945). Accordingly, our body, with the objects around us, gives rise to relationships such as “top” and “bottom,” “left” and “right,” “near” and “far.” Our unitary phenomenal experience of space involves the integration of sensory and motor information that builds up internal representations of the body-in-space and of the space around us. First, the processing of sensory inputs produces representations of the stimulus in primary sensory cortices, that are specific to each sensory modality, retinotopic in vision (see Chapter 11), somatotopic in the tactile domain (see Chapter 14) (while for audition the primary sensory cortex has a tonotopic, nonspatial representation; Chapter 12). 
The integration of visual, auditory, and somatosensory information with signals (eye position, vestibular, proprioceptive) concerned with the position of the body and of body parts in space results in two
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Figure 15.1 The midsagittal plane of the body, which divides the extrapersonal space and the body into the left and the right sides, and other body coordinate systems and axes of rotation. Note. An object with its intrinsic axes. c.g. = center of gravity. From figure 1.3, p. 7, “Human Spatial Orientation,” by I. P. Howard and W. B. Templeton, 1966, London: Wiley. Adapted with permission.
[Figure labels: X axis (roll), Y axis (pitch), Z axis (yaw); midfrontal (coronal), midsagittal, and midtransverse planes; top, bottom, front, back, left, right; c.g.]
cognitive processes may have co-opted left-hemisphere neural tissue, previously devoted to visuo-spatial processing (Funnell, Corballis, & Gazzaniga, 2000).
PERSONAL AND EXTRAPERSONAL SPACE
In the context of the spatial reference frames illustrated in Figure 15.1, a main distinction has been drawn between a personal space, namely the space of the self (ego-space), that is normally located within the limits of the body space, and an extrapersonal space. Phenomenologically, and with reference to the actions that can be performed in it, the extrapersonal space has been further subdivided into a number of components. The grasping space may include, in turn, a number of subcomponents, with reference to the motor effector that is used, in order to perform actions: the whole body (general grasping space), the mouth (perioral and intraoral space), and the hand (manual space). The grasping space, by using instruments such as a rake, can be extended (instrumental grasping space, see section on “tools and mirrors”). The space beyond grasping has been further fractionated. The near-distant action space would amount to a few meters (6 to 8) around the
body. Its limits are shown by the finding that subjects, after a few steps when blindfolded, become more and more unsure about their position in space. Beyond the near-distant action space, the far-distant space is mainly a visual space that is phenomenologically perceived as non-Euclidean, with a nonlinear scaling of distance (Grüsser & Landis, 1991). On the basis of visual parameters, such as ordinal depth-threshold functions, Cutting and Vishton (1995) hypothesize three different classes of distance around an observer: personal space (generally within arm’s reach and slightly beyond), action space, and vista space (beyond about 30 m). A distinction between discrete representations of peripersonal space (mainly concerned with visuomotor operations in near-body space), and of more distant extrapersonal spaces also characterizes the three-dimensional (3D) model of Previc (1998). The representations of extrapersonal spaces, in turn, include a focal component (mainly concerned with visual search and object recognition), an action component (orienting in topographically defined space, such as in navigation), and a most distant ambient extrapersonal component (mainly concerned with orienting in earth-fixed space). Perhaps, the first scientific piece of evidence of a dissociation between personal and extrapersonal space comes
Personal and extrapersonal Spatial Perception
from the observation that patients suffering from peripheral disorders of the vestibular system (labyrinthitis) may experience a disproportionate increase (hyperschematia) or decrease (hyposchematia) of the subjective size of their whole body, or of body parts, or a pathological displacement of them (paraschematia) (Bonnier, 1905; see a reappraisal of Bonnier’s observations in Vallar & Papagno, 2003). A few years later, on the basis of observations in brain-damaged patients, the suggestion was made that an internal representation of the body exists (the body schema), largely nonconscious, a combined standard, resulting from previous postures and movements, and mainly concerned with keeping track and updating the position of the body and of body parts, for the purpose of forthcoming postures and movements. Another schema, more superficial, was supposed to be involved in the localization of tactile stimuli (Head & Holmes, 1911). To summarize, there is consensus that the internal representation of space is not unitary. The main distinction drawn is between personal (i.e., the body) and extrapersonal (i.e., the space around us) representations. Within extrapersonal space, a peripersonal (i.e., within hand/arm reach) near space is distinguished from a more distant space. This far space, in turn, has been further subdivided into subcomponents, that may differ in some specific aspects, according to each particular model.
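As a rough illustration of the three distance classes attributed above to Cutting and Vishton (1995), the following minimal sketch assigns an egocentric distance to personal, action, or vista space. The function name and the 2 m limit for personal space are assumptions made for the example (the text only says "within arm's reach and slightly beyond"); the 30 m limit for vista space comes from the text.

```python
def classify_distance(distance_m, personal_limit_m=2.0, vista_limit_m=30.0):
    """Assign an egocentric distance (in meters) to one of three classes.

    personal_limit_m is an illustrative assumption standing in for
    "arm's reach and slightly beyond"; vista_limit_m follows the
    "beyond about 30 m" boundary mentioned in the text.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m <= personal_limit_m:
        return "personal"
    if distance_m <= vista_limit_m:
        return "action"
    return "vista"


if __name__ == "__main__":
    for d in (0.5, 10.0, 100.0):
        print(d, classify_distance(d))
```

The boundaries are kept as parameters because, as the summary above notes, the subdivision of far space differs from one model to another.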
NEUROPSYCHOLOGICAL DISSOCIATIONS
Extrapersonal versus Personal Space
The syndrome of unilateral spatial neglect has provided definite information as to the existence and independence of discrete representations of different sectors of space. Spatial neglect is a multicomponent disorder, whereby patients fail to explore the side of space contralateral to the cerebral lesion (contralesional), and do not report sensory events (e.g., visual stimuli, touches delivered to the contralesional hand) occurring in that sector of space. The disorder is more frequent and severe after damage to the right cerebral hemisphere and concerns the left side of personal and extrapersonal space (Bisiach & Vallar, 2000). Typically, patients show left spatial neglect for both extrapersonal and personal space (i.e., their body). However, dissociations have been reported between these two manifestations of the disorder. Early observations showed that right-brain-damaged patients may present with extrapersonal visual neglect, without neglect for the left side of their body (Paterson & Zangwill, 1944, patient #1). Other patients may present with neglect for the left side of the body, with no evidence of extrapersonal neglect
(Bisiach, Perani, Vallar, & Berti, 1986). Guariglia and Antonucci (1992) studied in more detail a right-brain-damaged patient who had no extrapersonal neglect in cancellation, reading, drawing, and perceptual tasks, but “was unable to look at his own left leg while walking with a cane and was unable to utilize the residual movement skills of the left side of the body.” These selective patterns of impairment (double dissociation, see Vallar, 2000) suggest that discrete neural systems involved in the representation of personal and extrapersonal space exist in the brain. These representations are likely to be built up, modulated, and updated through the integration of different sensory inputs. One piece of illustrative evidence for this modulation and integration comes from studies showing that muscular vibration may illusorily modify the perceived image of the body, such as the so-called Pinocchio’s illusion (Lackner, 1988), or the displacement of visual targets (Biguer, Donaldson, Hein, & Jeannerod, 1988). In the neuropsychological domain, a variety of sensory stimulations may improve or worsen many manifestations of the neglect syndrome (Kerkhoff, 2003; Rossetti & Rode, 2002; Vallar, Guariglia, & Rusconi, 1997). The selective impairments described here, however, cannot be traced back to sensory disorders, which may nevertheless contribute to shape the deficit (Bisiach & Vallar, 2000). Extrapersonal neglect may occur without any associated visual or somatosensory impairment (Bisiach et al., 1986). Personal neglect, conversely, is much more closely associated with sensory (vision, tactile perception, position sense) impairments (Bisiach et al., 1986). However, individual case studies show that sensory deficits may be absent, or mild, in patients with personal neglect (Guariglia & Antonucci, 1992; Ortigue, Mégevand, Perren, Landis, & Blanke, 2006).
These findings suggest that neglect for the left side of the body cannot be entirely traced back to defective sensory inputs, but reflects the impairment of higher-order representations of the body.
Extrapersonal Space: The Far versus Near Distinction
Brain (1941) described three right-brain-damaged patients who were impaired in localizing objects by pointing in the contralesional half-field, both within arm reach, and, two of them, at a greater distance. One patient (case #3), however, did not run into objects, and “his defective localization appeared to be limited to objects within arm’s length.” On the basis of these observations, Brain suggested a distinction between processes involved in the estimation of “walking distance,” and processes concerned with the estimation of “grasping distance,” with possible discrete neural correlates. With explicit reference to a fractionation
of spatial representations related to the motor effectors used to perform actions, Brain distinguished a “manual,” and a “brachial space,” as well as a far space, where objects can be reached through locomotion. Many years later, Halligan and Marshall (1991) reported a right-brain-damaged patient who showed contralesional left neglect in near peripersonal space, and, particularly, in line bisection, both manual, and with a projection light pen. The patient’s rightward error was however greatly reduced when the bisection by the light pen took place in far space. In fact, the patient was well able to play a traditional pub game, namely throwing darts at a circular target hung on a wall, from a distance of about 2.5 m. Actually, the patient was consistently more accurate than Peter Halligan, and always hit the dart board. The spatial pattern of the darts thrown did not show any discernable spatial deviation and they were often close to the center (John Marshall and Peter Halligan, personal communication). Cowey, Small, and Ellis (1994) in five right-brain-damaged patients found the opposite dissociation, namely a greater impairment for lines well beyond reach. To summarize, the study of right-brain-damaged patients with left neglect suggests that different right-hemispherebased neural systems are involved in the representation of near or peripersonal (within hand/arm reach) versus far, distant, sectors of extrapersonal space. One interpretation of this dichotomy, that prima facie clashes with our phenomenal experience of the unity of extrapersonal space, is related to the different effectors (e.g., an arm-reaching movement, a saccade), that may be recruited to perform actions toward specific objects, located in different sectors of extrapersonal space, with respect to the body (Berti & Rizzolatti, 2002). 
Neuropsychological studies, however, have also shown that the near/far dichotomy may be revealed through perceptual paradigms that do not require motor actions toward a target (Pitzalis, Di Russo, Spinelli, & Zoccolotti, 2001). Furthermore, another fractionation of sectors of space (top versus bottom, see Figure 15.1) is revealed by brain damage: vertical or altitudinal neglect for the lower (Rapcsak, Cimino, & Heilman, 1988), or the upper (Shelton, Bowers, & Heilman, 1990) peripersonal sectors of space. The more frequent report of neglect for the lower sector of space is in line with the finding that left spatial neglect also has an altitudinal component, being more severe in the left lower sector of extrapersonal space (Halligan & Marshall, 1989; Pitzalis & Di Russo, 2001). The top/bottom dichotomy is more difficult to accommodate with reference to different effectors such as the arm-hand system, even though different directions of eye movements may be a relevant factor. A similar argument is provided by the observation of neglect confined to front or back space (Vallar, Guariglia,
Nico, & Bisiach, 1995). Finally, the representations of these sectors of extrapersonal space (far versus near and upper versus lower) may be distinguished in terms of both motor, effector-related (Berti & Rizzolatti, 2002), and sensory, perceptual, factors (Cutting & Vishton, 1995; Previc, 1998), that contribute to their building up and updating.
Spatial Coding of Touch: Localizing Tactile Sensations When Crossing the Hands
When we look straight ahead, there is an exact correspondence between the left and the right for both the objects in the space around us, and our own left and right body parts. For example, the left shoulder corresponds to the left side of extrapersonal space relative to the egocentric reference frame (see Figure 15.1). However, for mobile body parts, such as the hands, that are typically displaced in different spatial positions, during most actions, this exact correspondence does not always hold. In particular, when the hands cross the midline, there may be a left hand in right peripersonal space and vice versa. This means that a somatosensory stimulus may need to be coded in different spatial reference frames at the same time. One such frame is an egocentric code basically corresponding, in the example mentioned earlier, to the right- or to the left-hand-side with respect to the midsagittal plane of the trunk. Other frames are centered on the position of each single body part in space, with respect to both other body parts and near objects. In this view, the localization of a tactile stimulus on a hand takes into account not only the somatosensory input (which impinges directly on the somatosensory cortex), but also the position of the hand in space. As we discuss later, the integration of signals coming from vision, proprioception, and touch is essential to localize somatosensory stimuli delivered to mobile body parts.
While in most instances of daily life this integration is perfectly efficient, hand crossing may not be fully compensated by the brain in some specific situations. In a seminal study, Yamamoto and Kitazawa (2001) showed that temporal order judgment of tactile stimuli on the hands was much impaired when the hands were crossed. This result suggests that the localization of the stimulus takes into account not only the hand that is being stimulated, but also the spatial location of the stimulus in external space. Gathering this information may have a cost when the hands are crossed and the anatomical and spatial features of the stimulus do not coincide. The effect of hand crossing in patients affected by disorders of tactile perception may also disclose the relative role of somatosensory and spatial mapping of external stimuli for touch perception. Right-brain-damaged patients with left spatial neglect or extinction may show defective
awareness of contralesional touches. However, when the hands are crossed, so that the contralesional hand is moved in the ipsilesional side of space, the perception of contralesional touches improves (Aglioti, Smania, & Peru, 1999; but see Bartolomeo, Perri, & Gainotti, 2004, for partially different results, although with a rather different task; Smania & Aglioti, 1995). Indeed, this effect may be useful in clinical practice, where it may help distinguish primary sensory deficits (not, or only scarcely, modulated by postural changes) from contralesional inattention. In a logically related experiment with a patient with right parietal damage and left neglect, Valenza, Murray, Ptak, and Vuilleumier (2004) found that a double stimulation of the right ipsilesional forearm (touches delivered to the elbow and the hand) induced a correct perception of both stimuli when the arm was in an anatomical, uncrossed posture, namely to the right of the body’s midline. However, when the right forearm was placed across the midline, the patient failed to report touches given to the hand. Apparently, the patient extinguished the more distal stimulus that was delivered in a left spatial position, with respect to the body’s midsagittal plane. These results suggest that the perception of a tactile stimulus may be determined not only by somatosensory factors, but also by the spatial position of the touched skin, with reference to egocentric coordinate frames, centered on the individual body parts. This cross-talk between different coordinate systems allows us to keep track of our body parts in space and of any single stimulus delivered to them, supporting a high degree of flexibility during motor acts. The role of egocentric spatial coordinate frames in perceptual awareness of sensory events is not confined to the somatosensory domain. 
In the visual domain, Kooistra and Heilman (1989) found that the left hemianopia of one right-brain-damaged patient improved when her eyes were directed toward the right side. In this condition, where left visual half-field testing fell in the right half-space, the patient’s left hemianopia improved significantly. In sum, spatial coding of sensory events appears to play an important role for perceptual awareness, possibly indicating a close relationship between detection and localization in space (Gallace & Spence, 2008; Vallar, 2007a).
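The distinction drawn in this section between the anatomical code of a touch (which hand was stimulated) and its external spatial code (the side of space, relative to the midsagittal plane of the trunk, where that hand currently lies) can be made concrete with a minimal sketch. The function names are hypothetical and the model is deliberately binary (left versus right, crossed versus uncrossed); it is meant only to show when the two codes disagree, not to model any specific experiment.

```python
def external_side(hand, crossed):
    """Side of external space occupied by a hand.

    hand: "left" or "right" (anatomical code).
    crossed: True if the hands are crossed over the body midline.
    """
    if hand not in ("left", "right"):
        raise ValueError("hand must be 'left' or 'right'")
    if not crossed:
        return hand
    return "right" if hand == "left" else "left"


def codes_conflict(hand, crossed):
    """True when the anatomical and external codes of a touch differ."""
    return external_side(hand, crossed) != hand


if __name__ == "__main__":
    for crossed in (False, True):
        for hand in ("left", "right"):
            print(hand, "crossed" if crossed else "uncrossed",
                  "->", external_side(hand, crossed),
                  "| conflict:", codes_conflict(hand, crossed))
```

In this toy description, the crossed posture is the only case in which the two codes conflict, which is the situation in which the studies reviewed above report either a cost (temporal order judgments) or a benefit (contralesional touches delivered in ipsilesional space).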
MULTISENSORY CODING OF SPACE
A key aspect of space representation has received much interest from the vantage point of the cognitive neurosciences. In natural conditions, most stimuli in external space, both far and near to the body, present to our senses as a combination of multisensory information (e.g., Calvert, Spence, & Stein, 2004). Often, when we see a dog or a car,
the visual stimulation is accompanied by a barking or roaring sound. Furthermore, when we want to catch a Frisbee flying toward us, the vision of the approaching object is completed by the somatosensory information that accompanies its contact with our hand, when we finally catch it. Our brain is equipped to integrate multisensory inputs in order to give us a seamless, unitary perception of the outside world. The importance of such an effective multisensory integration is that of increasing our efficiency in orienting toward, detecting, and manipulating external objects. For example, it is faster and easier to visually detect and orient toward a bird hidden among tree branches if we can also hear it chattering. Here, the integration of vision and audition is critical. Similarly, when we want to reach and manipulate an object, tactile and proprioceptive inputs are critically integrated with visual ones. In general, our perceptual system is designed to take into account all the possible information coming from the space around us. When we orient our attention to a particular object of interest, we automatically start to process any information coming from that object, regardless of the sensory modality we attend to. In fact, a good framework to investigate the effects of crossmodal interaction on human behavior is to refer to the typical paradigms studying the orienting of spatial attention (see a collection of review papers in Spence & Driver, 2004). Attention can be deployed toward a certain stimulus, either involuntarily (exogenous attention) or voluntarily (endogenous attention). In both situations, a stimulus in one sensory modality can interfere with, or improve, the response to a stimulus in another modality. A typical exogenous attention situation is illustrated by the experiment by Spence and Driver (1997), who presented auditory pure-tone cues to the left or to the right of central fixation. 
After an interval, either a visual or an auditory target is presented ipsilaterally or contralaterally to the cue. Ipsilateral cues improve not only the responses to auditory stimuli (intramodal facilitation), but also those to visual targets (crossmodal facilitation). An automatic crossmodal enhancement of perception has been obtained even with subthreshold stimuli: spatially congruent unattended sounds can improve the perception of below-threshold visual targets (Frassinetti, Bolognini, & Làdavas, 2002). An elegant example of crossmodal endogenous orienting of attention is shown by an experiment in which a stream of auditory speech at one spatial location is better decoded if participants actively attend to a video monitor, showing lip movements exactly matching words pronounced in the auditory stream: critically, this facilitation only occurs if the monitor showing the visual stimuli is spatially coincident to the sound (Driver & Spence, 1994). A critical determinant of multisensory integration is the spatial distance between a given sensory event and the
observer. Because our body can reach objects only within a limited space extension, vision and touch are the critical sensory modalities to be integrated when we deal with stimuli close to our body and vision and audition when we deal with stimuli in far space. Several examples of crossmodal integration (between vision and touch in near space and vision and audition in far space) can be illustrated through the crossmodal cuing paradigm (reviews in Spence, Pavani, Maravita, & Holmes, 2004, 2008), largely used to study crossmodal integration. In a typical experimental setting for studying visual-tactile integration, participants have to make a spatial tactile judgment, reporting whether a tactile vibration is delivered to the index or thumb finger of either hand, which is equivalent to an “upper” or “lower” judgment, for the posture that is typically used (Driver & Spence, 1998; Maravita & Driver, 2004). At the same time, visual distracters (LEDs) can be illuminated at any of four positions, one near the index finger (i.e., upper position), and one near the thumb (lower position) for each hand, with all these visual possibilities being equally likely, regardless of where any concurrent tactile target is presented. The tactile judgments are typically slower, and less accurate, if the visual distracter is “incongruent” with the location of the concurrent tactile target (e.g., an upper light near the index finger is combined with a lower vibration at the thumb). Critically, this crossmodal interference effect is reliably larger if the visual distracter appears closer to the tactile target (e.g., in the same hemifield, or, within that hemifield, closer to the current location of the tactually stimulated hand). As discussed next, the hands are highly mobile body parts and the brain must take into account the absolute position of the body in space, in order to allow efficient visual-tactile integration. 
In particular, when the hands are crossed over in space, then the crossmodal interference effect re-maps accordingly, so that an incongruent visual distracter, closest to the current location of the stimulated hand, brings about the largest interference (see, e.g., Driver & Spence, 1998). This remapping effect may require the activity of the posterior parietal cortex (PPC; Bolognini & Maravita, 2007; Lloyd, Shore, Spence, & Calvert, 2003), and the integrity of the interhemispheric connections via the corpus callosum (Spence, Kingstone, Shore, & Gazzaniga, 2001). Bolognini and Maravita (2007) have shown that interfering with transcranial magnetic stimulation (TMS) over the PPC greatly affects the crossmodal facilitation of TMS-induced visual sensations (phosphenes) by spatially coincident touches with hands crossed: a defective functioning of the PPC makes it difficult for the brain to keep track of the position of the hands when they are crossed over, and thus limits the facilitatory crossmodal effect of touch over visual perception. The PPC may be a critical
site of multisensory integration (see also the discussion of its neural basis that follows), for orienting unimodal attention through crossmodal cues via feedback mechanisms (Macaluso & Driver, 2003), and for modulating, together with the temporal cortex, the multisensory responses of subcortical structures (Stein, 2005). Furthermore, the tight mutual links of the PPC with the premotor cortex make it a key structure for action planning and execution by keeping personal (somatosensory) and extrapersonal (visual and auditory) sensory maps in a spatial register, in line with modern views of sensory-motor organization (Rizzolatti, Luppino, & Matelli, 1998). The mutual link between vision and proprioception for multisensory spatial representation of the body has been shown using dummy rubber hands (Pavani, Spence, & Driver, 2000). Some dominance of vision was found, with visual distracters near the dummy rubber hands producing the greatest spatial interference with tactile judgments, provided that the rubber hands were in a plausible posture. Moreover, the extent of crossmodal interference from visual distracters near the dummy rubber hands correlated with the extent to which subjects “felt” that they actually experienced touch in the location of those dummy hands. Finally, in the neuropsychological literature, a great deal of attention has been devoted to the phenomenon of crossmodal extinction. This disorder provides a clue into the existence of a multisensory coding of space. In a seminal study, di Pellegrino, Làdavas, and Farnè (1997) showed that a visual stimulus close to the ipsilesional hand can cause the extinction of a contralesional touch. The deficit is critically reduced when the distance between the visual stimulus and the ipsilesional hand increases. This finding emphasizes, once again, the role of spatial proximity to the body for crossmodal interactions to occur (review in Làdavas, 2002). 
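The design of the visual-tactile interference paradigm described earlier in this section (a tactile target at the index finger or thumb of either hand, and a visual distracter at one of four corresponding positions) can be laid out as a short enumeration. This is only a sketch of the trial structure under the assumption of an uncrossed posture; the field names are hypothetical, and no response times or results are simulated.

```python
from itertools import product

HANDS = ("left", "right")
ELEVATIONS = ("upper", "lower")  # index finger = upper, thumb = lower


def trial_types():
    """Enumerate every target/distracter combination of the paradigm.

    A trial is 'congruent' when the distracter light and the tactile
    target share the same elevation, and 'same_side' when they are at
    the same hand (assuming the hands are not crossed).
    """
    positions = list(product(HANDS, ELEVATIONS))
    trials = []
    for (t_hand, t_elev), (d_hand, d_elev) in product(positions, positions):
        trials.append({
            "target": (t_hand, t_elev),
            "distracter": (d_hand, d_elev),
            "congruent": t_elev == d_elev,
            "same_side": t_hand == d_hand,
        })
    return trials


if __name__ == "__main__":
    trials = trial_types()
    strongest = [t for t in trials if not t["congruent"] and t["same_side"]]
    print(len(trials), "trial types;", len(strongest),
          "incongruent trials with the distracter at the stimulated hand")
```

The incongruent, same-side trials singled out at the end are the ones for which the text reports the largest interference.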
In conclusion, the brain constantly integrates all ongoing sensory stimulation coming from personal and extrapersonal space environments into a seamless reality. This integration takes place regardless of the position of the body, and of its individual parts, in space, and of all of the continuous, intentional, or unexpected, changes of posture or modifications of the position of the sensory inputs in the space around us.
Modulation of Multisensory Space Representation: Tools and Mirrors
As discussed earlier, space comprises different functional sectors. In particular, there is a sector of space within which we can act directly with our hands. However, in daily life activities, our actions are not always performed directly with bodily effectors, such as the hands, but may
be mediated by different kinds of tools. These different tools may guide, or extend, the range of our motor performances. One view of such extension of motor skills by tool use is the idea that tools may be somewhat “incorporated” within our body representation, thus becoming a functional extension of it (for early accounts of this theoretical perspective, see Critchley, 1979; Head & Holmes, 1911; Paillard, 1971). Neurophysiological and neuropsychological studies have now given support to this view, showing that the use of different tools can change the way in which we interact with stimuli in far space. The general idea is that a visual stimulus in far space, which, as noted previously, typically undergoes a weak integration with somatosensory inputs delivered to the body, may become more effective for multisensory integration when it is reached by a tool. A far (visual) stimulus, when reached through the tip of a tool, may start to act as a stimulus near to the body (in near peripersonal space), and increase its influence over tactile processing. In the seminal study by Iriki, Tanaka, and Iwamura (1996), macaque monkeys were shown how to use rakes to retrieve pieces of food in the space out of hand reach. After a few weeks of training, the activity of parietal bimodal visual-tactile neurons was recorded. These neurons had tactile receptive fields representing the hand or the shoulder, responded to visual stimuli only if delivered close to the skin of these body parts, and showed scarce or no responses to far visual stimuli. Crucially, after short training sessions with the tool, the responses of these bimodal visual-tactile neurons to those far visual stimuli, reached by the tool, increased substantially, as if they were functionally considered as lying in near space, and therefore suitable for multisensory interactions. In a similar vein, tool use modulates the behavior of patients affected by spatial neglect or extinction.
In a single patient study, visuospatial neglect selective for near space extended to far space (as shown by the worsening of line bisection performance in far space), when the far lines were bisected with a stick, instead of a laser pointer (Berti & Frassinetti, 2000; see also Pegna et al., 2001, for a related account). Other relevant evidence comes from patients with crossmodal extinction. The logic is similar to the study by Iriki et al. (1996), discussed earlier. If the patient actively uses a tool with the ipsilesional hand to reach for visual stimuli in far space, far visual stimuli (that are nonetheless close to the tool tip) could become more effective in inducing the loss of contralesional touches. Interesting modulations of crossmodal extinction have recently been obtained with tool use. In these experiments, crossmodal extinction was used as a paradigm to test the effect of prolonged activity with tools in modulating crossmodal body representations (for reviews, see Maravita, 2006; Maravita & Iriki, 2004). The general outcome of these studies is
that wielding or using a tool in extrapersonal far space can increase the impact of far ipsilesional visual stimuli on contralesional tactile extinction, as if far space were rendered akin to near space, by becoming reachable with the tool (see Farnè & Làdavas, 2000; Maravita, Husain, Clarke, & Driver, 2001). In another single patient study, the related procedure of using a tool with the contralesional hand to operate in the ipsilesional space with a rake may reduce tactile extinction to left touches, presented simultaneously with right-sided visual flashes. This recovery from left crossmodal extinction may be possibly achieved by constructing a common visual-tactile representation of the two sides of space, hence reducing the competition between multiple stimuli (Maravita, Spence, Sergent, & Driver, 2002, see Figure 15.2).
Figure 15.2 Schematic illustration of the experimental setup and results of an experiment by Maravita. Note. A: Crossmodal extinction is tested with invisible touches at the left hand and visual stimuli (starry symbol) close to the right hand. The patient is holding a tool, but no training has been performed. B: The same testing procedure, after 120 minutes of training, consisting of collecting objects in the right side of space, with a tool held with the left hand. C: Extinction rate is higher before the training (leftmost column), significantly decreased one and 30 minutes after the training (two middle columns), going back to the baseline level after 60 minutes (rightmost column). This result is compatible with an extension of the multisensory, visuo-tactile integration from the near, peripersonal left side, before the training (circle in A), to the tip of the tool (circles in B), after the training. From "Active tool use with the contralesional hand can reduce crossmodal extinction of touch on that hand," by A. Maravita, K. Clarke, M. Husain, and J. Driver (2002). Neurocase, 8, 411–416. Based on figures 2 and 3, pp. 413–414.
[Figure panels: (A) Pretraining; (B) Posttraining; (C) bar chart of Extinction of Left Touch (%), axis 0 to 60, for Baseline and Postrake use at 1′, 30′, and 60′.]
A further set of studies has dealt with another tool (mirrors) that, with a logic similar to that illustrated earlier, may act by mediating our interactions with external space, through a modulation of multisensory integration. Mirrors have indeed become common tools in everyday life activities, from grooming to driving (evidence of disrupted interactions with mirror reflections is found after parietal lesions, see, e.g., Binkofski, Buccino, Dohle, Seitz, & Freund, 1999). In a series of experiments, Maravita and coworkers (Maravita et al., 2002; Maravita, Spence, Clarke, Husain, & Driver, 2000) have shown that, when people observe visual stimuli close to their hands, but reflected in a mirror, those stimuli are automatically interpreted as being in near peripersonal space, no matter the distance suggested by the reflective properties of the mirror, that would account for a stimulus in far space, as if “through the looking glass” (precisely at double the distance between the stimulus and the mirror). Those visual stimuli produce a strong interference in a crossmodal congruency task (Maravita et al., 2000). Similarly, an ipsilesional flash produces more left crossmodal tactile extinction when observed as the distant mirror-reflection of an LED close to the ipsilesional hand, than a distant LED flash projecting an equivalent visual image directly (Maravita et al., 2002), compatible with the visual stimulus being correctly interpreted as being in near space. These results show that the mere knowledge of the physical characteristics of the environment (such as the properties of reflecting surfaces and their effect on the localization of visual stimuli) is enough to bias the subject’s processing of visual stimuli, as well as crossmodal interactions. More generally, these findings suggest that the coding of stimuli in space occurs with an eye on their functional meaning for both perception and action.
Accordingly, no matter how a visual stimulus appears to be in terms of retinal projection (e.g., “far” in the mirror reflection), the subject’s interaction with it occurs according to the knowledge acquired by the brain about its actual position in space.
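The geometry behind the “looking glass” distance can be made concrete with a small sketch. It assumes an idealized plane mirror, with the observer and the stimulus lying on the same line perpendicular to the mirror; the function name and the distances are hypothetical and are not taken from the experiments cited above. For a plane mirror, the virtual image lies as far behind the mirror as the stimulus is in front of it, so a stimulus next to the observer’s hand is seen at roughly twice the hand-to-mirror distance, even though its actual position is within reach.

```python
def mirror_distances(observer_to_mirror_cm, stimulus_to_mirror_cm):
    """Return (actual, apparent) distance of a stimulus from the observer.

    Illustrative assumption: idealized plane mirror, observer and stimulus
    on the same line perpendicular to the mirror surface.
    """
    # Actual distance: observer and stimulus are on the same side of the mirror.
    actual = abs(observer_to_mirror_cm - stimulus_to_mirror_cm)
    # Apparent distance: the virtual image sits behind the mirror, as far
    # behind it as the stimulus is in front of it.
    apparent = observer_to_mirror_cm + stimulus_to_mirror_cm
    return actual, apparent


# Hypothetical layout: eyes 60 cm from the mirror, LED near the hand 50 cm from it.
actual, apparent = mirror_distances(60, 50)
print(f"actual: {actual} cm, apparent (via mirror): {apparent} cm")
```

In this hypothetical layout the retinal image corresponds to a far stimulus, while the stimulus itself lies in near space, which is the dissociation the mirror studies exploit.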
NEURAL BASIS IN MAN AND MONKEY
Different neural structures code space from different perspectives and work in parallel to give us a unitary representation of space. This representation takes into account visual and auditory stimuli that are present in extrapersonal space, as well as their integration with somatosensory stimuli delivered to the body, and information concerning body posture gathered through proprioception. The neural substrate of space representation should be regarded as a network of interconnected structures that manages its different aspects, including the discrete components of personal and extrapersonal space, and their multisensory integration.
A time-honored view associates the representation of personal and extrapersonal space with the PPC, particularly in the right hemisphere (Critchley, 1953; Jewesbury, 1969). In the past decades, the network concerned with spatial cognition has broadened to include the frontal premotor cortex (PMC; see review in Vallar, 2001), and the white matter connections of these frontal regions with the PPC (see review in Bartolomeo, Thiebaut de Schotten, & Doricchi, 2007). Within the PPC, the supramarginal and angular gyri of the inferior parietal lobule appear to be regions relevant for spatial cognition, as well as the temporoparietal junction, and the superior temporal gyrus (see review in Karnath, 2001), although the role of the latter area is more controversial (see review in Marshall, Fink, Halligan, & Vallar, 2002). These conclusions are mainly based on studies in which brain-damaged patients are engaged in tasks, such as target cancellation, drawing, or line bisection, performed in near peripersonal space, within hand reach. Figure 15.3 summarizes the main neural structures concerned with spatial attention and representation as suggested by lesion studies in right-brain-damaged patients with left spatial neglect (A), and in a schematic flow chart, also including subcortical structures (B) (see also Chapters 10 and 18).
Reference Frames
As far as the representation of visual extrapersonal space is concerned, at the first elementary level, in the primary visual cortex (V1), there is the so-called retinotopic representation, whereby any stimulus in the visual field is projected to a given location in the cortex. In this case, the left and right visual fields map onto different hemispheres, with a contralateral representation of each visual field. Critically, the retinotopic representation is completely linked to eye position, since the first map which is formed, and then transferred to V1, is that on the retina.
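Why a retinotopic code is tied to eye position can be illustrated with a one-dimensional sketch. The additive geometry used here (direction relative to the head = retinal eccentricity + eye-in-head angle) is a common simplification and is not taken from the studies cited in this chapter; the function names and angle values are hypothetical.

```python
def retinal_position(target_re_head_deg, eye_in_head_deg):
    """Horizontal retinal eccentricity of a target (degrees, rightward positive)."""
    return target_re_head_deg - eye_in_head_deg


def head_centered_position(retinal_deg, eye_in_head_deg):
    """Recover the target direction relative to the head from retinal and eye signals."""
    return retinal_deg + eye_in_head_deg


target = 10.0  # hypothetical target, fixed 10 deg to the right of the head midline
for eye in (-20.0, 0.0, 20.0):  # three hypothetical gaze directions
    retinal = retinal_position(target, eye)
    recovered = head_centered_position(retinal, eye)
    print(f"eye {eye:+.0f} deg: retinal {retinal:+.0f} deg, re head {recovered:+.0f} deg")
```

In this sketch the same physical target falls in the right or the left visual hemifield depending on gaze, while its direction relative to the head stays constant once the eye-position signal is taken into account.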
Figure 15.3 A: The neural correlates of left spatial neglect. B: Cortico-subcortical networks for spatial attention and representation. Figure labels: posterior parietal, TP junction, frontal premotor, cingulate gyrus, thalamus, basal ganglia, superior colliculus. Note. (A) Most anatomoclinical correlation studies show that the responsible lesion involves the right inferior parietal lobule in the PPC (angular gyrus: BA 39; supramarginal gyrus: BA 40, intermediate gray area), and the temporoparietal junction (white-gray area). Neglect after right frontal damage is less frequent and usually associated with lesions to the PMC, particularly to its more ventral parts (BA 44 and ventral BA 6, dark gray area). Neglect may also be associated with damage to the more dorsal and medial regions of the PMC, and to the superior temporal gyrus (light gray areas). From figure 3, p. 129, “Spatial Cognition: Evidence from Visual Neglect,” by P. W. Halligan, G. R. Fink, J. C. Marshall, and G. Vallar, 2003, Trends in Cognitive Sciences, 7, pp. 125–133. Reprinted with permission. (B) The frontal PMC, the PPC/temporoparietal (TP) junction, the subcortical gray nuclei and their connections. From figure 1, p. 34, “Functional Anatomy of Attention and Neglect: from Neurons to Networks” (pp. 33–45), by M.-M. Mesulam, in The cognitive and neural bases of spatial neglect, H.-O. Karnath, A. D. Milner, and G. Vallar (Eds.), 2002 (Oxford: Oxford University Press). Adapted with permission.
However, other neurons represent space relative to the head, while totally (or partially) ignoring the absolute spatial projection of the stimulus on the retina. For example, neurons in the PPC and in the PMC may represent left and right space relative to the animal’s head, with eye position being largely irrelevant (Duhamel, Bremmer, BenHamed, & Graf, 1997; Fogassi et al., 1992; Galletti, Battaglini, & Fattori, 1993; see also Rizzolatti et al., 1997), or only partially modulating the neuronal response (see Andersen, Snyder, Bradley, & Xing, 1997). In this case, the space around us is coded through a more egocentric, nonretinotopic space representation.
At a higher level, neuroimaging studies in humans have explored the role of different brain structures for the representation of space in different reference frames, clarifying the neural networks supporting egocentric versus allocentric coordinate systems. Galati et al. (2000) asked neurologically unimpaired subjects to judge the position (left or right) of a vertical segment drawn over a horizontal line (lines could be overall shifted to the left, or to the right, relative to the observer’s midsagittal plane, and segments could be placed to the left or to the right of the line’s objective midpoint). In one condition, subjects had to evaluate the spatial position of the vertical segment relative to the observer’s body midline (egocentric frame of reference). In the second task, the position of the vertical segment had to be computed relative to the objective midpoint of the horizontal line (allocentric frame of reference). The patterns of brain activation showed the existence of partly overlapping, though different, cortical networks. In the egocentric condition, activations included the PPC (superior and inferior parietal lobules) from the medial surface down to the temporoparietal junction, and the lateral PMC (superior and inferior frontal gyri; see also Vallar et al., 1999). Bilateral activations were found, although those in the right hemisphere were much more extensive. In the allocentric condition, again a frontoparietal network was activated, this time centered on the superior parietal and intraparietal regions, and the superior frontal sulcus of the right hemisphere. A broadly similar activation of a frontoparietal network, more extensive in the right hemisphere, including the PPC around the intraparietal sulcus, the frontal regions around the precentral and superior frontal sulci, and the inferior
and superior frontal gyri, was later found in a tactile task (Galati, Committeri, Sanes, & Pizzamiglio, 2001).
Personal versus Extrapersonal Space
Individual case reports suggest that personal neglect is associated with right hemispheric damage, involving particularly the PPC, but also the frontal cortex and subcortical structures (Bisiach et al., 1986; Guariglia & Antonucci, 1992), with a lesion pattern broadly overlapping with that found in patients showing extrapersonal neglect. A recent study in 52 right-brain-damaged stroke patients, using lesion density plots and subtraction analysis, has shown an anatomical dissociation between personal and extrapersonal neglect. Personal neglect was assessed by a task requiring the use of familiar objects in the body space, rather than by tasks requiring the exploration of the body, and the reaching of body parts. For extrapersonal neglect, a standard battery, comprising visuo-motor exploratory and perceptual tasks, was used. The authors suggest that a circuit including the right frontal (ventral premotor cortex and middle frontal gyrus) and superior temporal regions is concerned with the representation of extrapersonal space, while the right inferior parietal regions (supramarginal gyrus, postcentral gyrus, and, particularly, the underlying white matter) would support the representation of personal space (Committeri, Pitzalis, Galati, Patria, Pelle, Sabatini, et al., 2007). Data from patients showing selective impairments of pointing to own body parts versus body parts of others, such
as the examiner, suggest a different neurofunctional pattern. In one patient with a neurodegenerative disorder, defective pointing to the patient’s own body parts was associated with dysfunction, as assessed by single photon emission tomography (SPET), of the superior parietal lobule (BA 7) in the left hemisphere. In another patient, the deficit concerned the body parts of the examiner, with the dysfunction involving the left inferior parietal lobule (Felician, Ceccaldi, Didic, Thinus-Blanc, & Poncet, 2003). In a subsequent fMRI experiment performed in neurologically unimpaired subjects, Felician, Romaiguère, et al. (2004) confirmed the role of the superior parietal lobule in the task of pointing to one’s own body parts.
Extrapersonal Space: Near, Peripersonal, versus Far Space
Not all sectors of the space around us may be represented by the same neural structures. In the past 20 years, the notion of space representation has been progressively integrated with that of action planning and execution. From this perspective, space representation has been considered less and less as the mere construction of a “map” of external space, where we can represent objects of interest, and more and more as the locus of integration between perception, action, and awareness. In this view, the neural coding of an object in external space is closely linked to the neural processing necessary to grasp or manipulate that particular object (Maravita, 2006; Rizzolatti et al., 1998). To this aim, a critical distinction has to be made between a near or peripersonal space, where objects can be reached and manipulated, and a far extrapersonal space, which is beyond the hand’s grasp.
Figure 15.4 Representation of a macaque cerebral hemisphere with the arcuate, intraparietal, superior temporal, and lunate sulci (thick lines) opened up. Note. Boundaries of major functional subdivisions within the frontal and the parietal lobe. AIP = anterior intraparietal area; as = arcuate sulcus; cs = central sulcus; DP = dorsal prelunate area; FEF = frontal eye fields; FST = fundus of superior temporal area; ips = intraparietal sulcus; LIP = lateral intraparietal area; ls = lunate sulcus; M1 = motor cortex; MIP = medial intraparietal area; MST = medial superior temporal area; MT = middle temporal area; PIP = posterior intraparietal area; PM d/v = frontal premotor cortex, dorsal/ventral; ps = principal sulcus; S1, S2 = somatosensory cortex; SEF = supplementary eye fields; SMA = supplementary motor area; sts = superior temporal sulcus; V2, 3, 3a, 4, 6, 6a = visual areas; VIP = ventral intraparietal area. See also Colby and Goldberg (1999), and Rizzolatti and Matelli (2003). From figure 1, “Neglect in Monkeys: Effect of Permanent and Reversible Lesions” (pp. 47–58), by C. Wardak, E. Olivier, and J.-R. Duhamel, in The Cognitive and Neural Bases of Spatial Neglect, H. O. Karnath, A. D. Milner, & G. Vallar (Eds.), 2002, Oxford: Oxford University Press. Reprinted with permission.
Monkey Studies
Since the 1980s, the difference between these representations of space has been made clear by cortical ablation studies in the monkey (see Figure 15.4). For example, while the ablation of the frontal prearcuate area 8 of the macaque monkey (frontal eye fields, FEF, in Figure 15.4) produces a lack of awareness of, and reaction to, contralateral stimuli in far space, ablation of the postarcuate area 6 (ventral PMC, including area F4, see Figure 15.4) brings about similar deficits, but limited to the space near the animal’s body (Rizzolatti, Matelli, & Pavesi, 1983; Schieber, 2000). These frontal areas represent space according to the kind of actions (reaching, grasping, eye movements) that can be performed in different sectors of it, and are richly interconnected with specific portions of the parietal cortex (see, e.g., Rizzolatti et al., 1998). Similarly, different portions of the PPC support the representation of different sectors of extrapersonal space. Neurons in the ventral intraparietal (VIP) area are visually responsive, but most can also be
excited by tactile stimuli, with the tactile receptive fields being generally restricted to the head and the face, and the visual and the tactile receptive fields being matched in size and location. Neurons in the medial intraparietal (MIP) area respond to stimuli within reaching distance and their response properties range from purely somatosensory, to bimodal, and to purely visual. Neurons in the anterior intraparietal (AIP) area respond to visual stimuli that the monkey can manipulate, with the represented spatial dimension being the desired shape of the hand, rather than its position in egocentric space. Neurons in the lateral intraparietal (LIP) area respond to the onset of the stimulus, and may maintain activity during the delay, and/or discharge around the time of the saccade: These neurons may represent the space explored by eye movements, the predominant means by which we explore the world beyond our reach (Colby & Goldberg, 1999).
Human Studies
In his seminal study, Brain (1941) suggested that different lesion sites in the right hemisphere are associated with
selective impairments in the localization of objects within arm reach (case #3: damage to the posterior temporal lobe), and at a farther distance. Neuroimaging activation studies in neurologically unimpaired subjects have confirmed this early suggestion and elucidated it in more detail. A task requiring line bisection or pointing to dots in near space activates a number of left hemisphere structures (dorsal occipital, intraparietal, ventral PM cortices, and the thalamus). A similar task performed in far space (eye-to-screen distance 1.7 m) activates the ventral occipital cortex bilaterally, and the right medial temporal cortex (Weiss et al., 2000). A later study by the same group basically confirmed these findings and found that manual bisection activates the extrastriate, superior parietal, and premotor cortex bilaterally, while bisection judgments are associated with activations in the right inferior parietal and dorsolateral prefrontal cortices, the anterior cingulate, and the extrastriate and superior temporal cortices bilaterally, independent of the far versus near condition (Weiss, Marshall, Zilles, & Fink, 2003). An rTMS study has shown that stimulation of the right PPC disrupts the subjects’ performance in a perceptual bisection task (i.e., deciding whether the left or the right side of a line appears longer), in near (50 cm) space. Conversely, stimulation of the right ventral occipital lobe affects such judgments in far (150 cm) space (Bjoertomt, Cowey, & Walsh, 2002).
To summarize, the available evidence suggests that different brain areas and neural networks are involved in the representation of near (within hand/arm reach) versus far extrapersonal space. Experiments with neurologically unimpaired subjects being required to bisect lines or to localize dots suggest a segregation of processing in terms of dorsal versus ventral pathways, with the former contributing to the representation of near space, the latter of far space.
There is also some indication of a hemispheric asymmetry, with a major role of the right hemisphere, particularly in perceptual bisection.
Multisensory Integration
Our experience of the external world is typically gathered through more than one sensory modality at the same time. In this respect, it is critical that relevant neural structures in the brain are capable of integrating multiple, multisensory input into unitary, coherent percepts.
Animal Studies
This kind of integration starts very early in the brain. Work by Stein and Meredith (1993) has shown that many neurons in the deep layers of the cat’s superior colliculus (SC) respond to multiple sensory modalities (vision, touch, audition). For example, both acoustic and visual stimuli can
make the same SC neuron discharge (multisensory neuron). One critical feature of some such multisensory neurons is that their discharge can be markedly enhanced when visual and auditory stimulation are in spatial register (the so-called spatial rule), or temporally synchronous (temporal rule). This condition is indeed compatible with typical situations of daily life, in which a given stimulus provides information through multiple sensory modalities at the same time. Therefore, these cells may enhance our responses to the multisensory events that characterize daily life. This enhanced neural discharge observed with spatially coincident multisensory stimuli corresponds to a faster and more accurate orienting of the animal toward the spatial location of a given stimulus. The multisensory responses found in SC neurons are coordinated with the activity of a number of cortical structures. For example, in the cat, the ectosylvian cortex and the lateral suprasylvian sulcus include neurons which discharge in response to both unimodal and multisensory stimuli, and, critically, exert a descending modulation of the activity of multisensory SC neurons (Stein, 2005; Wallace & Stein, 1994).
In the monkey, a very intriguing set of studies has addressed the issue of multisensory space coding in near peripersonal space, thus providing a neural basis to the functional relationship between the body and the representation of extrapersonal space. Peripersonal acoustic (Graziano, Reiss, & Gross, 1999), visual and tactile (for a recent review, see Maravita, Spence, & Driver, 2003) stimuli are all specifically coded and integrated in the brain. In particular, in order to control object reaching and manipulation, vision and touch are highly interdependent.
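The response enhancement described above for SC neurons is often expressed in the multisensory literature as the percentage gain of the combined-modality response over the best single-modality response. The sketch below uses that convention, which is not spelled out in this chapter; the function name and the spike counts are hypothetical and serve only to illustrate the arithmetic.

```python
def multisensory_enhancement(visual, auditory, combined):
    """Percent gain of the combined response over the best unimodal response."""
    best_unimodal = max(visual, auditory)
    if best_unimodal <= 0:
        raise ValueError("best unimodal response must be positive")
    return 100.0 * (combined - best_unimodal) / best_unimodal


# Hypothetical mean spike counts per trial for a single neuron.
coincident = multisensory_enhancement(visual=4.0, auditory=2.0, combined=10.0)
disparate = multisensory_enhancement(visual=4.0, auditory=2.0, combined=4.0)
print(f"spatially coincident: {coincident:.0f}%, spatially disparate: {disparate:.0f}%")
```

With these made-up values, the spatially coincident pairing yields a response well above the best unimodal one, whereas the disparate pairing yields no gain, which is the qualitative pattern summarized by the spatial rule.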
In the macaque monkey, around 50% of neurons in the ventral premotor cortex (area F4; Fogassi et al., 1996), 70% of neurons in the ventral intraparietal area (VIP; Duhamel, Colby, & Goldberg, 1998), 20% to 30% of neurons in the PPC (BA 7b), and 24% of neurons in the putamen (Graziano & Gross, 1994) show receptive fields (RF) in both the visual (vRF) and the somatosensory (sRF) modalities, meaning that they discharge in response to both visual and tactile stimuli. Furthermore, in some neurons, the visual and somatosensory responses are in spatial register. For instance, if a neuron responds to a touch or a joint displacement on the hand region, a visual response will also occur for stimuli near that hand. A similar pattern of discharge is found for most neurons in areas F4, 7b, the putamen, and for approximately half of the VIP neurons (Graziano & Gross, 1994; Rizzolatti, Scandolara, Matelli, & Gentilucci, 1981). Critically, each one of these cells codes for a region of peripersonal visual space, which is spatially aligned with the preferred somatosensory receptive field of that cell. For example, VIP neurons with somatosensory receptive fields
on the right upper face respond to stimuli presented to the right upper quadrant of the visual field. A premotor neuron with a tactile receptive field on the arm or hand discharges in response to visual stimuli approaching that body part. These cells constitute a functional network of bimodal neurons, supporting a common representation of the body surface, and the visual space nearby, which may be critical for guiding action.
Another relevant finding is that the spatial selectivity of visual responses for some such multisensory neurons in area F4, VIP, and in the putamen is not merely retinotopic. For instance, for many PM neurons, and some neurons in the putamen, with a sRF on the arm, the corresponding vRF may shift along with the arm, if the arm is moved in space (Graziano, Taylor, Moore, & Cooke, 2002), while the effect of gaze shifting may be minimal (Rizzolatti et al., 1997). This neural system may be critical for coding space in egocentric coordinates centered on single body parts, thus putting each body part in strict spatial relationship to any visual event that may occur nearby.
In this view, it is important that the brain keeps a constantly updated representation of the body surface and position, together with that of the space immediately around the body (Graziano & Botvinick, 2002; Maravita, 2006; Maravita et al., 2003). The buildup of this representation occurs through different sensory modalities. While vision is surely critical for coding the position of extrapersonal stimuli relative to the body, and of the body as well, proprioception is highly relevant for updating information about body posture, in particular when changes in the position of the hand (for example, when it crosses the midline) must be tracked without the aid of vision (Graziano, 1999; Obayashi, Tanaka, & Iriki, 2000).
Human Studies
Much work has been devoted to multisensory space representation in humans (Calvert et al., 2004; Spence & Driver, 2004).
For example, Frassinetti et al. (2002) found that spatially congruent unattended sounds can improve the perception of below-threshold visual targets in a fashion that is reminiscent, though without any neurophysiologic support, of the response enhancement shown by SC audiovisual neurons in response to spatially coincident multisensory stimuli. Functional neuroimaging studies have shown that crossmodal binding of audiovisual stimuli may be critical for higher-order cognitive functions, other than spatial localization. In humans, delivering semantically congruent versus incongruent audiovisual speech stimuli modulates the activity in the superior temporal sulcus, as assessed by fMRI (Calvert, Campbell, & Brammer, 2000; Calvert et al., 2004). Within peripersonal space, the neuroimaging work of Macaluso and Driver (2003) has highlighted
the functional link between vision and touch for spatial attention. In extrastriate visual cortex, tactile unattended stimuli enhance responses to attended, spatially coincident, visual targets (Macaluso, Frith, & Driver, 2000). This finding suggests that multimodal stimuli may increase the response of unimodal cortical areas, via back projections from multisensory areas, thus enhancing spatial perceptual processing. Finally, a recent body of evidence supports the idea that, in order to maintain the correspondence between the visual and the somatosensory maps for multisensory integration, the PPC plays a crucial role. In addition to the findings from neuroimaging and TMS experiments mentioned earlier (Bolognini & Maravita, 2007; Lloyd et al., 2003), neuropsychological evidence has been provided by Valenza and coworkers (2004), who reported a disruption of visual-tactile spatial interactions (see the crossmodal congruency task illustrated earlier: Spence et al., 2008) in a patient with bilateral parietal damage and Balint’s syndrome [a complex deficit including impairments of reaching (optic ataxia), and of visuo-spatial attention and orientation (Rizzo & Vecera, 2002; Vallar, 2007b)]. In sum, the vast and multidisciplinary literature on multisensory processing suggests that the integration of information coming from multiple senses has a critical role in the building up of a complete representation of the extrapersonal environment, and of the body, and in implementing and controlling the sensorimotor interactions between the body and objects in space.
SUMMARY
Evidence from neuropsychological studies in brain-damaged patients with disorders of spatial cognition, from experiments in neurologically unimpaired subjects, and from neurophysiological studies in the animal converges to suggest that the internal representation of space includes a number of independent components. Major divisions are between spatial representations of the body versus extrapersonal space and far versus near extrapersonal space. A main factor that accounts for these multiple spatial representations is the kind of action that is performed and the motor effector used (e.g., space within/outside hand reach). Our experience of space is however highly integrated and, phenomenologically, largely unitary. We can perceive distant objects and, at the same time, attend to the voice of a friend nearby, or walk to a distant target location, while we reach for something in our pocket or scratch our head. In all these examples, multiple frames of reference and effectors can be used simultaneously, and many different inputs, located in different space sectors, or on our own body, can be perceived
as a continuous and uniform perceptual experience. This is achieved largely through the integration of the diverse sensory inputs, which continuously reach the different sensory receptors of our body. Furthermore, in the intact brain, the multiple frames of reference are used simultaneously and with comparable efficiency, to achieve effectively any kind of perceptual/motor goal at any given time. The neurological underpinnings of these spatial representations, which link perception and action, are being progressively discovered and clarified by the modern neurosciences. They include the frontal PMC and the PPC, as well as some subcortical structures, such as the thalamus, the basal ganglia, and the SC. These different structures can exert their control over different aspects of space representation because they are, on the one hand, highly specific for given perceptual/motor functions and, on the other hand, highly interconnected through cortico-cortical and cortico-subcortical connections. A unitary representation of space and of the body in space requires the integrity of such complex networks, as shown by the varied clinical pictures described in the neuropsychological literature.
REFERENCES
Aglioti, S., Smania, N., & Peru, A. (1999). Frames of reference for mapping tactile stimuli in brain-damaged patients. Journal of Cognitive Neuroscience, 11, 67–79.
Andersen, R. A., Snyder, L. H., Bradley, D. C., & Xing, J. (1997). Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annual Review of Neuroscience, 20, 303–330.
Bartolomeo, P., Perri, R., & Gainotti, G. (2004). The influence of limb crossing on left tactile extinction. Journal of Neurology, Neurosurgery, and Psychiatry, 75, 49–55.
Bartolomeo, P., Thiebaut de Schotten, M., & Doricchi, F. (2007). Left unilateral neglect as a disconnection syndrome. Cerebral Cortex, 17, 2479–2490.
Berti, A., & Frassinetti, F. (2000). When far becomes near: Remapping of space by tool use. Journal of Cognitive Neuroscience, 12, 415–420.
Berti, A., & Rizzolatti, G. (2002). Coding near and far space. In H. O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 119–129). Oxford: Oxford University Press.
Biguer, B., Donaldson, I. M. L., Hein, A., & Jeannerod, M. (1988). Neck muscle vibration modifies the representation of visual motion and direction in man. Brain, 111, 1405–1424.
Binkofski, F., Buccino, G., Dohle, C., Seitz, R. J., & Freund, H. J. (1999). Mirror agnosia and mirror ataxia constitute different parietal lobe disorders. Annals of Neurology, 46, 51–61.
Bisiach, E., Perani, D., Vallar, G., & Berti, A. (1986). Unilateral neglect: Personal and extrapersonal. Neuropsychologia, 24, 759–767.
Bisiach, E., & Vallar, G. (2000). Unilateral neglect in humans. In F. Boller, J. Grafman, & G. Rizzolatti (Eds.), Handbook of neuropsychology (2nd ed., Vol. 1, pp. 459–502). Amsterdam: Elsevier Science, B.V.
Bjoertomt, O., Cowey, A., & Walsh, V. (2002). Spatial neglect in near and far space investigated by repetitive transcranial magnetic stimulation. Brain, 125, 2012–2022.
Bolognini, N., & Maravita, A. (2007). Proprioceptive alignment of visual and somatosensory maps in the posterior parietal cortex. Current Biology, 17, 1890–1895.
Bonnier, P. (1905). L’aschématie. Revue Neurologique, 13, 605–609.
Brain, W. R. (1941). Visual disorientation with special reference to lesions of the right cerebral hemisphere. Brain, 64, 244–272.
Calvert, G. A., Campbell, R., & Brammer, M. J. (2000). Evidence from functional magnetic resonance imaging of crossmodal binding in the human heteromodal cortex. Current Biology, 10, 649–657.
Calvert, G. A., Spence, C., & Stein, B. E. (Eds.). (2004). The handbook of multisensory processes. Cambridge, MA: MIT Press.
Colby, C. L., & Goldberg, M. E. (1999). Space and attention in parietal cortex. Annual Review of Neuroscience, 22, 319–349.
Committeri, G., Pitzalis, S., Galati, G., Patria, F., Pelle, G., Sabatini, U., et al. (2007). Neural bases of personal and extrapersonal neglect in humans. Brain, 130, 431–441.
Cowey, A., Small, M., & Ellis, S. (1994). Left visuo-spatial neglect can be worse in far than in near space. Neuropsychologia, 32, 1059–1066.
Critchley, M. (1953). The parietal lobes. New York: Hafner.
Critchley, M. (1979). The divine banquet of the brain and other essays. New York: Raven Press.
Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.), Handbook of perception and cognition: Perception of space and motion (Vol. 5, pp. 69–117). San Diego, CA: Academic Press.
di Pellegrino, G., Làdavas, E., & Farnè, A. (1997, August 21). Seeing where your hands are. Nature, 388, 730.
Driver, J., & Spence, C. (1994). Spatial synergies between auditory and visual attention. In C. Umiltà & M. Moscovitch (Eds.), Attention and performance: Conscious and nonconscious information processing (Vol. 15, pp. 311–331). Cambridge, MA: MIT Press.
Driver, J., & Spence, C. (1998). Attention and the crossmodal construction of space. Trends in Cognitive Sciences, 2, 254–262.
Duhamel, J.-R., Bremmer, F., BenHamed, S., & Graf, W. (1997). Spatial invariance of visual receptive fields in parietal cortex neurons. Nature, 389, 845–848.
Duhamel, J.-R., Colby, C. L., & Goldberg, M. E. (1998). Ventral intraparietal area of the macaque: Congruent visual and somatic response properties. Journal of Neurophysiology, 79, 126–136.
Farnè, A., & Làdavas, E. (2000). Dynamic size-change of hand peripersonal space following tool use. NeuroReport, 11, 1645–1649.
Felician, O., Ceccaldi, M., Didic, M., Thinus-Blanc, C., & Poncet, M. (2003). Pointing to body parts: A double dissociation study. Neuropsychologia, 41, 1307–1316.
Felician, O., Romaiguère, P., Anton, J. L., Nazarian, B., Roth, M., Poncet, M., et al. (2004). The role of human left superior parietal lobule in body part localization. Annals of Neurology, 55, 749–751.
Fogassi, L., Gallese, V., di Pellegrino, G., Fadiga, L., Gentilucci, M., Luppino, G., et al. (1992). Space coding by premotor cortex. Experimental Brain Research, 89, 686–690.
Fogassi, L., Gallese, V., Fadiga, L., Luppino, G., Matelli, M., & Rizzolatti, G. (1996). Coding of peripersonal space in inferior premotor cortex (area F4). Journal of Neurophysiology, 76, 141–157.
Frassinetti, F., Bolognini, N., & Làdavas, E. (2002). Enhancement of visual perception by crossmodal visuo-auditory interaction. Experimental Brain Research, 147, 332–343.
Funnell, M. G., Corballis, P. M., & Gazzaniga, M. S. (2000). Hemispheric interactions and specializations: Insights from the split brain. In F. Boller, J. Grafman, & G. Rizzolatti (Eds.), Handbook of neuropsychology (pp. 103–120). Amsterdam: Elsevier Science, B.V.
8/18/09 5:27:21 PM
References 335 Galati, G., Committeri, G., Sanes, J. N., & Pizzamiglio, L. (2001). Spatial coding of visual and somatic sensory information in body-centred coordinates. European Journal of Neuroscience, 14, 737–746.
Lacquaniti, F. (1997). Frames of reference in sensorimotor coordination. In F. Boller & J. Grafman (Eds.), Handbook of neuropsychology (Vol. 11, pp. 27–64). Amsterdam: Elsevier.
Galati, G., Lobel, E., Vallar, G., Berthoz, A., Pizzamiglio, L., & Le Bihan, D. (2000). The neural basis of egocentric and allocentric coding of space in humans: A functional magnetic resonance study. Experimental Brain Research, 133, 156–164.
Làdavas, E. (2002). Functional and dynamic properties of visual peripersonal space. Trends in Cognitive Sciences, 6, 17–22.
Gallace, A., & Spence, C. (2008). The cognitive and neural correlates of “tactile consciousness”: A multisensory perspective. Consciousness and Cognition, 17, 370–407. Galletti, C., Battaglini, P. P., & Fattori, P. (1993). Parietal neurons encoding spatial locations in craniotopic coordinates. Experimental Brain Research, 96, 221–229. Graziano, M. S. A. (1999). Where is my arm? The relative role of vision and proprioception in the neuronal representation of limb position. Proceedings of the National Academy of Sciences, USA, 96, 10418–10421. Graziano, M. S. A., & Botvinick, M. M. (2002). How the brain represents the body: Insights from neurophysiology and psychology. In W. Prinz & B. Hommel (Eds.), Common mechanisms in perception and action: Attention and performance XIX (pp. 136–157). Oxford: Oxford University Press.
Macaluso, E., & Driver, J. (2003). Multimodal spatial representations in the human parietal cortex: Evidence from functional imaging. Advances in Neurology, 93, 219–233. Macaluso, E., Frith, C. D., & Driver, J. (2000, August 18). Modulation of human visual cortex by crossmodal spatial attention. Science, 289, 1206–1208. Maravita, A. (2006). From “body in the brain” to “body in space.” Sensory and intentional components of body representation. In G. Knoblich, I. Thornton, M. Grosjean, & M. Shiffrar (Eds.), Human body perception from the inside out (pp. 65–88). New York: Oxford University Press. Maravita, A., Clarke, K., Husain, M., & Driver, J. (2002). Active tool use with the contralesional hand can reduce cross-modal extinction of touch on that hand. Neurocase, 8, 411–416.
Graziano, M. S. A., & Gross, C. G. (1994). The representation of extrapersonal space: A possible role for bimodal, visual-tactile neurons. In M. S. Gazzaniga (Ed.), The cognitive neurosciences (pp. 1021–1034). Cambridge, MA: MIT Press.
Maravita, A., & Driver, J. (2004). Crossmodal integration and spatial attention in relation to tool-use and mirror-use: Representing and extending multimodal space near the hand. In G. A. Calvert, C. Spence, & B. E. Stein (Eds.), Handbook of multisensory processes (pp. 819–835). Cambridge, MA: MIT Press.
Graziano, M. S. A., Reiss, L. A., & Gross, C. G. (1999, February 4). A neuronal representation of the location of nearby sounds. Nature, 397, 428–430.
Maravita, A., Husain, M., Clarke, K., & Driver, J. (2001). Reaching with a tool extends visual-tactile interactions into far space: Evidence from cross-modal extinction. Neuropsychologia, 39, 580–585.
Graziano, M. S. A., Taylor, C. S. R., Moore, T., & Cooke, D. F. (2002). The cortical control of movement revisited. Neuron, 36, 349–362.
Maravita, A., & Iriki, A. (2004). Tools for the body (schema). Trends in Cognitive Sciences, 8, 79–86.
Grüsser, O.-J., & Landis, T. (1991). Visual agnosias and other disturbances of visual perception and cognition (Vol. 12). Houndmills, Basingstoke: Macmillan. Guariglia, C., & Antonucci, G. (1992). Personal and extrapersonal space: A case of neglect dissociation. Neuropsychologia, 30, 1001–1009. Halligan, P. W., Fink, G. R., Marshall, J. C., & Vallar, G. (2003). Spatial cognition: Evidence from visual neglect. Trends in Cognitive Sciences, 7, 125–133. Halligan, P. W., & Marshall, J. C. (1989). Is neglect (only) lateral? A quadrant analysis of line cancellation. Journal of Clinical and Experimental Neuropsychology, 11, 793–798. Halligan, P. W., & Marshall, J. C. (1991, April 11). Left neglect for near but not far space in man. Nature, 350, 498–500. Head, H., & Holmes, G. (1911). Sensory disturbances from cerebral lesions. Brain, 34, 102–254. Howard, I. P., & Templeton, W. B. (1966). Human spatial orientation. London: Wiley.
Maravita, A., Spence, C., Clarke, K., Husain, M., & Driver, J. (2000). Vision and touch through the looking glass in a case of crossmodal extinction. NeuroReport, 11, 3521–3526. Maravita, A., Spence, C., & Driver, J. (2003). Multisensory integration and the body schema: Close to hand and within reach. Current Biology, 13, 531–539. Maravita, A., Spence, C., Sergent, C., & Driver, J. (2002). Seeing your own touched hands in a mirror modulates cross-modal interactions. Psychological Science, 13, 350–355. Marshall, J. C., Fink, G. R., Halligan, P. W., & Vallar, G. (2002). Spatial awareness: A function of the posterior parietal lobe? Cortex, 28, 253–257. Merleau-Ponty, M. (1945). Phenomenology of perception. London: Routledge, 2002. Mesulam, M.-M. (2002). Functional anatomy of attention and neglect: From neurons to networks. In H.-O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 33–45). Oxford: Oxford University Press.
Iriki, A., Tanaka, M., & Iwamura, Y. (1996). Coding of modified body schema during tool use by macaque postcentral neurones. NeuroReport, 7, 2325–2330.
Milner, B. (1971). Interhemispheric differences in the localization of psychological processes in man. British Medical Bulletin, 27, 272–277.
Jewesbury, E. C. O. (1969). Parietal lobe syndromes. In P. J. Vinken & G. W. Bruyn (Eds.), Handbook of clinical neurology (Vol. 2, pp. 680– 699). Amsterdam: North Holland.
Obayashi, S., Tanaka, M., & Iriki, A. (2000). Subjective image of invisible hand coded by monkey intraparietal neurons. NeuroReport, 11, 3499–3505.
Karnath, H.-O. (2001). New insights into the functions of the superior temporal cortex. Nature Reviews Neuroscience, 2, 568–576.
Ortigue, S., Mégevand, P., Perren, F., Landis, T., & Blanke, O. (2006). Double dissociation between representational personal and extrapersonal neglect. Neurology, 66, 1414–1417.
Kerkhoff, G. (2003). Modulation and rehabilitation of spatial neglect by sensory stimulation. Progress in Brain Research 142, 257–271. Kooistra, C. A., & Heilman, K. M. (1989). Hemispatial visual inattention masquerading as hemianopia. Neurology, 39, 1125–1127. Lackner, J. R. (1988). Some proprioceptive influences on the perceptual representation of body shape and orientation. Brain, 111, 281–297.
c15.indd Sec6:335
Lloyd, D. M., Shore, D. I., Spence, C., & Calvert, G. A. (2003). Multisensory representation of limb position in human premotor cortex. Nature Neuroscience, 6, 17–18.
Paillard, J. (1971). Les determinants moteurs de l’organisation de l’espace. Cahiers de Psycologie, 14, 261–316. Paterson, A., & Zangwill, O. L. (1944). Disorders of visual space perception associated with lesions of the right cerebral hemisphere. Brain, 67, 331–335.
8/18/09 5:27:22 PM
336
Personal and extrapersonal Spatial Perception
Pavani, F., Spence, C., & Driver, J. (2000). Visual capture of touch: Out-ofthe-body experiences with rubber gloves. Psychological Science, 11, 353–359. Pegna, A. J., Petit, L., Caldara-Schnetzer, A. S., Khateb, A., Annoni, J. M., Sztajzel, R., et al. (2001). So near yet so far: Neglect in far or near space depends on tool use. Annals of Neurology, 50, 820–822. Pitzalis, S., & Di Russo, F. (2001). Spatial anisotropy of saccadic latency in normal subjects and brain-damaged patients. Cortex, 37, 475–492. Pitzalis, S., Di Russo, F., Spinelli, D., & Zoccolotti, P. (2001). Influence of the radial and vertical dimensions on lateral neglect. Experimental Brain Research, 136, 281–294. Previc, F. H. (1998). The neuropsychology of 3-D space. Psychological Bulletin, 2, 123–164. Rapcsak, S. Z., Cimino, C. R., & Heilman, K. M. (1988). Altitudinal neglect. Neurology, 38, 277–281. Rizzo, M., & Vecera, S. P. (2002). Psychoanatomical substrates of Balint’s syndrome. Journal of Neurology, Neurosurgery, and Psychiatry, 72, 162–178. Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1997, July 11). The space around us. Science, 277, 190–191. Rizzolatti, G., Luppino, G., & Matelli, M. (1998). The organization of the cortical motor system: New concepts. Electroencephalography and Clinical Neurophysiology, 106, 283–296. Rizzolatti, G., & Matelli, M. (2003). Two different streams form the dorsal visual system: Anatomy and functions. Experimental Brain Research, 153, 146–157. Rizzolatti, G., Matelli, M., & Pavesi, G. (1983). Deficits in attention and movement following the removal of postarcuate (area 6) and prearcuate (area 8) cortex in macaque monkeys. Brain, 106, 655–673. Rizzolatti, G., Scandolara, C., Matelli, M., & Gentilucci, M. (1981). Afferent properties of periarcuate neurons in macaque monkeys: Pt. I. Somatosensory responses. Behavioral Brain Research, 2, 125–146. Rossetti, Y., & Rode, G. (2002). 
Reducing spatial neglect by visual and other sensory manipulations: Non-cognitive (physiological) routes to the rehabilitation of a cognitive disorder. In H.-O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 375–396). Oxford: Oxford University Press. Schieber, M. H. (2000). Inactivation of the ventral premotor cortex biases the laterality of motoric choices. Experimental Brain Research, 130, 497–507. Shelton, P. A., Bowers, D., & Heilman, K. M. (1990). Peripersonal and vertical neglect. Brain, 113, 191–205. Smania, N., & Aglioti, S. (1995). Sensory and spatial components of somaesthetic deficits following right brain damage. Neurology, 45, 1725–1730. Spence, C., & Driver, J. (1997). Audiovisual links in exogenous covert spatial orienting. Perception and Psychophysics, 59, 1–22. Spence, C., & Driver, J. (Eds.). (2004). Crossmodal space and crossmodal attention. Oxford: Oxford University Press. Spence, C., Kingstone, A., Shore, D. I., & Gazzaniga, M. S. (2001). Representation of visuotactile space in the split brain. Psychological Science, 12, 90–93. Spence, C., Pavani, F., Maravita, A., & Holmes, N. (2004). Multisensory contributions to the 3-D representation of visuotactile peripersonal space in humans: Evidence from the crossmodal congruency task. Journal of Physiology Paris, 98, 171–189.
c15.indd Sec6:336
Spence, C., Pavani, F., Maravita, A., & Holmes, G. (2008). Multisensory contributions to the representation of peripersonal space in humans: Evidence from the crossmodal congruency task. In M. Lin & M. Otaduy (Eds.), Haptic rendering: Foundations, algorithms, and applications (pp. 21–52). Wellesley, MA: AK Peters. Stein, B. E. (2005). The development of a dialogue between cortex and midbrain to integrate multisensory information. Experimental Brain Research, 166, 305–315. Stein, B. E., & Meredith, M. A. (1993). The merging of the senses. Cambridge, MA: MIT Press. Valenza, N., Murray, M. M., Ptak, R., & Vuilleumier, P. (2004). The space of senses: Impaired crossmodal interactions in a patient with Balint syndrome after bilateral parietal damage. Neuropsychologia, 42, 1737–1748. Vallar, G. (2000). The methodological foundations of human neuropsychology: Studies in brain-damaged patients. In F. Boller, J. Grafman, & G. Rizzolatti (Eds.), Handbook of neuropsychology (2nd ed., Vol. 1, pp. 305–344). Amsterdam: Elsevier. Vallar, G. (2001). Extrapersonal visual unilateral spatial neglect and its neuroanatomy. Neuroimage, 14, S52–S58. Vallar, G. (2003). Spatial disorders. In L. Nadel (Ed.), Encyclopedia of cognitive science (Vol. 4, pp. 125–131). London: Macmillan Reference. Vallar, G. (2007a). A hemispheric asymmetry in somatosensory processing. Behavioral and Brain Sciences, 30, 223–224. Vallar, G. (2007b). Spatial neglect, Balint-Homes’ and Gerstmann’s syndrome, and other spatial disorders. CNS Spectrum, 12, 527–536. Vallar, G., Guariglia, C., Nico, D., & Bisiach, E. (1995). Spatial hemineglect in back space. Brain, 118, 467–472. Vallar, G., Guariglia, C., & Rusconi, M. L. (1997). Modulation of the neglect syndrome by sensory stimulation. In P. Thier & H.-O. Karnath (Eds.), Parietal lobe contributions to orientation in 3D space (pp. 555578). Heidelberg: Springer-Verlag. Vallar, G., Lobel, E., Galati, G., Berthoz, A., Pizzamiglio, L., & Le Bihan, D. (1999). 
A fronto-parietal system for computing the egocentric spatial frame of reference in humans. Experimental Brain Research, 124, 281–286. Vallar, G., & Papagno, C. (2003). Pierre Bonnier ’s. (1905). Cases of bodily “aschématie.” In C. Code, C.-W. Wallesch, Y. Joanette, & A. R. Lecours (Eds.), Classic cases in neuropsychology (Vol. 2, pp. 147–170). Hove, East Sussex: Psychology Press. Wallace, M. T., & Stein, B. E. (1994). Cross-modal synthesis in the midbrain depends on input from cortex. Journal of Neurophysiology, 71, 429–432. Wardak, C., Olivier, E., & Duhamel, J.-R. (2002). Neglect in monkeys: Effect of permanent and reversible lesions. In H. O. Karnath, A. D. Milner, & G. Vallar (Eds.), The cognitive and neural bases of spatial neglect (pp. 47–58). Oxford: Oxford University Press. Weiss, P. H., Marshall, J. C., Wunderlich, G., Tellmann, L., Halligan, P. W., Freund, H. J., et al. (2000). Neural consequences of acting in near versus far space: A physiological basis for clinical dissociations. Brain, 123, 2531–2541. Weiss, P. H., Marshall, J. C., Zilles, K., & Fink, G. R. (2003). Are action and perception in near and far space additive or interactive factors? NeuroImage, 18, 837–846. Yamamoto, S., & Kitazawa, S. (2001). Reversal of subjective temporal order due to arm crossing. Nature Neuroscience, 4, 759–765.
8/18/09 5:27:23 PM
Chapter 16
The Mirror Neuron System
GIACOMO RIZZOLATTI AND MADDALENA FABBRI-DESTRO
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.

In the first part of the chapter, we review the functional organization of the mirror neuron system in the monkey; in the second part, we examine the mirror neuron system of humans. The distinction between the monkey and human mirror neuron systems is of great theoretical importance because, although the basic neural mechanism is the same in the two species, some of the properties of the human mirror neuron system are not present in the monkey. This difference has often gone unrecognized, with the properties of the monkey mirror neuron system being uncritically attributed to humans. This has sometimes led to wrong conclusions about the possible explanatory role of the mirror neuron system in some typically human cognitive functions.

MIRROR NEURON SYSTEM IN MONKEYS

Mirror neurons are a distinct class of motor neurons that discharge both when individuals perform a specific motor act and when they observe the same motor act done by another individual (Figure 16.1). Mirror neurons were originally discovered in the rostral sector of the ventral premotor cortex (area F5). Subsequently, neurons with the same characteristics were also found in the monkey inferior parietal lobule. Figure 16.2 shows the lateral view of the monkey cerebral cortex with the subdivisions of the parietal and premotor areas.

Figure 16.1 Visual and motor responses of a grasping mirror neuron recorded from area F5. Note: The neuron fires during the observation of grasping done by the experimenter (A) and during grasping done by the monkey (B). Time scale bar: 500 msec. From “Understanding Motor Events: A Neurophysiological Study,” by G. Di Pellegrino, L. Fadiga, L. Fogassi, V. Gallese, and G. Rizzolatti, 1992, Experimental Brain Research, 91, pp. 176–180. Reprinted with permission.

Figure 16.2 Mesial and lateral views of the macaque brain. Note: The figure shows the cytoarchitectonic parcellation of the frontal motor cortex (areas indicated with F followed by Arabic numbers) and of the parietal lobe (areas indicated with P and progressive letters). Areas buried within the intraparietal sulcus are shown in an unfolded view of the sulcus (right inset). AIP = anterior intraparietal area; AI = inferior arcuate sulcus; AS = superior arcuate sulcus; C = central sulcus; Ca = calcarine fissure; Cg = cingulate cortex; FEF = frontal eye field; IP = intraparietal sulcus; L = lateral sulcus; LIP = lateral intraparietal area; Lu = lunate sulcus; MIP = medial intraparietal area; P = principal sulcus; ST = superior temporal sulcus.

The most detailed available description of the properties of mirror neurons is that of area F5. These neurons, like all neurons of this area (Rizzolatti & Luppino, 2001), discharge when the monkey makes specific object-directed motor acts, such as grasping, tearing, holding, and, more rarely, bringing food to the mouth. Unlike another category of visuo-motor neurons of area F5 (“canonical neurons,” Murata et al., 1997), they do not fire in response to a simple presentation of objects, including food. The observation of intransitive actions, including mimed actions, is also ineffective (Di Pellegrino, Fadiga, Fogassi, Gallese, & Rizzolatti, 1992; Gallese, Fadiga, Fogassi, & Rizzolatti, 1996; Rizzolatti, Fadiga, Fogassi, & Gallese, 1996).

Mirror neurons show a close relationship between the motor acts they code and the observed motor acts they respond to. Using as the classification criterion the congruence between the executed and observed motor acts effective in triggering them, mirror neurons have been subdivided into two broad classes: strictly congruent and broadly congruent neurons (Gallese et al., 1996). Mirror neurons are defined as strictly congruent when the observed and executed effective motor acts are identical in terms of goal (e.g., grasping) and in terms of the way in which that goal is achieved (e.g., precision grip). In contrast, mirror neurons are defined as broadly congruent when there is a similarity, but not identity, between the observed and executed effective motor acts. Among the different types of broadly congruent neurons, the most common consists of neurons that become active during the execution of a specific motor act made by the monkey (e.g., grasping, holding, or manipulating), but respond visually to more than one motor act (e.g., manipulation and grasping).

In the first studies on mirror neurons, it was reported that these neurons do not discharge during the observation of goal-directed actions done using tools (Gallese et al., 1996; Rizzolatti et al., 1996). Subsequently it was shown, however, that following a relatively long period during which monkeys observed the experimenters perform actions using tools, some mirror neurons respond, although weakly, to this type of action as well (Rizzolatti & Arbib, 1998). More recently, Ferrari, Rozzi, and Fogassi (2005) reported that in a specific lateral sector of F5, there are neurons that discharge very vigorously to the observation of tool use (e.g., a stick or a pair of pliers). It is not clear whether these neurons, like those previously observed, acquired this property through prolonged action observation or are a specific set of neurons coding grasping even when done by nonbiological effectors.
Anatomical Organization of the Mirror Neuron System in the Monkey

The mirror neuron system of the monkey consists of two main nodes: area F5 and the inferior parietal lobule (IPL). Area F5 is not homogeneous. Cytoarchitectonically, it consists of three sectors: a sector lying on the cortical convexity (F5c), a sector located on the posterior bank of the arcuate sulcus dorsally (F5p), and a sector located on the posterior bank of the same sulcus ventrally (F5a) (Figure 16.3; Luppino, Belmalih, Borra, Gerbella, & Rozzi, 2005; Nelissen, Luppino, Vanduffel, Rizzolatti, & Orban, 2005). Mirror neurons appear to be located mostly in F5c, while canonical neurons (those responding to mere object presentation) have been most frequently found in F5p. No neurophysiological data exist on sector F5a (Rizzolatti & Luppino, 2001).

Figure 16.3 Areas forming monkey ventral premotor cortex (PMv). Note: Area F5 consists of three architectonic subareas: F5c, F5p, and F5a. F5p and F5a are located on the posterior bank of the arcuate sulcus. The large rectangle showing PMv areas is an enlarged projection of the rectangle indicated on the lateral view of the monkey brain. AIP = anterior intraparietal area; AI = inferior arcuate sulcus; AS = superior arcuate sulcus; C = central sulcus; Ca = calcarine fissure; Cg = cingulate cortex; F5c = F5 convexity; F5p = F5 posterior; F5a = F5 anterior; FEF = frontal eye field; IP = intraparietal sulcus; L = lateral sulcus; LIP = lateral intraparietal area; Lu = lunate sulcus; MIP = medial intraparietal area; P = principal sulcus; ST = superior temporal sulcus.

A functional magnetic resonance imaging (fMRI) study revealed an interesting functional difference between F5c and F5a. Monkeys, trained to fixate a point of light, were presented with video clips showing hand grasping actions in two main conditions. In one, they saw a full view of the agent performing the action, while in the other they saw only the hand grasping an object. The results showed that the full view of the grasping activated both F5c and F5a, while the view of the isolated grasping hand activated only F5a (Nelissen et al., 2005). It appears therefore that, while F5c requires an “embodied” situation to be activated, the more rostral F5a codes grasping in a more abstract way, becoming active also when only some of the elements normally constituting the grasping scene are present.

It is well known from the studies of Perrett and coworkers that neurons of the superior temporal sulcus (STS) region code motor acts performed by living individuals (Jellema, Baker, Wicker, & Perrett, 2000; Perrett et al., 1989). According to their data, neurons of the upper bank of STS respond to locomotion, axial movements, and movements of the head and eyes. Hand movements are represented essentially in the lower bank of STS (Perrett, Mistlin, Harries, & Chitty, 1990). fMRI data confirmed a representation of hand movements in the lower bank (Nelissen et al., in press). They showed, however, that STPm, an area located in the upper bank, also responds to the observation of hand grasping movements. It is very important, in the present context, to note that there is no evidence that the areas forming the STS region respond to active movements. This region cannot be considered, therefore, to be properly part of the mirror neuron system, but rather a sector of the visual system devoted, among other functions, to the description of hand actions.

As shown in Figure 16.2, the cortical convexity of the inferior parietal lobule (IPL) consists of four areas: PF, PFG, PG, and Opt (Pandya & Seltzer, 1982; for more recent data, see Gregoriou, Borra, Matelli, & Luppino, 2006). PF and PFG correspond essentially to area 7b of Vogt and Vogt (1919), while PG and Opt form area 7a. Evidence in support of the validity of this parcellation was provided by Rozzi et al. (2006), who showed that each of these four areas has a distinct connectivity pattern. The parcellation of the IPL convexity into four areas is also in accord with single neuron studies showing that the rostralmost sector, area PF, contains essentially mouth-related neurons, whereas PFG is rich in neurons related to hand actions, although intermixed with others firing in association with arm movements, and PG is mostly related to arm movements (Ferrari, Gregoriou, et al., 2009; Hyvarinen, 1982).

It is well known from the studies of Sakata, Taira, Murata, and Mine (1995) that area AIP plays a fundamental role in the visuo-motor transformation necessary for grasping objects. Mirror neurons had not previously been reported in this area. However, evidence from a recent, detailed study of the mirror properties in IPL indicates that there is indeed a population of AIP neurons endowed with mirror properties (Rozzi, Ferrari, Bonini, Rizzolatti, & Fogassi, 2008).

Functions of the Mirror Neurons in the Monkey

Before discussing the functions of the mirror neurons, it is important to define three terms at the basis of motor organization: movement, motor act, and action. Movement indicates a displacement of body parts. It does not include the concept of goal. Motor act defines a movement or, most commonly, a series of movements performed to reach a goal (e.g., grasping an object). Finally, motor action is a series of motor acts (e.g., reaching, grasping, bringing to the mouth) that allow individuals to fulfill their intention (e.g., eating).

Understanding the Goal of the Motor Acts

The most widely accepted hypothesis on the functional role of the mirror neurons is that they play a role in understanding the goal of the observed motor acts (Rizzolatti, Fogassi, & Gallese, 2001). The proposed mechanism is the following: Individuals know the outcome of their motor acts.
Thus, when the mirror neurons of an observing individual, which code a given motor act (e.g., grasping), discharge in response to the observation of that motor act (grasping) done by another individual, the observer understands its goal because that discharge corresponds to the one that occurs when the observer wants to achieve the same goal. What is the evidence in favor of such a role of the mirror neurons? Often the most direct way to establish the function of a neural system is to destroy it and look for deficits in the individual’s behavior. However, destroying the entire mirror neuron system could produce such a general cognitive deficit that discovering the specific function of the mirror neuron system would be impossible.
Figure 16.4 (Figure C.26 in color section) Neuron responding to action observation in full vision and in hidden condition. Note: The lower part of each panel illustrates schematically the experimenter’s action from the monkey’s vantage point: the experimenter’s hand starts from a fixed position, moves toward an object and grasps it (A and B), or mimes grasping (C and D). A and C: The monkey sees the whole action; B and D: The monkey sees only the initial part of the action. Note that in hidden conditions, the monkey knows whether the action is directed toward an object. The asterisk indicates the location of a stationary marker. In hidden conditions, the experimenter’s hand started to disappear from the monkey’s view when crossing the marker. The upper part of each panel shows the neuron responses and relative histograms recorded during the movement of the experimenter’s hand shown in the lower panel. Rasters and histograms are aligned with the moment when the hand crossed the marker (red markers in the rasters). Green markers in the rasters indicate the movement onset, blue markers indicate the movement end. Scale bars: 100 spk/s, 1 s. From “‘I Know What You Are Doing’: A Neurophysiological Study,” by M. A. Umiltà et al., 2001, Neuron, 32, 91–101. Reprinted with permission.
So a different strategy was adopted. To assess whether mirror neurons play a role in understanding motor acts, neurons’ responses were investigated when the monkeys could comprehend the meaning of a motor act without actually seeing it. If mirror neurons truly mediate understanding, their activity should reflect the meaning of the motor act rather than its visual features. Two series of experiments were carried out.

The first series tested whether mirror neurons could recognize actions merely from their sounds (Kohler et al., 2002). The activity of mirror neurons was recorded while a monkey was observing a motor act, such as ripping a piece of paper or breaking a peanut shell, that is normally accompanied by a distinctive sound. Then, the monkey was presented with the sound alone. Many mirror neurons that had responded to visual observation of acts accompanied by sounds also responded to the sound alone. These neurons were named “audio-visual” mirror neurons.

In the second series of experiments, it was hypothesized that if mirror neurons are involved in understanding an action, they should also discharge when the monkey does not actually see the action but has sufficient clues to create a mental representation of it. Thus, F5 mirror neurons were tested in two conditions (Umiltà et al., 2001). In one, the monkey was shown a fully visible motor act directed toward an object (“full vision” condition). In the other, the monkey saw the same act but with its final critical part hidden (“hidden” condition). The results showed that more than half of the F5 mirror neurons also discharged in the hidden condition (Figure 16.4).

These experiments strongly support the notion that the activity of mirror neurons underpins the understanding of motor acts. Even when comprehension of the motor act is possible only on a nonvisual basis, such as sound or mental representation, mirror neurons still discharge, signaling the meaning of the motor act.

Intention Understanding

Voluntary actions are the external manifestations of an internally generated intention to act. The problem of intention has traditionally been considered a philosophical problem. However, some recent neurophysiological experiments appear able to provide a neural substrate for some aspects of motor intention. This is true both for the intention of the acting person and for the understanding of the intentions of others.

An attempt was made to find out whether the intention behind an action is reflected in the initial motor acts of that action (Fogassi et al., 2005). For this purpose, monkeys were trained to grasp objects for two different goals.
In the first case, the monkey had to grasp an object in order to place it into a container; in the second, it had to grasp a piece of food to eat it. The initial motor acts, reaching and grasping, were identical in the two conditions, while the final goal of the two actions was different. The hypothesis tested was that the different intentions underlying the two actions would manifest themselves already at the start of the actions, when the monkey performed the motor acts common to them. Grasping neurons were therefore recorded from the IPL and their discharge studied in the two conditions mentioned. The results showed that two-thirds of IPL grasping neurons discharged with a different intensity according to the final goal of the action in which grasping was embedded (action-constrained neurons). Grasping in order to bring food to the mouth was the most represented motor act. Examples of action-constrained grasping neurons are shown in Figure 16.5.

This action-constrained organization is appropriate for providing fluidity to action execution. Neurons coding a given motor act are functionally tuned for a given action and therefore linked with specific sets of neurons coding the next motor acts. This link determines, therefore, the formation of motor chains that lead to the final goal of the action. These motor chains appear to represent the neural substrate for the motor intention of the agent.

Figure 16.5 (Figure C.27 in color section) Activity of neurons recorded from the inferior parietal lobule (IPL) during object grasping. Note: (Upper) Apparatus and the paradigm used for the motor task. (Lower) The activity of three IPL neurons during grasping in the conditions “grasp to eat” (2b) and “grasp to place” (2a). Rasters and histograms are synchronized with the moment when the monkey touched the object to be grasped. From “Parietal Lobe: From Action Organization to Intention Understanding,” by L. Fogassi et al., 2005, Science, 29, pp. 662–667. Reprinted with permission.

Figure 16.6 Activity of neurons recorded from the inferior parietal lobule (IPL) during observation of object grasping (visual responses of mirror neurons). Note: The experimenter grasps food either to eat it or to put it into a container. Conventions as in Figure 16.5. From “Parietal Lobe: From Action Organization to Intention Understanding,” by L. Fogassi et al., 2005, Science, 29, pp. 662–667. Reprinted with permission.
Many of the grasping neurons tuned for a specific action also have mirror properties, responding to the observation of actions done by others (Fogassi et al., 2005). To find out whether the visual responses of these neurons were also influenced by the actions in which the motor acts were embedded, the same two conditions that were used for studying their motor properties were used. Monkeys, instead of grasping objects, observed the experimenter performing the two actions. The results showed that the majority of IPL mirror neurons were differently activated when the observed motor act belonged to one action or another (Figure 16.6). What could be the explanation of this behavior? If we examine the motor behavior of action-constrained grasping neurons, it is frequently found that the neuron’s discharge continues if grasping is inserted in the appropriate action, whereas it stops abruptly in the nonappropriate action. This prolonged discharge suggests activation of neurons coding the next step of the executed action. In agreement with this interpretation are data on the receptive field properties of parietal neurons showing that motor acts performed actively, or even passive movements, activate neurons that code the next motor act in an action sequence (Obayashi et al., 2004). Thus, it is very likely that when an action-constrained grasping neuron is activated by the observation of a grasping motor act inserted into its motor action, it triggers the whole motor chain in the observer, who, in this way, has an internal representation of the action that the agent intends to do. Thanks to this mechanism, the observer understands the intention of the agent.
How can action observation activate the appropriate motor chain when the monkey actually sees only its first motor act? A systematic study of this problem has not been done. It is clear, however, from the grasping neuron behavior that an important factor in determining the neuron discharge is the type of stimulus that the agent interacts with. Food, for example, tends to activate eating chains as soon as the monkey sees the experimenter grasping it. Another factor is the statistical probability of a given action. Thus, for example, in a block of trials in which grasping is always followed by placing, grasping neurons that are tuned for placing become active. It is interesting to note that in such a block of trials, if food, rather than an object, is grasped and placed into a container, grasping-to-eat neurons fire initially, then they stop firing while grasping-to-place neurons become active.
Inter-Individual Communication
Some mirror neurons located in area F5 become active when the monkey observes and executes mouth actions (Ferrari, Gallese, Rizzolatti, & Fogassi, 2003). Most of these mouth mirror neurons respond to the observation of ingestive actions such as biting, tearing with the teeth, sucking, licking, and so on (ingestive mouth mirror neurons). Their characteristics appear to be identical to those of hand mirror neurons. They do not respond to simple object presentation or to mouth-mimed actions and their visual response is often very specific for certain mouth acts. Like hand mirror neurons, mouth mirror neurons can also be subdivided into “strictly congruent” and “broadly congruent” neurons. In addition to ingestive mouth mirror neurons, F5 also contains a small set of mouth mirror neurons that discharge when the monkey performs ingestive movements and respond to the observation of mouth actions typical of the monkey communicative repertoire, such as lip-smacking, lip protrusion, or tongue protrusion, rather than to the observation of ingestive actions. These neurons have been, therefore, named communicative mouth mirror neurons. Examples of ingestive and communicative mouth mirror neurons are shown in Figure 16.7.
The presence of neurons that discharge during ingestive movement but prefer, as visual stimulus, a communicative action may seem to be in contrast with the typical visuomotor coupling observed in other mirror neurons. There is, however, an interesting possibility, namely, that this type of neuron represents a transition between ingestive and communicative actions. The view that communicative actions may derive from other, evolutionarily more ancient actions is not new. Van Hooff (1967), for example, in his fundamental work on the origin of monkey communicative gestures, proposed that many of the most common communicative monkey gestures, such as lip-smacking or lip protrusion, are ritualizations of ingestive actions that monkeys use for affiliative purposes. In a similar vein, MacNeilage (1998) suggested that human vocal communication derived from the cyclic, open-close mandibular alternation originally evolved for food ingestion. The existence of a neurophysiological link between ingestive and communicative actions is in accord with these hypotheses and provides neurophysiological support for them.
THE MIRROR NEURON SYSTEM IN HUMANS
The mirror neuron mechanism directly matches a sensory description of a motor act onto the motor representation of the same motor act. The mirror mechanism may have different functions that depend, first of all, on the anatomical localization of neurons endowed with mirror properties. On this basis, two mirror neuron systems can be distinguished: one located on the lateral surface of the cortex (parieto-frontal mirror system), the other in the insula and rostral cingulate (limbic mirror system).
The human parieto-frontal system (Figure 16.8) mediates the same basic functions that the homologous mirror system mediates in the monkey: the understanding of the goal of actions done by others and their intention. It also mediates additional functions unique to humans: imitation and verbal communication. The limbic mirror system has a different functional role. It mediates the comprehension of others’ emotions.
Anatomical Organization of the Parieto-Frontal Mirror Neuron System
A large number of brain imaging studies showed that parietal and frontal areas that become active during the execution of voluntary actions are also active when an individual observes similar actions done by others (see Rizzolatti & Craighero, 2004). These regions form the parieto-frontal human mirror neuron system. The two main nodes of this system are the inferior parietal lobule (IPL) and the ventral premotor cortex (PMv) plus the caudal part of the inferior frontal gyrus (IFG), roughly corresponding to the pars opercularis of Broca’s area. The localization of human parieto-frontal mirror neurons nicely corresponds to that of the homologous mirror neuron system in the monkey. The human and monkey mirror neuron systems, however, are not identical. The human mirror system is larger and includes cortical sectors that are poorly developed or apparently absent in the monkey. Particularly interesting among them are the angular gyrus and the ventral part of the supramarginal gyrus.
[Figure: rasters and histograms for units U076 and U087 in panels (A) to (D); conditions shown include the experimenter grasping food with the mouth, sucking from a syringe, lip-smacking, and protruding his lips, and the monkey grasping food with the mouth, sucking from a syringe, and protruding its lips to take food; scale bars 50 sp/s and 1 s.]
Figure 16.7 Activity of an ingestive and a communicative mirror neuron recorded from area F5. Note: (Left) Ingestive neuron. In each panel the rasters and the histograms represent the neuron response during a single experimental condition. The histogram represents the average of 10 trials. Rasters and histograms are aligned with the moment in which the mouth of the experimenter (observation conditions) or of the monkey (motor conditions) touched the food. (Right) Communicative neuron. Same conventions. During observation of communicative actions the rasters and histograms alignment was made with the moment in which the action was fully expressed. From “Mirror Neurons Responding to the Observation of Ingestive and Communicative Mouth Actions in the Monkey Ventral Premotor Cortex,” by P. F. Ferrari, V. Gallese, G. Rizzolatti, and L. Fogassi, 2003, European Journal of Neuroscience, 17, pp. 1703–1714. Reprinted with permission.
[Figure: lateral view of the human brain with numbered cortical areas; regions (A) and (B) are marked.]
Figure 16.8 View of the human brain showing the areas that form the mirror neuron system. Note: A: Frontal node; B: Parietal mirror neuron system node.
The observation of object-directed motor acts (transitive motor acts, like grasping an object) activates in humans the sector of the supramarginal gyrus that is located close to and within the intraparietal sulcus. This IPL sector, which in humans is relatively small, could correspond to a large sector of the monkey rostral IPL (areas PF, PFG, and AIP). In contrast, the observation of intransitive hand actions, be they symbolic, mimed or meaningless, activates essentially the angular gyrus (Lui et al., 2008). Note that intransitive hand actions do not belong to the monkeys’ motor repertoire. The observation of actions done with tools activates two parietal sectors: a region around the intraparietal sulcus (the same that also becomes active during the observation of transitive hand actions; Gazzola, Rizzolatti, Wicker, & Keysers, 2007; Orban et al., 2006) and a rostral part of the supramarginal gyrus (Orban et al., 2006). It has been suggested that these two sectors could underlie two different ways in which tool use is understood. The sector around the intraparietal sulcus could mediate an association between a tool and the tool use outcome, without an understanding of tool functioning. This association appears to be at the basis of the monkey’s capacity to learn tool use. In contrast, the rostral supramarginal gyrus could be involved in understanding the tool use in terms of tool mechanism. This capacity appears to be unique to humans (Johnson-Frey, 2004; Povinelli, 2000). As far as the frontal lobe is concerned, in addition to PMv, PMd has frequently been reported to be active in humans during the observation of actions done by others. This activation is especially strong in tasks in which the participants are subsequently required to perform the observed motor acts (Buccino et al., 2004; Grèzes, Armony, Rowe, & Passingham, 2003).
Although it is possible that PMd activation is due to a mirror mechanism, it may also be that its activation depends on a mental rehearsal of the impending actions that the observer is required to perform.
Functional Organization of the Parieto-Frontal Mirror Neuron System
Somatotopy
fMRI experiments showed that the human mirror neuron system has a somatotopic, albeit rather coarse, organization. Buccino et al. (2001) presented volunteers with video clips showing motor acts (grasping, biting, kicking) done with different effectors (mouth, arm/hand, leg/foot). The action could be object-directed (transitive actions) or mimed. The observation of transitive mouth movements produced activation of the lower part of PMv and of area 44 bilaterally. Activation foci were also found in the parietal lobe. One of them was located in the rostral part of the inferior parietal lobule (most likely area PF), while the other was located in the posterior part of the same lobule. The observation of mimed motor acts determined activation of the same premotor areas, but there was no parietal lobe activation. Observation of hand/arm transitive motor acts determined two areas of activation in the frontal lobe: one in area 44, and the other located in the precentral gyrus, dorsal to that found during the observation of mouth movements. There were again two activation foci in the parietal lobe. The rostral focus was, as in the case of mouth actions, in the rostral part of the inferior parietal lobule, but more caudally located, while the caudal focus was essentially in the same location as that for mouth actions. During the observation of mimed motor acts, the premotor activations were present, but not the parietal ones. Finally, the observation of transitive foot/leg motor acts determined an activation of a dorsal sector of the precentral gyrus and an activation of the posterior parietal lobe, partly overlapping with those seen during mouth and hand actions, partly extending more dorsally. Mimed foot motor acts produced premotor, but not parietal activations.
The responses of the premotor cortex (PM) to the presentation of intransitive movements were investigated by Sakreida, Schubotz, Wolfensteller, and von Cramon (2005) in an fMRI study. Distal, proximal, and axial motions were studied. The results showed an extended PM activation for each type of movement. Direct contrasts showed that the most significant activations were elicited in the PM ventrally for distal movements and dorsally for proximal movements. Axial movements activated the supplementary motor area. Wheaton, Thompson, Syngeniotis, Abbott, and Puce (2004) also found a somatotopic organization for intransitive movements in PM, but it was limited to the right hemisphere.
Motor Acts Coded by the Mirror Neuron System
There is clear evidence that the observation of motor acts that are richly represented in the observer’s motor repertoire determines a strong activation of the mirror neuron system. In an fMRI study, Calvo-Merino, Glaser, Grèzes, Passingham, and Haggard (2005) demonstrated that the observation of actions performed by others results in different cortical activations depending on the specific motor competencies of the tested individuals. The participants, who included classical dancers, teachers of Capoeira, and people who had never taken a dancing lesson, were shown a video of Capoeira dance steps. The observation of the dance steps caused a greater activation of the mirror neuron system in the teachers than in either the classical dancers or the beginners. Conversely, the observation of classical dance steps resulted in a much stronger activation of the classical dancers’ mirror neurons compared to those of the Capoeira teachers and the beginners. In a later experiment, the same researchers asked whether the differences in activation arose because the Capoeira teachers had greater visual experience of their dance steps or because they knew how to execute them, compared to the classical dancers (Calvo-Merino, Grèzes, Glaser, Passingham, & Haggard, 2006). In Capoeira, some steps are executed by both men and women, while others are different for the two sexes; obviously all the dancers, men and women, must know the steps that their partner will execute. Calvo-Merino and her colleagues showed the Capoeira teachers a video of dance steps executed by men and women. The results showed that the mirror neuron system was activated more strongly by the sight of the dance steps executed by members of the observer’s sex, indicating that the activation was regulated by motor practice and not by visual experience (see Figure 16.9).
The data by Calvo-Merino were extended by Cross, de Hamilton, and Grafton (2006) in a study in which expert dancers learned and rehearsed novel, complex whole-body dance sequences for 5 weeks. Brain activity was recorded weekly by fMRI as dancers observed and visualized performing different movement sequences. Half of these sequences were rehearsed and half were unpracticed control movements. The hypothesis was that the activity in premotor areas would increase as participants observed and simulated movements that they had learned outside the scanner. When dancers observed and simulated another dancer’s movements, brain regions classically associated with action observation were active, including STS, IPL, and PMv. Critically, IPL and PMv activity was modulated as a function of dancers’ ratings of their own ability to perform the observed movements and their motor experience.
These data show that the premotor and parietal mirror neuron system contributes to coding the observed actions by mapping them onto corresponding motor programs of the observer. But how would the mirror neuron system respond to the observation of hand actions if the observer never had hands or arms? Would it not show activations because the observer lacks motor programs for hand action, or would it show them because the observer has motor programs for the foot or mouth that have corresponding goals? Two aplasic individuals, born without arms or hands, were scanned while they observed hand actions. The results showed activations in the parieto-frontal circuit of the aplasic individuals while watching hand actions (Gazzola et al., 2007). This finding demonstrates the brain’s capacity to mirror actions that deviate from the typical motor organization by recruiting brain cortical representations involved in the execution of actions that achieve corresponding goals using different effectors.
[Figure: bar charts (A) and (B) of % signal change for three groups: ballet dancers, Capoeira dancers, and controls.]
Figure 16.9 Influence of motor expertise on action observation. Note: Signal changes in the central voxels of the frontal (a) and parietal (b) mirror neuron system nodes during the observation of classical ballet and Capoeira dancers. Parameter estimates show that the effect of expertise is driven by a crossover interaction between the two groups of expert dancers and the two stimulus types. Black bars reflect parameter estimates for ballet stimulus and white bars reflect Capoeira stimulus. From “Action Observation and Acquired Motor Skills: An fMRI Study with Expert Dancers,” by B. Calvo-Merino, D. E. Glaser, J. Grèzes, R. E. Passingham, & P. Haggard, 2005, Cerebral Cortex, 15, pp. 1243–1249. Reprinted with permission.
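The crossover interaction mentioned in the note to Figure 16.9 can be made concrete with a small calculation. The following Python sketch is only an illustration of the arithmetic and is not the analysis used by Calvo-Merino and colleagues: the function name `crossover_interaction`, the data layout, and every numerical value are hypothetical and were invented for this example.

```python
from statistics import mean

# Hypothetical % signal change values, invented purely for illustration.
# Layout: responses[observer group][stimulus type] -> per-participant values.
responses = {
    "ballet":   {"ballet": [0.6, 0.7, 0.8], "capoeira": [0.2, 0.3, 0.4]},
    "capoeira": {"ballet": [0.2, 0.3, 0.4], "capoeira": [0.6, 0.7, 0.8]},
}


def crossover_interaction(data):
    """Return both simple effects and the interaction contrast of a 2 x 2 design.

    Each simple effect is (mean response to ballet stimuli) minus
    (mean response to Capoeira stimuli) within one group of experts.
    The interaction contrast is the difference between the two simple effects.
    """
    effect_ballet_group = (
        mean(data["ballet"]["ballet"]) - mean(data["ballet"]["capoeira"])
    )
    effect_capoeira_group = (
        mean(data["capoeira"]["ballet"]) - mean(data["capoeira"]["capoeira"])
    )
    interaction = effect_ballet_group - effect_capoeira_group
    # A crossover means the stimulus preference reverses sign between groups.
    is_crossover = effect_ballet_group * effect_capoeira_group < 0
    return effect_ballet_group, effect_capoeira_group, interaction, is_crossover


effect_b, effect_c, interaction, is_crossover = crossover_interaction(responses)
print(f"ballet experts: {effect_b:+.2f}, Capoeira experts: {effect_c:+.2f}")
print(f"interaction contrast: {interaction:+.2f}, crossover: {is_crossover}")
```

With these made-up values each expert group responds more strongly to its own dance style, so the two simple effects have opposite signs. A real analysis would test the contrast statistically across participants rather than simply compare means.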
An fMRI experiment provided strong evidence of the importance of internal representation of actions to the understanding of actions performed by other individuals. Motor acts done by a human, a monkey, and a dog were presented to normal volunteers. Two types of actions were shown: biting and oral communicative actions (speech reading, lip-smacking, barking). As a control, static images of the same actions were presented (Buccino et al., 2004).
The results showed that the observation of biting, regardless of who performed the actions, determined bilateral activations in the inferior parietal lobule and in the pars opercularis of the IFG plus the adjacent precentral gyrus. The left activations were virtually identical for all three species, while the right activations were stronger during the observation of actions done by a human being than by an individual of another species. Markedly different results were obtained with communicative actions. Speech reading activated the left pars opercularis of IFG; observation of lip-smacking, a monkey communicative gesture, activated a small focus in the right and left pars opercularis of IFG; observation of barking did not produce any frontal lobe activation.
Does this mean that we are unable to understand the movements of a dog barking and distinguish them from those it makes when it bites food? The answer is no. This finding depends on two different comprehension modalities; the first is based predominantly on visual information, while the second is visuo-motor in nature. When we observe a dog barking, our comprehension of this act appears to be linked principally to the activation of the areas localized in the superior temporal sulcus (STS). Higher order visual areas also become active when biting is being observed, but in that case the information coming from them activates the potential motor acts codified in the mirror neuron system, which therefore allows immediate comprehension in “first person” of the meaning of the acts that are being observed. This comprehension gives an internal “personal knowledge” (see Merleau-Ponty, 1962) that is lacking in the case of barking.
Intention Understanding
The sight of motor acts done by others produces, in the observer, the activation of cortical motor areas involved in the organization of the observed motor acts. This activation is at the basis of motor act understanding.
Recent experiments showed that, besides the understanding of motor acts, the mirror neuron system is also involved in understanding the intention behind the observed motor acts. Evidence in this sense has been provided by an fMRI study (Iacoboni et al., 2005). In this study, there were three conditions: In the first one (called “context”), the volunteers saw some objects (a teapot, a mug, a plate with some food on it) arranged as if a person were ready to drink the tea or arranged as if a person had just finished having his or her breakfast; in the second condition (called “action”), the volunteers were shown a hand that grasped a mug without any context; in the third (called “intention”), the volunteers saw the same hand action within the two contexts, before and after breakfast. The context and the different grip shapes suggested the intention of the agent, that is, grasping the cup for drinking or grasping it for cleaning the table.
The results showed that in both action and intention conditions there was an activation of the mirror neuron system. The comparison between intention and action conditions was crucial. This comparison showed that the understanding of the intention of the doer determined a marked increase in activity of the mirror neuron system. The observation of bringing the cup to the mouth in order to drink produced a stronger activation than the observation of grasping done in order to clean the table. This result is somewhat similar to monkey findings (see earlier section) showing that the number of neurons coding grasping for bringing to the mouth largely exceeds the number of neurons coding grasping for putting an object into a container. In conclusion, these data show that the intentions behind the actions of others can be recognized by the mirror neuron mechanism. This does not imply that other more cognitive ways of “reading minds” do not exist (see Frith & Frith, 2007). However, the mirror neuron mechanism is most likely the basic neural mechanism from which other aspects of mind reading evolved.
More recently, an fMRI study investigated the neural basis of the human capacity to differentiate between actions reflecting the intention of the agent (intended actions) and actions that did not reflect it (nonintended actions). Volunteers were presented with video clips showing a large number of actions done with different effectors, each in a double version: one in which the actor achieved the purpose of his or her action (e.g., pour the wine), the other in which the actor performed a similar action but failed to reach its goal because of a motor slip or a clumsy movement (e.g., spilled the wine; Buccino et al., 2007). The results showed that both types of actions activated the mirror neuron system. The direct contrast nonintended versus intended actions showed activation in the right temporo-parietal junction, left supramarginal gyrus, and mesial prefrontal cortex.
The converse contrast did not show any activation. It was concluded that the capacity to understand when an action is nonintended is based on the activation of attention areas signaling unexpected events in spatial and temporal domains (Corbetta & Shulman, 2002; Coull, 2004). These results indicate that when an individual observes an unexpected action, such as a motor slip, his or her cortical machinery does not try to simulate it, but rather signals the strangeness of the event.
Imitation
The term imitation has a number of different and sometimes contrasting meanings, depending on the branch of research examined (developmental psychology, comparative psychology, ethology, etc.; see Hurley & Chater, 2005). Here we narrow the field of possible definitions to two that best reflect the mechanisms possibly related to the mirror neuron system (see Rizzolatti, 2005). The first, which is used mainly by experimental psychologists, defines imitation as the capacity of an individual to replicate an act that belongs to his or her motor repertoire after having seen it executed by others; the second, derived from ethology, defines imitation as the capacity to acquire by observation a motor behavior previously not present in the observer’s motor repertoire and to repeat it using the same movements employed by the teacher (Tomasello & Call, 1997).
Repeating Motor Acts Present in the Observer’s Motor Repertoire
Imitation as a replica of a motor act already present in the observer’s motor repertoire has been extensively investigated by Prinz and his coworkers (see Prinz, 2002). They established that the more a motor act resembles one that is present in the observer’s motor repertoire, the greater the tendency to do it. Perception and execution must therefore possess a “common representational domain.” The discovery of mirror neurons suggested a possible reformulation of this concept by considering the “common representational domain” not as an abstract, amodal domain (Prinz, 1987), but rather as a motor mechanism directly activated by the observed actions.
Here there is, however, a problem. Monkey data showed that mirror neurons do not respond to the observation of intransitive gestures. They discharge only when the observed motor act has a goal. In contrast, in most human experiments in which imitation was studied, the participants copied simple movements. How can mirror neurons underlie this type of imitation? The answer is rather simple: Unlike that of monkeys, the human mirror system is able to code, besides goal-directed motor acts, also intransitive meaningless gestures. A large number of experiments prove this point (e.g., Fadiga, Fogassi, Pavesi, & Rizzolatti, 1995; Gangitano, Mottaghy, & Pascual-Leone, 2001; Maeda, Kleiner-Fisman, & Pascual-Leone, 2002; Strafella & Paus, 2000). Fadiga et al. (1995) recorded motor evoked potentials (MEPs) from the right hand/arm muscles elicited by transcranial magnetic stimulation (TMS) of the left motor cortex. Volunteers were required to observe an experimenter grasping objects (transitive hand motor acts) or performing meaningless arm gestures (intransitive arm movements). Detection of the dimming of a small spot of light was used as the control condition. The results showed that the observation of both transitive and intransitive actions determined an increase of the recorded MEPs. The increase concerned selectively those muscles that the volunteers used when producing the observed movements. A study by Gangitano et al. (2001) demonstrated this property of the human mirror neuron system. These authors showed not only that MEPs recorded from the hand muscles increase during grasping observation, but also that the response facilitation closely reflects the different grasping phases (Figure 16.10).
[Figure: (A) frames of the observed grasping sequence at 500, 1,000, 2,000, 3,000, 3,500, and 5,500 ms; (B) MEPs area (mV/ms) of the FDI muscle plotted against stimulation times (ms).]
Figure 16.10 Modulation of the motor cortex excitability during grasping observation. Note: A: Schematization of events sequence during a single grasping trial. B: Averaged values of motor evoked potentials (MEPs) of a hand muscle (first dorsal interosseus) collected at different times during the observation of grasping movements. 500 ms, hand at the starting position (time value refers to the onset of the video clip showing the action); 3,000 ms, hand maximum aperture. From “Phase Specific Modulation of Cortical Motor Output during Movement Observation,” by M. Gangitano, F. M. Mottaghy, and A. Pascual-Leone, 2001, NeuroReport, 12, pp. 1489–1492. Reprinted with permission.
The human mirror neuron system has the potential to imitate intransitive as well as goal-directed motor acts. Direct evidence that the mirror neuron system is involved in imitation was provided by an fMRI study (Iacoboni et al., 1999). These authors studied normal human volunteers in two conditions: “observation-only” and “observation-execution.” In the “observation-only” condition, subjects were shown a moving finger, a cross on a stationary finger, or a cross on an empty background. The instruction was to observe the stimuli. In the “observation-execution” condition, the same stimuli were presented, but, this time, the instruction was to lift the right finger, as fast as possible, in response to them. The crucial contrast was between the trials in which the volunteers made the movement in response to an observed action (“imitation”) and the trials in which the movement was triggered by the cross (a nonimitative behavior). The results showed that the activation of the mirror neuron system was stronger during “imitation.” Similar results were subsequently obtained by Koski, Wohlschlager, Bekkering, Woods, and Dubeau (2002) and Grèzes et al. (2003). The issue of imitation was also addressed by Nishitani and Hari (2000) using magnetoencephalography (MEG), a technique that has a relatively poor spatial resolution, but an excellent temporal resolution. Participants were asked (A) to grasp an object with their right hand, (B) to observe this action being done by the experimenter, and (C) to observe and replicate the seen action. The results showed that in (A) the left inferior frontal cortex (area 44; that is, the frontal node of the mirror neuron system) became active first, the left primary motor cortex activation following it by 100 to 200 ms. During observation and imitation, the sequence of activations was similar, but beginning in the visual areas. The strongest activation was found during action imitation. 
Similar results were obtained in a subsequent MEG study by the same experimenters, in which volunteers were asked (A) to observe a still picture of lip forms, (B) to imitate them online, or (C) to make similar forms in a self-paced manner. Figure 16.11 illustrates the cortical activations in the three conditions. Further evidence that the mirror neuron system plays a crucial role in imitation was found using repetitive TMS (rTMS). In a group of volunteers, the caudal part of the left inferior frontal gyrus (Broca’s area) was stimulated while they (a) pressed keys on a keyboard; (b) pressed the keys in response to a point of red light which, directed onto the keyboard, indicated which key to press; and (c) imitated a similar movement executed by another individual. The data showed that rTMS lowered the participants’ performance during imitation, but not during the other two tasks (Heiser, Iacoboni, Maeda, Marcus, & Mazziotta, 2003). Summing up, this experiment as well as fMRI and MEG data clearly show that the mirror neuron system plays a fundamental role in imitation. It transforms visual information into potential movements and motor acts, enabling the observer to repeat immediately the observed motor behavior.

Figure 16.11 Cortical activations during the observation, imitation, and execution of lip forms. Note: The main source locations during observation (circles), imitation (triangles), and execution (squares), superimposed on an MRI brain. Source labels in the original figure: Occ, ST, IP, IF, and M1, shown for the left and right hemispheres. From “Viewing Lip Forms: Cortical Dynamics,” by N. Nishitani and R. Hari, 2002, Neuron, 36, pp. 1211–1220. Reprinted with permission.

Imitation Learning

In the preceding section, we discussed imitation as a motor copy of an observed motor act; here we discuss whether mirror neurons are also involved in imitation learning. Byrne (2002) proposed an interesting model based on ethological studies of ape behavior. According to this model, learning by imitation results from the integration of two distinct processes: in the first, the observer segments the action to be imitated into its individual elements, thus converting it into a string of acts belonging to the observer’s motor repertoire; in the second, the observer organizes these motor acts into a sequence that replicates that of the demonstrator. It is likely that a similar process is also at the basis of learning nonsequential motor patterns, such as notes played on a piano or a guitar. The neural basis of imitation learning was investigated in an fMRI study by Buccino, Vogt, et al. (2004). Naive participants were asked to imitate guitar chords played by an expert guitarist. Cortical activations were mapped during the following events: (a) observation of the chords made by the expert player, (b) pause, (c) execution of the observed chords, and (d) rest. In addition to the imitation condition, there were other conditions to control for observation not followed by imitation and for nonimitative motor activity. The results showed that during observation for imitation there was activation of a cortical network formed by IPL and the dorsal part of PMv, plus the pars opercularis of IFG (Figure 16.12, IMI-1). This circuit was also active during observation in the control conditions in which participants merely observed the chords, or observed them with the instruction to subsequently perform an action not related to guitar chord execution (Figure 16.12, Non-IMI-1).
Figure 16.12 Cortical activations during learning by imitation. Note: Upper: Graphic illustration of the events forming the experimental conditions imitation (IMI; instruction: “Observe the model and imitate it”) and nonimitation (Non-IMI; instruction: “Observe the model and do not imitate it”). Both conditions consisted of four events preceded by the presentation of a cue (a square) whose color informed the participants of the task they had to perform. IMI condition: Event 1, observe the teacher’s hand playing the chord (IMI-1); Event 2, rehearse the observed chord (IMI-2); Event 3, replicate it; Event 4, keep the hand still. Non-IMI condition: Event 1, observe the teacher’s hand playing the chord (Non-IMI-1); Event 2, do not rehearse the observed chord (Non-IMI-2); Event 3, touch the neck of the guitar, without playing a chord (Non-IMI-3); Event 4, keep the hand still. Lower: Cortical areas activated during Events 1 and 2 in the IMI and Non-IMI conditions. From “Neural Circuits Underlying Imitation Learning of Hand Actions: An Event-Related fMRI Study,” by G. Buccino, S. Vogt et al., 2004, Neuron, 42, pp. 323–334. Adapted with permission.
During the pause, activation was found in the imitation condition in the same circuit as during observation, but, most interestingly, also in the middle frontal cortex (area 46) and in the anterior mesial cortex (Figure 16.12, IMI-2). Motor activations dominated the picture during chord execution. These data show that during new motor pattern formation there is a strong activation of the mirror neuron system. However, following Byrne (2002), the authors suggested a two-step mechanism for imitation learning. First, the observed actions are decomposed into elementary motor acts that activate, via mirror mechanisms, the corresponding motor representations in the parietal and frontal lobes. Once these motor representations are activated, they are recombined to fit the observed model. For this recombination, a crucial role is played by frontal area 46.
An fMRI study based on a similar experimental design, but carried out with expert and naive guitarists, confirmed the joint role of the mirror neuron areas and prefrontal lobe in imitation learning and, in particular, the fundamental role that area 46 plays in combining different motor acts into a new specific motor pattern (S. Vogt et al., 2007).

Emotions

Up to now we have discussed the neural mechanisms that enable individuals to understand “cold actions,” that is, actions without an obvious emotional content. In social life, however, equally important, and maybe even more so, is the capacity to decipher emotions. Which structures mediate the understanding of the emotions of others? Is there a mirror mechanism for emotions similar to that for cold action understanding? It is reasonable to postulate that, as for action understanding, there are also two different mechanisms for emotion understanding. The first consists of cognitive elaboration of sensory aspects of others’ emotional behaviors. The other consists of a direct mapping of sensory aspects of the observed emotional behavior onto the motor structures that determine, in the observer, the experience of the observed emotion. These two ways of recognizing emotions are experientially radically different. With the first, the observer understands the emotions expressed by others but does not feel them; he or she deduces them. A certain facial or body pattern means fear, another happiness. There is no emotional involvement. The case is different for the direct matching mechanism. Here, the recognition occurs because the observed emotion triggers the feeling of the same emotion in the observing person. It is a first-person recognition.
It is generally agreed that there is a series of emotions, often called primary emotions (e.g., fear, rage), that consist of a collection of responses that have been laid down during the course of evolution due to their original adaptive utility and that occur in the same form in different species and, in the case of humans, in different cultures. In this chapter, we focus on two of these emotions: disgust and pain. For both of them there are data obtained from the same individuals in two conditions: one in which they experienced disgust or pain evoked by an appropriate stimulus, and another in which they observed the expression of these emotions in others. Disgust is an emotion whose expression has an important survival value for conspecifics. In its most basic, primitive form (“core disgust,” Rozin, Haidt, & McCauley, 2000) it indicates that something (e.g., food) that the individual tastes or smells is bad and, most likely, dangerous. Because of its strong communicative value, disgust has been considered an ideal emotion for testing the direct matching hypothesis.
Brain imaging studies show that when an individual is exposed to disgusting odors or tastes, there is an intense activation of two structures: the amygdala and the insula (Augustine, 1996; Royet, Plailly, Delon-Martin, Kareken, & Segebarth, 2003; Small et al., 2003; Zald & Pardo, 2000). The insula is a complex structure. On the basis of its connections, the insula has been subdivided in the monkey into two main sectors: an anterior visceral sector and a posterior multimodal sector (Mesulam & Mufson, 1982). The anterior sector receives a rich input from olfactory and gustatory centers. In addition, it receives an important input from the inferotemporal lobe, a cortical region where faces are coded (Gross, 1992). The insula is not an exclusively sensory area. In both monkeys and humans, its electrical stimulation produces body movements typically accompanied by autonomic and viscero-motor responses (Krolak-Salmon et al., 2003; Penfield & Faulk, 1955; Showers & Lauer, 1961). Brain imaging studies show that, in humans, observation of faces showing disgust activates foci in the anterior insula sector (Phillips et al., 1998; Schienle et al., 2002; Sprengelmeyer, Rausch, Eysel, & Przuntek, 1998). Wicker et al. (2003) carried out an fMRI study to find out whether the insula sites that show activation during the experience of disgust also show activation during the observation of faces expressing disgust. In this study, volunteers were subjected to an fMRI experiment consisting of two sessions. In the first session, the participants were exposed to unpleasant and pleasant odors; in the second session, they watched a video showing the facial expression of people sniffing an unpleasant, a pleasant, or a neutral odor. The two structures that became active during the exposure to smells were the amygdala and the insula. The amygdala was activated by both unpleasant and pleasant odors.
Pleasant odorants produced a relatively weak activation located in a posterior part of the right insula, while disgusting odorants activated the anterior sector bilaterally. The results of observation showed activations in various cortical and subcortical centers, but not in the amygdala. The left anterior insula was activated only during the observation of disgust. The most important result of the study was the demonstration that precisely the same foci within the anterior insula that were activated by the exposure to disgusting odorants were also activated by the observation of disgust. These data strongly suggest that the insula contains neural populations that become active both when the participants experience disgust and when they see it in others. The findings we have just discussed appear to be valid not only for disgust but also for pain. Singer and coworkers (2004) conducted an fMRI experiment subdivided into two parts. First, the participants were subjected to a mildly painful electric shock from electrodes placed on their hands; second, they were asked to watch while the same electrodes were fastened to the hand of a loved one. They were told that the loved person would receive the same shock they had received earlier. As shown in Figure 16.13, sites of the anterior insula and of the cingulate cortex became active in both conditions, thus showing that both direct pain experience and its evocation are mediated by a mirror mechanism similar to that found for disgust (see also Chapter 48).

Figure 16.13 Activations found when pain was applied to self or to the partner. Note: A and B illustrate the results of a conjunction analysis between the contrasts pain-no pain in the context of self and other. Results are shown on sagittal (A) and coronal (B) sections of the mean structural scan. Coordinates refer to peak activations. Increased pain-related activation was observed in the anterior cingulate (ACC), anterior insula, cerebellum, and brainstem (colored area). From “Empathy for pain involves the affective but not sensory components of pain,” by T. Singer et al., 2004, Science, 303, pp. 1157–1162. Reprinted with permission.

The hypothesis that we recognize others’ emotions by activating structures mediating the same emotion in ourselves has been advanced by various authors (e.g., Calder, Keane, Manes, Antoun, & Young, 2000; Carr, Iacoboni, Dubeau, Mazziotta, & Lenzi, 2003; Damasio, 2003; Gallese, Keysers, & Rizzolatti, 2004; Goldman & Sripada, 2003; Phillips et al., 1997). Particularly influential in this respect has been the work by Damasio and his coworkers. According to their findings, mostly based on brain lesions, the neural basis of emotion understanding is the activation of an “as-if-loop,” the core structure of which is the insula (Adolphs, Tranel, & Damasio, 2003; Damasio, 2003). As already mentioned at the beginning of this section, the direct activation theory does not deny that we can recognize emotions indirectly, using cognition. However, without the activation of the observers’ emotional centers, emotions would be reduced, as William James (1890) remarked, to “a cold and neutral state of intellectual perception.”

Language

It might seem surprising that in a chapter devoted to the mirror neuron system there is a section on language. Language is traditionally conceived of as a system of communication based on sound with little or no involvement of the motor
system, except, of course, for speech production. However, sound-based languages are not the only natural way of communicating. Signed languages represent another complex, fully structured communication system. By using sign language, people express abstract concepts and learn mathematics, physics, philosophy, and even poetry (see Corballis, 2002). Nonetheless, the evidence that signed languages are fully structured communication systems has not modified the traditional view that speech is the only natural way in which humans communicate. If this is true, it logically follows that the evolutionary precursors of speech should be animal calls. A series of facts indicates, however, that this view is highly implausible. First, the brain structures underlying speech and animal calls are different. Animal calls are mediated primarily by deep, diencephalic, and brain stem structures and by the cingulate cortex (Jürgens, 2002). In contrast, human speech has its anatomical core in the perisylvian areas, including area 44, a basically motor area. Second, speech is not necessarily linked to an emotional behavior, whereas animal calls are. Third, speech is mostly a person-to-person communication. In contrast, animal calls are, typically, directed to “the community,” rather than to a specific individual. Fourth, speech is endowed with combinatorial properties that are absent in animal communication. It is recursive and virtually limitless with respect to its scope of expression. Fifth, and not least, humans do possess a “call” communication system like that of nonhuman primates, and its anatomical location is similar. This system mediates the utterances that humans emit when in particular emotional states (cries, yelling, etc.). These utterances, which are preserved in patients with global aphasia, lack the referential character and the combinatorial properties that characterize human speech.
The advocates of the sound-based theory of language origin often use as an argument in favor of their theory the presence of referential information in some animal calls. Following the famous study of the alarm calls of vervet monkeys (Cheney & Seyfarth, 1990), the capacity to convey referential information has been described in a large number of species including diana monkeys, baboons, and suricates (a South African mongoose). Evidence also has been provided that baboons are able to acquire sophisticated information from other individuals’ vocalizations. However, although animal vocalization may encode semantic as well as emotional information, callers do not intend to provide it. “Listeners acquire information as an inadvertent consequence of signaler behavior” (Seyfarth & Cheney, 2003).
What then could be the origin of human speech? An alternative hypothesis is that the path leading to speech started with gestural communication. This hypothesis, which was first proposed by the French philosopher Condillac (1947), has several defenders (e.g., Armstrong, Stokoe, & Wilcox, 1995; Corballis, 2002; Gentilucci & Corballis, 2006). According to it, the communication of modern humans’ ancestors consisted mostly of gesturing. Sounds conveying semantic information appeared later in evolution and were associated with gestures. The discovery of mirror neurons provided strong support for the gestural theory of speech origin. The mirror mechanism creates a direct link between the sender of a message and its receiver. Thanks to this mechanism, motor acts performed by an individual activate a similar motor copy in the observers, allowing them to understand the message directly. Because of this and the finding that the observation of motor acts (e.g., hand grasping) activates the caudal part of IFG (Broca’s area), Rizzolatti and Arbib (1998) proposed that the mirror mechanism is at the basis of language evolution. In fact, the mirror neuron mechanism can solve two fundamental communication problems: parity and direct comprehension. Thanks to the mirror neurons, what counts for the sender of the message also counts for the receiver. No arbitrary symbols are required. The comprehension is inherent in the neural organization of the two individuals. Note also that the activation of Broca’s area during motor act observation is not due to a verbal description of the observed action, but to a real coding of motor acts (for evidence on this point, see Fadiga et al., 2006). The mirror neuron system of monkeys is a closed system linked to objects, whereas, in order to communicate, this system must become an open system, able to describe actions and objects. How did the transition between these two systems occur?
It is likely that the great leap from an object-related mirror neuron system to a truly communicative mirror system occurred with the evolution of imitative abilities (Arbib, 2005) and the related capacity of mirror neurons to discharge in response to intransitive actions. In fact, true imitation (imitation in the ethological sense) implies not only the understanding of the purpose of the action to be imitated (a feature already present in the monkey mirror neuron system), but also the capacity to repeat the individual movements that constitute an action in the correct order (Rizzolatti, 2005; Tomasello & Call, 1997). The necessity to keep track of precise movements in order to imitate the motor behavior of others should have sharpened the mirror system, providing it with its capacity, present in modern humans, to convey information on the observed movements per se and not only on those leading to a goal. The mirror neuron system as a communication system has a great asset: Its semantics is inherent in the gestures
used to communicate. This is lacking in speech. In speech, or at least in modern speech, the meaning of the words and the phono-articulatory actions necessary to pronounce them are unrelated. This independence suggests that a necessary step in speech evolution was the transfer of gestural meaning, intrinsic to the gesture itself, to abstract sound meaning. From this follows a clear neurophysiological prediction: Hand/arm and speech gestures must be strictly linked and must, at least in part, share a common neural substrate. A number of studies indicate that this is the case. TMS experiments show that the excitability of the motor cortex hand representation increases during both reading and spontaneous speech (Meister et al., 2003; Seyal, Mull, Bhullar, Ahmad, & Gage, 1999; Tokimura, Tokimura, Oliviero, Asakura, & Rothwell, 1996). The effect is limited to the left hemisphere. Furthermore, no language-related effect is found in the leg motor area. Note that the increase of hand motor cortex excitability cannot be attributed to word articulation because, while word articulation recruits motor cortex bilaterally, the observed activation is strictly limited to the left hemisphere. The facilitation appears, therefore, to result from a co-activation of the dominant hand motor cortex and language centers (Meister et al., 2003). Gentilucci, Benuzzi, Gangitano, and Grimaldi (2001) reached similar conclusions using a different approach. In a series of behavioral experiments, they presented participants with two three-dimensional objects, one large, the other small. Participants were required to grasp the objects and to open their mouth. The kinematics of hand, arm, and mouth movements was recorded. The results showed that lip aperture and the peak velocity of lip aperture increased when the movement was directed to the large object. In another experiment of the same study, Gentilucci et al. (2001) asked participants to pronounce a syllable (e.g., GU, GA) instead of simply opening their mouth. They found that lip aperture was larger when the participants grasped a larger object. Furthermore, the maximal power of the voice spectrum recorded during syllable emission was also higher when the larger object was grasped. In a further study, Gentilucci (2003) asked volunteers to pronounce the syllables BA or GA while observing another individual grasping objects of different size. The kinematics of lip aperture and the amplitude spectrum of the voice were influenced by the grasping movements of the other individual. Specifically, both lip aperture and voice peak amplitude were greater when the action, done by another individual, was directed to larger objects. Control experiments ruled out that the effect was due to the size of the object or to the velocity of the observed arm movement. Taken together, these experiments show that hand gestures and mouth gestures are strictly linked in humans and
that this link includes the oro-laryngeal movements used for speech production.

Mirror Neuron System and Speech

The association between communicative gestures and specific sounds has obvious advantages, such as the possibility of communicating in the dark or when the hands are holding tools or weapons. Such advantages must have exerted a strong evolutionary pressure in favor of a communication system based on sounds. To achieve an efficient communication system, however, the emitted sounds must be clearly distinguishable by the listeners and, most importantly, maintain constant features. They must be pronounced in a very precise, consistent way. These pronunciation constraints require a sophisticated organization of the motor system dedicated to sound production and rich connections between the cortical motor areas controlling voluntary actions and the centers controlling the oro-laryngeal tract. The large expansion of the posterior part of the inferior frontal gyrus, culminating in the appearance of Broca’s area in the human left hemisphere, is most likely the result of the evolutionary pressure to achieve this voluntary control. In parallel with these modifications for emitting sounds, a system for understanding them should also have evolved. As already discussed, monkey area F5, the homologue of human area 44, contains neurons, the so-called audiovisual neurons (Kohler et al., 2002), that respond to the observation of actions done by others as well as to the sounds of those actions. This system, however, is tuned for recognition of the sound of physical events and not of sounds made by individuals for communication purposes. To understand speech sounds, a new mirror neuron system tuned to resonate in response to sounds emitted by the oro-laryngeal tract should have evolved. Is there evidence that humans have such a mirror neuron system? The answer is yes.
Fadiga, Craighero, Buccino, and Rizzolatti (2002) recorded MEPs from the tongue muscles in normal volunteers instructed to listen carefully to acoustically presented verbal and nonverbal stimuli. The stimuli were words, regular pseudo-words, and bi-tonal sounds. In the middle of words and pseudo-words, there was either a double “f” or a double “r.” “F” is a labio-dental consonant that, when pronounced, requires virtually no tongue movements, whereas “r” is a linguo-palatal fricative consonant that, in contrast, requires marked tongue muscle involvement to be pronounced. During the stimulus presentation, the left motor cortex of the participants was stimulated with single-pulse TMS. The results showed that listening to words and pseudo-words containing the double “r” produced a significant increase in the amplitude of MEPs recorded from tongue muscles compared to listening to bi-tonal sounds and to words and pseudo-words containing the double “f” (Figure 16.14).

Figure 16.14 Modulation of motor cortex excitability during presentation of verbal material. Note: Bars represent the total areas of motor-evoked potentials (MEPs), expressed as z-scores, recorded from tongue muscles during listening to words, pseudo-words, and bitonal sounds. “rr” and “ff” refer to verbal stimuli containing a double linguo-palatal fricative consonant “r” and a double labio-dental fricative consonant “f,” respectively. From “Speech Listening Specifically Modulates the Excitability of Tongue Muscles: A TMS Study,” by L. Fadiga, L. Craighero, G. Buccino, and G. Rizzolatti, 2002, European Journal of Neuroscience, 15, pp. 399–402. Reprinted with permission.

Results congruent with those of Fadiga et al. (2002) were obtained by Watkins, Strafella, and Paus (2003). Using the TMS technique, they recorded MEPs from a lip muscle (orbicularis oris) and a hand muscle (first interosseus) in four conditions: listening to continuous prose, viewing speech-related lip movements, listening to nonverbal sounds, and viewing eye and brow movements. Compared to viewing eye and brow movements, listening to and viewing speech enhanced the MEP amplitude recorded from the orbicularis oris muscle. All of these effects were seen only in response to stimulation of the left hemisphere. Speech is not purely a system based on sounds. As shown by Liberman (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Liberman & Mattingly, 1985; Liberman & Whalen, 2000), an efficient communication system cannot be built by substituting tones or combinations of tones for speech. There is something special about speech sounds that distinguishes them from other auditory material, and this is their capacity to evoke the motor representation of the heard sounds in the listener’s motor cortex. Note that this capacity, postulated by Liberman on the basis of indirect evidence, now has a precise neural correlate shown by the existence of a mirror neuron system specifically tuned to speech sounds. Does this motor resonance provide only a copy of the heard phoneme or does it intervene also in the understanding
of word meaning? A series of recent studies addressed this issue. In an EEG study, Pulvermueller (2001) assessed cortical activations while volunteers listened to face- and leg-related action verbs (e.g., talking versus walking). The study found that words describing leg actions evoked stronger in-going current at dorsal sites, close to the cortical leg area, whereas those of the talking type elicited the stronger currents at inferior sites, next to the motor representation of the face and mouth. In an fMRI experiment, Tettamanti et al. (2005) tested whether cortical areas active during action observation were also active during listening to action sentences. Sentences that describe actions performed with the mouth, hand/arm, and leg were used. The presentation of abstract sentences of comparable syntactic structure was used as a control condition. The results showed activations in the precentral gyrus and in the posterior part of IFG. The activations in the precentral gyrus, and especially those observed during listening to sentences describing hand actions, spatially corresponded to the activations found during the observation of the same actions. The activation of IFG was strong during listening to sentences describing mouth actions, but was also present during listening to sentences describing actions done with other effectors. It is likely, therefore, that the IFG contains, in addition to a representation of mouth actions, a more general representation of action verbs. The relation between premotor responses to action observation and to phrases describing actions has recently been tested in a more stringent way by Aziz-Zadeh, Wilson, Rizzolatti, and Iacoboni (2006) in an fMRI experiment. Participants observed actions and read phrases relating to foot, hand, or mouth actions. In the analysis, the points of maximal activation during the observation of foot, hand, and mouth actions were first determined. Subsequently, the authors examined which type of sentence activated each of these points most strongly.
The results showed a strict congruence between effector-specific representations of visually presented actions and of actions described by literal phrases. Taken together, these data clearly indicate that listening to sentences describing actions, or reading them, activates cortical areas where these actions are coded. What remains unsolved, however, is the contribution of these activations to the understanding of verbal material. We do not yet know whether these activations are indispensable for sentence comprehension and, if they are not, what they add to it.
SUMMARY

Following the description of mirror neurons in monkey area F5, neuroscientists, and even more so cognitive psychologists, sometimes express surprise that mirror neurons could be
involved in so many diverse cognitive functions. In fact, there is no reason for such surprise. As shown in this chapter, mirror neurons of area F5 are only an example of a general neurophysiological mechanism that codifies sensory and motor information in a common format. The mirror mechanism is present not only in the premotor cortex, where it was originally discovered, but also in other brain centers and areas. The functional role of the mirror mechanism depends, first of all, on the anatomy of the neural circuit where the mirror neurons are located. Mirror neurons of the lateral brain convexity transform the sensory representation of motor acts into a motor representation of the same acts. Their function is to give an immediate understanding of the observed motor behavior. Mirror neurons located in the insula and rostral cingulate transform an observed emotional expression or situation into a viscero-motor activity analogous to that present when an individual actually experiences that emotion. They give the observer a direct feeling of what the others feel. Mirror neurons are obviously not floating in isolation. They are connected with other neurons located in the same area and with neurons of other areas. The first type of connection is at the basis of the activation of the whole neuron chain underlying a motor action following the observation of its first motor act. Thanks to these connections and the mirror mechanism, the observer is able to infer the outcome of an action at its outset. This mechanism appears to be fundamental for understanding the intentions of others. The connections with other areas play a fundamental role in imitation learning. The observed action is decomposed, by the mirror mechanism, into its elementary motor acts and kept in memory. This memory is then used to repeat the observed action.
The prefrontal lobe appears to be the structure involved in controlling and orchestrating the activity of the mirror mechanism in this function. Finally, with the evolution of language, the mirror mechanism acquired a new role, that of translating the speech sound into the motor pattern responsible for the emission of the same speech sound. It has been suggested that mirror neuron activation may also be involved in understanding the meaning of verbal material. This fascinating hypothesis, however, lacks fully convincing evidence and requires more investigation.
Armstrong, A. C., Stokoe, W. C., & Wilcox, S. E. (1995). Gesture and the nature of language. Cambridge: Cambridge University Press. Augustine, J. R. (1996). Circuitry and functional aspects of the insular lobe in primates including humans. Brain Research: Brain Research Reviews, 22, 229–244. Aziz-Zadeh, L., Wilson, S. M., Rizzolatti, G., & Iacoboni, M. (2006). Congruent embodied representations for visually presented actions and linguistic phrases describing actions. Current Biology, 16, 1818–1823. Borra, E., Belmalih, A., Calzavara, R., Gerbella, M., Murata, A., Rozzi, S., et al. (2008). Cortical connections of the macaque anterior intraparaietal (AIP) area. Cerebral Cortex, 18, 1094–1111. Buccino, G., Baumgaertner, A., Colle, L., Buechel, C., Rizzolatti, G., & Binkofski, F. (2007). The neural basis for understanding non-intended actions. Neuroimage, 2, 119–127. Buccino, G., Binkofski, F., Fink, G. R., Fadiga, L., Fogassi, L., Gallese, V., et al. (2001). Action observation activates premotor and parietal areas in a somatotopic manner: An fMRI study. European Journal of Neuroscience, 13, 400–404. Buccino, G., Lui, F., Canessa, N., Patteri, I., Lagravinese, G., Benuzzi, F., et al. (2004). Neural circuits involved in the recognition of actions performed by non-conspecifics: An fMRI study. Journal of Cognitive Neuroscience, 16, 114–126. Buccino, G., Vogt, S., Ritzl, A., Fink, G., Zilles, K., Freund, H., et al. (2004). Neural circuits underlying imitation learning of hand actions: An event-related fmri study. Neuron, 42, 323–334. Byrne, R. W. (2002). Seeing actions as hierarchically organized structures: Great ape manual skills. In A. N. Meltzoff & W Prinz (Eds.), The imitative mind: Development, evolution and brain bases (pp. 122–140). Cambridge, England: Cambridge University Press. Calder, A. J., Keane, J., Facundo Manes, Nagui Antoun, & Young, A. W. (2000). Impaired recognition and experience of disgust following brain injury. 
Nature Neuroscience, 3, 1077–10 Calvo-Merino, B., Glaser, D. E., Grèzes, J., Passingham, R. E., & Haggard, P. (2005). Action observation and acquired motor skills: An fMRI study with expert dancers. Cerebral Cortex, 15, 1243–1249. Calvo-Merino, B., Grèzes, J., Glaser, D., Passingham, R., & Haggard, P. (2006). Seeing or doing? Influence of visual and motor familiarity in action observation. Current Biology, 16, 1905–1910. Carr, L., Iacoboni, M., Dubeau, M.-C., Mazziotta, J. C., & Lenzi, G. L. (2003). Neural mechanisms of empathy in humans: A relay from neural systems for imitation to limbic areas.Proceedings of the National Academy of Sciences, USA,100, 5497–5502. Cheney, D. L., & Seyfarth, R. M. (1990). How monkeys see the world: Inside The mind of another species. Chicago: University of Chicago Press. Condillac, E. B. (1947). Oevres philosophiqus. In G. Le Roy (Ed.), (Vol. 1). Paris: Presses Universitaires de France. Corballis, M. C. (2002). From hand to mouth: The origins of language. Princeton: Princeton University Press. Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews: Neuroscience, 3, 201–215. Coull, J. T. (2004). fMRI studies of temporal attention: Allocating attention within, or towards, time. Cognitive Brain Research, 21, 216–226.
REFERENCES
Cross, E. S., de Hamilton, A. F., & Grafton, S. T. (2006). Building a motor simulation de novo: Observation of dance by dancers. Neuroimage, 31, 1257–1267.
Adolphs, R., Tranel, D., & Damasio, A. R. (2003). Dissociable neural systems for recognizing emotions. Brain and Cognition, 52, 61–69.
Damasio, A. R. (2003). Looking for Spinoza: Joy, Sorrow and Feeling Brain. Harcourt, New York.
Arbib, M. A. (2005). From monkey-like action recognition to human language: An evolutionary framework for neurolinguistics. Behavioral and Brain Sciences, 28, 105–124.
Di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., & Rizzolatti, G. (1992). Understanding motor events: A neurophysiological study. Experimental Brain Research, 91, 176–180.
c16.indd 354
8/18/09 5:10:18 PM
References 355 Fadiga, L., Craighero, L., Buccino, G., & Rizzolatti, G. (2002). Speech listening specifically modulates the excitability of tongue muscles: A TMS study. European Journal of Neuroscience, 15, 399–402. Fadiga, L., Craighero, L., Fabbri-Destro, M., Finos, L., CotillonWilliams, N., Smith, A. T., et al. (2006). Language in shadow: Social. Neuroscience, 1, 77–89. Fadiga, L., Fogassi, L., Pavesi, G., & Rizzolatti, G. (1995). Motor facilitation during action observation: A magnetic stimulation study. Journal of Neurophysiology, 73, 2608–2611.
Hyvarinen, J. (1982). Posterior parietal lobe of the primate brain. Physiological Reviews, 62, 1060–1129. Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., & Rizzolatti, G. (2005). Grasping the intentions of others with one’s own mirror neuron system. PLoS Biology, 3, e79. Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999, December 24). Cortical mechanisms of human imitation. Science, 286, 2526–2528.
Ferrari, P. F., Gallese, V., Rizzolatti, G., & Fogassi, L. (2003). Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. European Journal of Neuroscience, 17, 1703–1714.
James, W. (1890). Principles of psychology. London: Macmillian.
Ferrari, P. F., Gregoriou, G., Rozzi, S., Pagliara, S., Rizzolatti, G., & Fogassi, L. (2003). Functional organization of the inferior parietal lobule of the macaque monkey [Program No. 919.7, 2003 Neuroscience Meeting Planner. Washington, DC: Society for Neuroscience 2003. Online Abstract viewer/itinerary planner]. Washington, DC: Society for Neuroscience.
Johnson-Frey, S. H. (2004). The neural bases of complex tool use in humans. Trends in Cognitive Sciences, 8, 71–78.
Ferrari, P. F., Rozzi, S., & Fogassi, L. (2005). Mirror neurons responding to observation of action made with tools in monkey ventral premotor cortex. Journal of Cognitive Neuroscience, 17, 212–226.
Kohler, E., Keysers, C., Umiltà, M. A., Fogassi, L., Gallese, V., & Rizzolatti, G. (2002, August 2). hearing sounds, understanding actions: Action representation in mirror neurons. Science, 297, 846–848.
Fogassi, L., Ferrari, P. F., Gesierich, B., Rozzi, S., Chersi, F., & Rizzolatti, G. (2005). Parietal lobe: From action organization to intention understanding. Science, 29, 662–667.
Koski, L., Wohlschlager, A., Bekkering, H., Woods, R. P., & Dubeau, M. C. (2002). Modulation of motor and premotor activity during imitation of target-directed actions. Cerebral Cortex, 12, 847–855.
Frith, C. D., & Frith, U. (2007). Social cognition in humans. Current Biology, 17, 724–732.
Krolak-Salmon, P., Henaff, M. A., Isnard, J., Tallon-Baudry, C., Guenot, M., Vighetto, A., et al. (2003). An attention modulated response to disgust in human ventral anterior insula. Annals of Neurology, 53, 446–453.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119, 593–609. Gallese, V., Keysers, C., & Rizzolatti, G. (2004). A unifying view of the basis of social cognition. Trends in Cognitive Sciences, 8, 396–403. Gangitano, M., Mottaghy, F. M., & Pascual-Leone, A. (2001). Phase specific modulation of cortical motor output during movement observation. NeuroReport, 12, 1489–1492. Gazzola, V., Rizzolatti, G., Wicker, B., & Keysers, C. (2007). The anthropomorphic brain: The mirror neuron system responds to human and robotic actions. NeuroImage, 35, 1674–1684. Gazzola, V., van der Worp, H., Mulder, T., Wicker, B., Rizzolatti, G., & Keysers, C. (2007). Aplasics born without hands mirror the goal of hand actions with their feet. Current Biology, 17, 1235–1240. Gentilucci, M. (2003). Grasp observation influences speech production. European Journal of Medicine, 17, 179–184. Gentilucci, M., Benuzzi, F., Gangitano, M., & Grimaldi, S. (2001). Grasp with hand and mouth: A kinematic study on healthy subjects. Journal of Neurophysiology, 86, 1685–1699.
Jellema, T., Baker, C. I., Wicker, B., & Perrett, D. I. (2000). Neural representation for the perception of the intentionality of actions. Brain and Cognition, 442, 280–302.
Jürgens, U. (2002). Neural pathways underlying vocal control. Neuroscience and Biobehavioral Reviews, 26, 235–258.
Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74, 431–461. Liberman, A. M., & Mattingly, I. G. (1985). The motor theory of speech perception revised. Cognition, 21, 1–36. Liberman, A. M., & Whalen, D. H. (2000). On the relation of speech to language. Trends in Cognitive Sciences, 4, 187–196. Lui, F., Buccino, G., Duzzi, D., Benuzzi, F., Crisi, G., Baraldi, P., et al. (2008). Neural substrates for observing and imagining non objectdirected actions. Social Neuroscience, 3, 261–275. Luppino, G., Belmalih, A., Borra, E., Gerbella, M., & Rozzi, S. (2005). Architectonics and cortical connections of the ventral premotor areA F5 of the Macaque [Program No. 194.1, 2005 Neuroscience Meeting Planner]. Washington, DC: Society for Neuroscience 2005. Online. McNeliage, P. F. (1998). The frame/content theory of evolution of speech production. Behavioral and Brain Science, 21, 499–511.
Gentilucci, M., & Corballis, M. C. (2006). From manual gesture to speech: A gradual transition. Neuroscience and Biobehavioral Reviews, 30, 949–960.
Maeda, F., Kleiner-Fisman, G., & Pascual-Leone, A. (2002). Motor facilitation while observing hand actions: Specificity of the effect and role of observer ’s orientation. Journal of Neurophysiology, 87, 1329–1335.
Goldman, A. I., & Sripada, C. S. (2005). Simulationist models of facebased emotion recognition. Cognition, 94, 193–213.
Meister, I. G., Boroojerdi, B., Foltys, H., Sparing, R., Huber, W., & Topper, R. (2003). Motor cortex hand area and speech: Implications for the development of language. Neuropsychologia, 41, 401–406.
Gregoriou, G. G., Borra, E., Matelli, M., & Luppino, G. (2006). Architectonic organization of the inferior parietal convexity of the macaque monkey. Journal of Comparative Neurology, 496, 422–451.
c16.indd 355
Hurley, S., & Chater, N. (2005). Pespective on imitation: From neuroscience to social science (Vol. 1). Cambridge, MA: MIT Press.
Merleau-Ponty, M. (1962). Phenomenology of perception. (Translated from the French by C. Smith). London: Routledge.
Grèzes, J., Armony, J. L., Rowe, J., & Passingham, R. E. (2003). Activations related to “mirror” and “canonical” neurones in the human brain: An fMRI study. NeuroImage, 18, 928–937.
Mesulam, M. M., & Mufson, E. J. (1982). Insula of the old world monkey: Pt. III. Efferent cortical output and comments on function. Journal of Comparative Neurology, 212, 38–52.
Gross, C. G. (1992). Representation of visual stimuli in inferior temporal cortex of the monkey. Philosophical Transactions, Royal Society, London, B, 335, 3–10.
Murata, A., Fadiga, L., Fogassi, L., Gallese, V., Raos, V., & Rizzolatti, G. (1997). Object representation in the ventral premotor cortex (area F5) of the monkey. Journal of Neurophysiology, 78, 2226–2230.
Heiser, M., Iacoboni, M., Maeda, F., Marcus, J., & Mazziotta, J. C. (2003). The essential role of Broca’s area in imitation. European Journal of Neuroscience, 17, 1123–1128.
Nelissen, K., Luppino, G., Vanduffel, W., Rizzolatti, G., & Orban, G. A. (2005, October 14). Observing others: Multiple action representation in the frontal lobe. Science, 310, 332–336.
8/18/09 5:10:18 PM
356 The Mirror Neuron System Nishitani, N., & Hari, R. (2000). Temporal dynamics of cortical representation for action. Proceedings of the National Academy of Sciences, USA, 97, 913–918.
Rozin, R., Haidt, J., & McCauley, C. R. (2000). Disgust. In M. Lewis & J. M. Haviland-Jones (Eds.), Handbook of emotions (2nd ed., pp. 637– 653). New York: Guilford Press.
Nishitani, N., & Hari, R. (2002). Viewing lip forms: Cortical dynamics. Neuron 36, 1211–1220.
Rozzi, S., Calzavara, R., Belmalih, A., Borra, E., Gregoriou, G. G., Matelli, M., et al. (2006). Cortical connections of the inferior parietal cortical convexity of the macaque monkey. Cerebral Cortex, 16, 1389–1417.
Obayashi, S., Suhara, T., Nagai, Y., Okauchi, T., Maeda, J., & Iriki, A. (2004). Monkey brain areas underlying remote-controlled operation. European Journal of Medicine, 19, 1397–1407. Orban G..A, Peeters R, Nellisen K., Buccino, G., Vanduffel, W., Rizzolatti, G. (2006) The use of tools, a unique human feature represented in the left parietal cortex. Program number 114.2. 1006 Neuroscience Meeting Planner. Atlanta, GA: Society for Neuroscience, 2006. Online Pandya, D. N., & Seltzer, B. (1982). Intrinsic connections and architectonics of posterior parietal cortex in the rhesus monkey. Journal of Comparative Neurology, 204, 196–210. Penfield, W., & Faulk, M. E. (1955). The insula: Further observations on its function. Brain, 78, 445–470. Perrett, D. I., Harries, M. H., Bevan, R., Thomas, S., Benson, P. J., Mistlin, A. J., et al. (1989). Frameworks of analysis for the neural representation of animate objects and actions. Journal of Experimental Biology, 146, 87–113. Perrett, D. I., Mistlin, A. J., Harries, M. H., & Chitty, A. J. (1990). Understanding the visual appearance and consequence of hand actions. In M. A. Goodale (Ed.), Vision and action: The control of grasping (pp. 163–342). Norwood, NJ: Ablex. Phillips, M. L., Young, A. W., Scott, S. K., Calder, A. J., Andrew, C., Giampietro, V., et al. (1998). Neural responses to facial and vocal expressions of fear and disgust. Proceedings of the Royal Society of London, B, 265, 1809–1817. Phillips, M. L., Young, A. W., Senior, C., Brammer, M., Andrew, C., Calder, A. J., et al. (1997, October 2). A specific neural substrate for perceiving facial expressions of disgust. Nature, 389, 495–498. Povinelli, D. J. (2000). Folk physics for apes: The chimpanzee’s theory of how the world works. Oxford: Oxford University Press. Prinz, W. (1987). Ideomotor action. In H. Heuer & A. Sanders. (Eds.), Perspective on perception and action (pp. 47–76). Hillsdale, NJ: Erlbaumpp. Prinz, W. (2002). Experimental approaches to imitation. In A. 
Meltzoff & W. Prinz (Eds.), The imitative mind: Development, evolution, and brain bases (pp. 143–162). Cambridge: Cambridge University Press. Pulvermuller, F. (2001). Brain reflections of words and their meaning. Trends in Cognitive Sciences, 5, 517–524. Rizzolatti, G. (2005). The mirror neuron system and imitation. In S Hurley & Chater N (eds) In Perspective on Imitation. From Neuroscience to Social Science, MIT Press, Cambrige (MA) vol. 1, 55–76. Rizzolatti, G., & Arbib, M. A. (1998). Language within our grasp. Trends in Neurosciences, 21, 188–194. Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169–192. Rizzolatti, G., Fadiga, L., Fogassi, L., & Gallese, V. (1996). Premotor cortex and the recognition of motor actions. Cognitive Brain Research, 3, 131–141. Rizzolatti, G., Fogassi, L., & Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Reviews: Neuroscience, 2, 661–670. Rizzolatti, G., & Luppino, G. (2001). The cortical motor system. Neuron, 31, 889–901. Royet, J. P., Plailly, J., Delon-Martin, C., Kareken, D. A., & Segebarth, C. (2003). fMRI of emotional responses to odors: Influence of hedonic valence and judgment, handedness, and gender. Neuroimage, 20, 713–728.
c16.indd 356
Rozzi, S., Ferrari, P.F., Bonini, L., Rizzolatti, G., Fogassi, L. (2008). Functional organization of inferior parietal lobule convexity in the macaque monkey: electrophysiological characterization of motor, sensory and mirror responses and their correlation with cytoarchitectonic areas. Eur J Neurosci, 18, 1569–1588 Sakata, H., Taira, M., Murata, A., & Mine, S. (1995). Neural mechanisms of visual guidance of hand action in the parietal cortex of the monkey. Cerebral Cortex, 5, 429–438. Sakreida, K., Schubotz, R. I., Wolfensteller, U., & von Cramon, D. Y. (2005). Motion class dependency in observers’ motor areas revealed by functional magnetic resonance imaging. Journal of Neuroscience, 25, 1335–1342. Schienle, A., Stark, R., Walter, B., Blecker, C., Ott, U., Kirsch, P., et al. (2002). The insula is not specifically involved in disgust processing: An fMRI study. NeuroReport, 13, 2023–2026. Seyal, M., Mull, B., Bhullar, N., Ahmad, T., & Gage, B. (1999). Anticipation and execution of a simple reading task enhance corticospinal excitability. Clinical Neurophysiology, 110, 424–429. Seyfarth, R. M., & Cheney, D. L. (2003). Signalers and receivers in animal communication. Annual Review of Psychology, 54, 145–173. Showers, M. J. C., & Lauer, E. W. (1961). Somatovisceral motorpatterns in the insula. Journal of Comparative Neurology, 117, 107–115. Singer, T., Seymour, B., O’Doherty, J., Kaube, H., Dolan, R. J., & Frith, C. D. (2004, February 20). Empathy for pain involves the affective but not sensory components of pain. Science, 303, 1157–1162. Small, D. M., Gregory, M. D., Mak, Y. E., Gitelman, D., Mesulam, M. M., & Parrish, T. (2003). Dissociation of neural representation of intensity and affective valuation in human gustation. Neuron, 39, 701–711. Sprengelmeyer, R., Rausch, M., Eysel, U. T., & Przuntek, H. (1998). Neural structures associated with recognition of facial expressions of basic emotions. Proceedings of the Royal Society of London, B, 265, 1927–1931. 
Strafella, A. P., & Paus, T. (2000). Modulation of cortical excitability during action observation: A transcranial magnetic stimulation study. NeuroReport, 11, 2289–2292. Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al. (2005). Listening to action-related sentences activates fronto-parietal motor circuits. Journal of Cognitive Neuroscience, 17, 273–281. Tokimura, H., Tokimura, Y., Oliviero, A., Asakura, T., & Rothwell, J. C. (1996). Speech-induced changes in corticospinal excitability. Annals of Neurology, 40, 628–634. Tomasello, M., & Call, J. (1997). Primate cognition. Oxford: Oxford University Press. Umiltà, M. A., Kohler, E., Gallese, V., Fogassi, L., Fadiga, L., Keysers, C., et al. (2001). “I know what you are doing”: A neurophysiological study. Neuron, 32, 91–101. Van Hoof, JARAM (1967). The facial displays of the catarrhine monkeys and apes. In D. Morris (Ed.), Primate ethology (pp. 7–68). London: Weidenfield & Nicolson. Vogt, O., & Vogt, C. (1919). Allgemeinere Ergebnisse unserer Hirnforschung. Journal of Psychology and Neurology (Leipzig), 25, 279–462. Vogt, S., Buccino, G., Wohlschlager, A. M., Canessa, N., Shah, N. J., Zilles, K., et al. (2007). Prefrontal involvement in imitation learning of hand actions: Effects of practice and expertise. Neuroimage, 37, 1371–1383.
8/18/09 5:10:18 PM
References 357
c16.indd 357
Watkins, K. E., Strafella, A. P., & Paus, T. (2003). Seeing and hearing speech excites the motor system involved in speech production. Neuropsychologia, 41, 989–994.
Wicker, B., Keysers, C., Plailly, J., Royet, J.-P., Gallese, V., & Rizzolatti, G. (2003). Both of us disgusted in my insula: The common neural basis of seeing and feeling disgust. Neuron, 3, 655–664.
Wheaton, K. J., Thompson, J. C., Syngeniotis, A., Abbott, D. F., & Puce, A. (2004). Viewing the motion of human body parts activates different regions of premotor, temporal, and parietal cortex. Neuroimage, 22, 277–288.
Zald, D. H., & Pardo, J. V. (2000). Functional neuroimaging of the olfactory system in humans. International Journal of Psychophysiology, 36, 165–181.
8/18/09 5:10:19 PM
c16.indd 358
8/18/09 5:10:19 PM
Chapter 17
Varieties of Attention AMIR RAZ
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.

Attention has many faces. It is a pivotal theme in psychological science, and researchers have unraveled some of the mechanisms underlying it. Cognitive neuroscientists increasingly construe attention as disparate control networks, which correlate with discrete neural circuitry and respond to focal brain injuries, specific drugs, and mental states. It is possible to tease apart these varieties of attention and elucidate their individual development and function. On the one hand, illuminating the neural correlates of attention exemplifies the links between brain and behavior and binds psychology to the techniques of neuroscience. On the other hand, it shows how it is possible to illuminate different aspects of attention using disparate approaches. In this chapter, we discuss how investigators and magicians interpret attention as an organ system and as a vehicle to an art form, respectively. The way a researcher and a magician approach attention provides complementary perspectives on the varieties of attention and serves to elucidate the correspondence between a psychological phenomenon and its neural underpinnings.

ATTENTION AND MAGIC
Attention is one of magic’s main currencies. I have paid attention to magic since childhood. Initially gleaning my information from side panels of cereal boxes, I have gradually amassed specialty books, joined professional societies, and befriended accomplished performers. My interest and proficiency in the art of magic had many redeeming qualities: as a young man, it allowed me to mystify audiences and be the life of social affairs; as an impoverished student, it helped me supplement my income and eke out a better living; and as an eligible bachelor, it was key to many successful excursions. Above all, however, my experience with magic influenced my interest in human behavior, shaped my research program, and paved the road to my academic career as a cognitive neuroscientist with a keen interest in attention. In this chapter, I wear one of two hats intermittently: that of a researcher and that of a magician.

“Everyone knows what attention is…” wrote William James, the American father of modern psychology, in his seminal 1890 volume Principles of Psychology. He described attention as “the taking possession by the mind in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought. . . . It implies withdrawal from some things in order to deal effectively with others, and is a condition which has a real opposite in the confused, dazed, scatterbrained state.” James’ account closely ties attention to subjective experience. Moreover, James’ effort to deal with both attention to objects and attention to trains of thought is important for understanding current approaches to sensory orienting and executive control. Attention in the sense of orienting to sensory objects, however, can actually be involuntary and occur unconsciously. Furthermore, as any neophyte magician knows, paying attention is not the same as being aware.

According to another famous James (James Randi, an accomplished magician, writer-educator, and vociferous skeptic), magicians are “honest liars,” actors who use an arsenal of techniques, including attentional diversion, to accomplish their entertaining effects. Whereas magicians have been exploiting the vagaries of attention to trick their audiences for thousands of years, scientists have been studying the psychology of attention and unraveling its underlying mechanisms for little more than a century. Thus, researchers of attention may benefit from the insight and experience of magicians.

The study of attention is one of the oldest and most central issues in psychological science. Investigators have learned a great deal about what attention is, what it does, and how it works. Attention applies to both external and internal information: for example, it covers the preparedness for and selection of certain aspects of our physical environment, such as objects, or of ideas in our mind that are stored in memory. Unlike William James, however, I am less sanguine today that “Everyone knows what attention is . . .” especially as the scientific literature grows exponentially and continues to unravel the neural and psychological substrates of attention. Both before and after William James, great scientists and philosophers have grappled with the study of attention, but James was probably one of the first to broach the concept of attentional varieties, challenging a monolithic conception of attention and recognizing the existence of different shades of attention rather than one unitary form. Several researchers have followed in James’ footsteps and suggested multiple components to describe attention. In this chapter, I show how one of the most influential models in the field of attention illustrates the crosstalk between the science of attention and the art of magic. This approach has been one of my favorite ways of thinking about attention for many years and has gained considerable experimental support in recent years.
A FEW GROSS CHARACTERISTICS OF ATTENTION
Before jumping into an in-depth description of any specific model of attention, it is helpful to appreciate a few of its gross characteristics. These qualities may seem intuitive to some, but intuition can be specious, especially when dealing with both attention and magic.

Attention can apply to various areas of the visual field and can change the detail with which we look at any given area. For example, you can look at this page and pay attention to its setup as a whole, or you can zoom in on specific words and certain letters therein. If you are paying attention to single characters, you can glean a lot of information about punctuation marks and spelling errors, and even spot minute imperfections on the physical paper. But in that case, you may miss the main idea of a paragraph. We can change the location of attention as well as the size of the attention focus. These “zoom lens” or “attention spotlight” appellations may be clichéd metaphors that only sketch the crux. They are helpful, however, in that they relate to our common experience concerning the kind of attention we use (e.g., for learning versus proofreading).

Attention can be either overt or covert. Overt attention involves looking directly at the scene of interest. We usually look straight at what we wish to attend. Sometimes, however, attending to a location different from fixation is advantageous. Covert attention is the ability to select visual information at a cued location, without moving the eyes to study it directly, and to grant such information priority in processing. For example, we often engage in covert attention in social situations when we want to examine a person without being conspicuous. Note, however, that covert attention is not the same as daydreaming: looking at something is not the same as paying attention to it.

Attention often involves selection. For example, think of when multiple people are talking simultaneously and you try to home in on one of these streams of conversation in order to follow it in detail (e.g., at a party). You can do it based on the location of the person by visually orienting toward her or by locking on her voice frequency (it is easier to separate a female voice from a male voice than it is to separate two male voices), or you may follow by content. When we attend to one input stream, the unattended information goes into the background; although present, it rarely receives the same focal analysis. The brain processes unattended information in subtle and complicated ways. Unattended information can suddenly get interesting because your name or another particularly charged word is present, or because something happens that is related to the conversation you are following, and you find yourself orienting to the new information. Psychologists have studied these phenomena experimentally in great detail and have elucidated some of the mechanisms subserving them as well as the computations that such data receive.

Attention can also influence perception and mental processes. For example, an individual reading an engrossing book may fail to identify certain environmental cues. Similarly, attention can aid perception. Improvement in perception, however, is not synonymous with altered thresholds for detection, better performance, or faster reaction times. Cognitive scientists draw a distinction between how attention may be useful for simple detection of events versus how performance can improve at those events.

Attention is not a panacea to perception because there is much that attention cannot do. For example, attention can give priority to stimuli appearing at a specific physical location, but it cannot substitute for the acuity provided by the fovea. While the fovea is critical for visual acuity, the costs in latency for an unexpected foveal stimulus are just as great as for an unexpected peripheral event.
In other words, visual attention does not compensate for visual acuity. Although performance may improve with increased attentional investment, controversy persists over what orienting attention to a sensory stimulus actually does. General agreement posits that attention provides priority, so that reaction time to the attended stimulus is usually faster. Thus, visual attention influences priority or processing preference. This characteristic of attention applies to other attentional modalities such as hearing.
THREE-NETWORK MODEL OF ATTENTION
In line with William James’ early notion of distinct attentional varieties, Michael I. Posner proposed a modern model of attention wherein at least three functionally and anatomically distinct supramodal attentional networks cooperate and work closely together.

Neuromodulators of Attention
Pharmacological findings link each of the three attentional networks to specific chemical neuromodulators: the norepinephrine system, which arises in the locus coeruleus of the brainstem, functions in alerting; the cholinergic system, which arises in the basal forebrain, plays an important role in orienting through its effects in the parietal cortex, where it seems to reduce neural activity and reaction time cost associated with cueing to an invalid target; and the anterior cingulate cortex and lateral prefrontal cortex, involved in executive attention, are target areas of the mesocortical dopamine system (Posner & Rothbart, 2007).
METHODS OF INVESTIGATING ATTENTION
Although attention had already been studied from a neurophysiological viewpoint in the 1890s, mental chronometry together with application of the subtraction method provided rich information on psychological processes. In the subtraction method, investigators compared reaction times in two experimental conditions, which allegedly differed only in that one was hypothesized to require an additional cognitive process. Differences in reaction time were then taken to support and index the putative additional process. By systematically varying cognitive processing, researchers developed intricate models of brain function, many of which were subsequently supported by neuroimaging studies. Reaction time assays were later combined with such mathematical formulations as formal information theory. However, because these methods were largely divorced from anatomical and neurobiological data, these approaches were deemed inadequate to elucidate the mechanisms whereby the human brain pays attention.

In the 1950s, the advent of microelectrode recordings of single neurons from laboratory animals, at first anesthetized but later awake, afforded examination of neurophysiological processes and supported the notion that the brain processes information in serial stages. Studies using awake monkeys revealed control systems (the terminological precursor to attention networks) where higher brain areas feed back their influence onto earlier processing stages. This top-down effect challenged the then-common view of a completely serial approach to information processing and provided evidence for focal brain areas within the monkey parietal lobe that could be systematically related to processing operations involved in attention (see Chapter 16 for details). These ideas were extended to humans and tested using reaction-time paradigms in neuropsychological patients.
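The arithmetic behind the subtraction method described above can be made concrete with a minimal sketch. It assumes, as the method itself does, that the two conditions differ only by one inserted processing stage; the function name and the reaction times are hypothetical and serve purely as illustration, not as data from any study.

```python
from statistics import mean


def subtraction_estimate(baseline_rts_ms, extended_rts_ms):
    """Estimate the duration of a hypothesized extra processing stage.

    Assumes the two conditions differ only in that the extended
    condition requires one additional cognitive process.
    """
    return mean(extended_rts_ms) - mean(baseline_rts_ms)


# Hypothetical reaction times in milliseconds (illustrative only).
simple_detection = [210, 225, 198, 240, 215]
choice_discrimination = [320, 301, 335, 310, 328]

extra_stage_ms = subtraction_estimate(simple_detection, choice_discrimination)
print(f"Estimated duration of the additional process: {extra_stage_ms:.1f} ms")
```

The estimate is only as good as the assumption that adding a process leaves all other stages unchanged, which is one reason chronometric results were later cross-checked against anatomical and neurobiological data.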
The arrival of analog and then digital computers in the 1960s initiated the field of neuroimaging by recording the average electrical event-related potentials (ERPs) from scalp electrodes. Electrophysiology, with its millisecond resolution, became an ideal tool to explore the notion of “attention for action.” ERP components were systematically related to sensory and motor stages of information processing.

In the late 1980s, neuroimaging experiments made possible the examination of activity in localized brain areas, first through the use of injected radionuclides detected by positron emission tomography (PET) and later through the use of an externally imposed magnetic field in functional magnetic resonance imaging (fMRI). Over the past decade, fMRI has improved in spatial and temporal resolution and can now provide accurate spatial information of focal brain areas that are involved in cognitive tasks such as attention. The inferences obtained from both ERPs and magnetoencephalography (MEG), which probe perceptual processing with fine temporal detail, have been important complements to the millimetric spatial resolution of fMRI.

More recently, neuroimaging technology has been joined by genomics. The Human Genome Project has made great progress in identifying the 30,000 protean genes in the human genome as well as the approximately 1.7 million polymorphic sites scattered across the 6 billion base-pair length of the human genome. These findings hold promising prospects for illuminating how genes can influence disease development and may aid in the association of genes with particular psychopathology. In addition, genomics has the potential to promote the discovery of new treatments and to afford new insights into behavioral genetics, such as the relationship between certain genetic configurations and manifest behavior. Combining neuroimaging with genetics, recent exploratory assays endeavored to noninvasively probe genes that have been shown to result in variation in protein levels or biochemical activity in the context of both typical and atypical attention. Such pooled research efforts promise to elucidate both the neural and genetic correlates of attention.
Findings from genetic and neuroimaging studies of attention have provided some converging results. While most neuroimaging studies yield a small number of widely distributed brain areas that must be orchestrated to carry out a cognitive task, it is often unclear what the unique contribution of each area might be. However, in the case of attention, as in the case for language, these mechanisms have been sufficiently elucidated by a careful teasing apart based on chronometry, neuroimaging, and genetics. Attention, therefore, is a primary research domain, which exemplifies the links between brain and behavior and binds psychology to the techniques of neuroscience.
ATYPICAL ATTENTION
A number of human practices, such as drug ingestion, meditation, and hypnosis, can dramatically influence attention.
Cognitive neuroscientists are beginning to unlock the ways these routines influence the human brain and how such effects alter common information processing. It is possible to test the limits of attentional functions by examining healthy individuals under atypical conditions. That more notice should be given to the investigation of healthy individuals driven toward the neuropsychological domain is evident in light of the contributions of social psychology to cognitive science, exploratory assays of evanescent attention deficits, and the impact of reversible lesion research using transcranial magnetic stimulation (TMS).
Cognitive neuroscientists generally agree that mental processes come in two varieties: controlled and automatic. Some processes are thought to be innately automatic; others become automatic through practice. General accounts posit that once automatized, these processes are initiated unintentionally, effortlessly, even ballistically, and cannot be easily interrupted or prevented. For example, the Stroop effect suggests that reading words is an automatic process for a proficient reader. The standard account posits that words are processed automatically to the semantic level and that the Stroop effect is the “gold standard” of automated performance. Although cognitive scientists have focused on the processes that lead to automatization, with over 4,000 citations to Stroop’s original work alone, the question of whether it is possible to regain control over an automatic process is unanswered, and mostly unasked. However, mounting evidence from assays of atypical attention shows that deautomatization is possible. A few meditative practices claim to achieve deautomatization, with some sparse evidence of reduced Stroop interference. The most compelling findings addressing this issue
showed that a specific posthypnotic suggestion reduced and even removed Stroop interference in highly hypnotizable individuals. Reduction of the Stroop effect occurred following reduction in anterior cingulate cortex activation and altered processing in an occipito-parietal location that might be related to the chunking of visual letters into words. Independent accounts under typical conditions also challenge the robustness of the Stroop effect. Although critiqued, interpretation of these and other results supports the idea that attention may be employed to derail automatic processes. Clinicians are often interested in deautomatization as a way to unlearn or free one from undesired habits. Such derailment of automaticity may also occur spontaneously in extreme situations (e.g., in combat, individuals might not realize that they have been hurt until much later). Other demonstrations of top-down modulation and deautomatization showed that following hypnotic instruction to view a colored picture as grayscale, highly hypnotizable individuals demonstrated reduced activity in color areas of the prestriate cortex. These studies show that atypical attention can influence at least executive attention and possibly some of the other attentional networks. Exploratory assays using other forms of atypical attention may further elucidate the malleability of attentional networks. For example, meditation training may be a way to induce a long-term baseline change in attentional function, permitting individuals to achieve better, more effective self-regulation.
Alerting refers to the increase and maintenance of response readiness in preparation for an imminent stimulus (Figure 17.1). Roughly equivalent to sustained attention and vigilance, alerting is probably a more foundational
[Figure 17.1 The alert attention network. Cartoon text: Stay alert! In precisely 250 milliseconds, one of these bombs will go off! “ALERT?” But, I don’t know where, just when!]
form of attention on which other attentional functions rest. Without putting too fine a point on it, however, alerting is typically task-specific rather than a general cognitive control of arousal. Modern experimentalists have replaced the “older” vigilance tasks by “newer” alerting tasks, although some researchers argue that these two task-types tap different mechanisms. The relationship between alerting and arousal is complex and psychological variables such as stress can further influence alerting, increasing or decreasing it as a function of specific context and task. In contrast to the in-depth studies of the other attention systems (i.e., orienting and executive control), alerting has been relatively understudied and attention research has yet to satisfactorily elucidate its neural substrates. Orienting informs us where an important event is likely to occur (Figure 17.2). It is also the ability to select specific information among multiple sensory stimuli. Sometimes known as scanning or selection, it is the most studied attentional network. Whether overt or covert, orienting has traditionally been measured by reductions in reaction time to a target following a cue that gives information on the location, but not the timing, of an event. Scientists distinguish between exogenous orienting—when the flash of a cue automatically captures attention to a specific location— and endogenous orienting—when a central arrow points to one of two lateralized target presentation locations. Some researchers argue that at least part of the capacity subsumed by alerting is conceptualized as orienting in the temporal domain. The bulk of the evidence, however, supports the notion that orienting and alerting are largely controlled by different brain systems. Although most research in orienting has been conducted in the visual domain, neural activity increases in response to an orienting cue and concomitant performance enhancement have been demonstrated in most sensory systems. 
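The reaction-time measure of orienting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the trial layout (a flag for whether the cue indicated the target location, plus a reaction time in milliseconds), the function name, and the numbers are all hypothetical and are not taken from any study discussed in this chapter.

```python
from statistics import mean

def cueing_effect(trials):
    """Return mean RT without a location cue minus mean RT with one (ms).

    `trials` is a list of (location_cued, reaction_time_ms) pairs.
    This layout is a hypothetical choice made for the illustration.
    """
    cued = [rt for location_cued, rt in trials if location_cued]
    uncued = [rt for location_cued, rt in trials if not location_cued]
    if not cued or not uncued:
        raise ValueError("need at least one cued and one uncued trial")
    return mean(uncued) - mean(cued)

# Made-up reaction times for illustration only.
example_trials = [(True, 310), (True, 330), (False, 370), (False, 390)]
effect = cueing_effect(example_trials)
print(effect)  # a positive value means faster responses after a location cue
```

A positive difference corresponds to the reduction in reaction time that the text treats as the signature of orienting; real experiments would also need to handle errors, outliers, and counterbalancing, which this sketch leaves out.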
Some researchers suggest
that orienting may encompass not only sensory, but also purely mental events, including memory. Recent work has shown an orienting effect for a variety of internal representations, including items stored in working memory and long-term memory. It is possible to increase the efficiency of a specific attention network by focal training. For example, several rehabilitation programs for patients with specific impairments of the orienting system involve expressly tailored attention exercises. Attention training has also been used in early child education to improve self-regulation. This form of attention operates in close coordination with working memory in many cognitive tasks. (A detailed description of the attention training procedure is available on the web at www.teach-the-brain.org/learn/attention.) Many studies have shown that children between the ages of about 3 and 7 develop a brain network that allows them to regulate their thoughts and emotions. Executive attention typically relates to conflict of the kind you encounter when trying not to scratch a particularly itchy mosquito bite or when confronted with two police officers who demand that you comply with conflicting orders (Figure 17.3). In general, executive functions pertain to planning or decision making, error detection, novel or not well-learned responses, conditions judged to be difficult or dangerous, regulation of thoughts and feelings, and overcoming habitual actions. While some may consider any instance of top-down control as executive attention, others construe it as the monitoring and resolution of conflict between computations in different neural areas. Executive attention is typically measured using experimental tasks where one is faced with an incompatibility between dimensions of the stimulus or response. Whether and to what extent executive attention governs the other attentional networks remains unclear. A more
[Figure 17.2 The orient attention network. Cartoon text: At some point, the rabbit will come out of the center hat. Hmm. I know where to ORIENT, but when will things happen?]
[Figure 17.3 The conflict attention network. Cartoon text: Yikes! How do I resolve this CONFLICT? GO]
successful effort has related concepts such as emotion regulation, self-regulation, effortful control, and inhibitory control to executive attention. These findings collectively reveal that attention is a strong modulator of emotion, cognition, thought, and action. For example, findings elucidate several aspects of the influence of attention training on executive attention in young children, drawing on measures of brain activity, cognition, and behavior in children as early as 4 years of age. These measures include behavioral assessments of executive attention and intelligence, genotyping of dopamine-related genes, recording electrical activity at the scalp generated by neuronal function, and parental questionnaires relating to the child’s temperament. This training program—adapted to be child-friendly from a method originally used to prepare macaque monkeys for space travel—was given for 5 days over a 2- to 3-week period and resulted in great attention improvements, including an increase in IQ and better self-regulation of affect and cognition. This approach potentially opens a new vista for experiments in developmental cognitive neuroscience in which genetics, brain function, and behavior can be related through the study of individual differences, and it demonstrates that executive attention skills can be trained. In addition, these findings could potentially lead to better intervention strategies for children with attentional and other behavioral problems such as Attention-Deficit Hyperactivity Disorder.
MAGIC, PSYCHOLOGY, AND ATTENTION
In the time-honored tradition of a complex, secret, and skillful art, magicians are typically reluctant to share their
methods with outsiders. The art of deception, however, goes beyond the mechanics of a specific trick and is deeply entrenched in psychology with a special emphasis on cognitive processes and perception. For many years, magicians have successfully used the psychology of deception, including self-deception; psychological science is just beginning to unravel data showing how suggestion and expectation, for example, influence human behavior. Using attention as a vehicle, current research paves the road to realizing this new direction. Magicians and researchers approach attention very differently. To appreciate this difference, consider a person who tries to understand the meaning of time. Consulting with a watchmaker is probably not the best way to go. Speaking to a physicist or a philosopher of science would likely be a better choice. While watchmakers fix timepieces, they are not necessarily experts on time. Similarly, magicians are the watchmakers of attention—they are experts at hoodwinking their audience’s perceptual or cognitive systems, without necessarily having keen insights into the underlying mechanisms. By contrast, investigators typically try to understand the mechanisms and identify the processes that subserve attention. Yet most would not make great magicians. These different approaches to attention may appear disjoint but can actually be complementary, in the same way that social psychology and cognitive neuroscience, two largely separate disciplines, have been increasingly overlapping. Social psychologists have traditionally tried to “push” healthy individuals closer to the pathological spectrum by incorporating into their research arsenal techniques such as suggestion and deception. Cognitive neuroscientists, on the other hand, have shied away from this approach and attempted, instead, to understand brain function by studying patients with specific deficits and focal brain lesions, as well as healthy individuals. 
The marriage of the methods of social psychology with cognitive neuroscience created an opportunity to test the limits of attentional functions by examining healthy individuals under atypical conditions including hypnosis, meditation, and sleep deprivation (see earlier section on Atypical Attention). Similar to some social psychologists, magicians capitalize on exploring the limits of human processing and triumph in commanding ways to tap these pliable behavioral perimeters. In this way, the cognitive neuroscience of attention can benefit from the contributions of both social psychologists and magicians. At least to some degree, most magic tricks rely on misdirection—or rather on direction—appellations that broadly designate attentional effects. Without knowing much about the science of attention, magicians have devised a vast array of practical and often cunning ways to
direct one’s attention. One way is to direct where or when a spectator is looking, granting the performer sufficient, albeit brief, opportunity to accomplish a trick. Attentional misdirection can also create spurious expectations, thereby reducing or diverting the spectator’s suspicion from the modus operandi. In addition, misdirection can influence later recall. Misremembering the exact details often leads to the subsequent reconstruction of a past event, frequently through the spectator’s own post hoc re-enactments and retrospective ascriptions. This misinformation effect has been thoroughly studied by psychologists. Seasoned magicians, however, especially those who practice mentalism, have long appreciated that while a magical effect can mesmerize the audience in attendance, a true miracle is an event described by those to whom it was told by others who did not even see it. Thus, magicians have known that attention can influence spatial and temporal information as well as meta-cognition and memory.
Recent research on visual attention and visual memory confirms what magicians have both known and practiced for a long time. For example, people are surprisingly poor at noticing even large changes to objects, images, and motion pictures from one instant to the next—a phenomenon called change blindness. A related phenomenon is inattentional blindness—a form of sighted blindness in which observers fail to perceive a stimulus because they are attending to something else. People usually experience inattentional blindness when they don’t know what they should attend to—exactly what a magician exploits when standing in front of an unsuspecting audience.
Simple experiments show that effects such as change blindness and inattentional blindness are no longer esoteric exemplars confined to the psychology research lab but rather compelling demonstrations of plausible perceptual experience (see the videos at http://viscog.beckman.uiuc.edu/djs_lab/demos.html or www.quirkology.com/United States/index.shtml for online demonstrations). Attentional phenomena such as change blindness and inattentional blindness raise critical questions about the relationship between attention and perception. For example, how much of our visual world do we perceive when we are not paying attention? Thus, attention or lack thereof—directing attention away from a target object—plays a key role in perception. While magicians can provide many practical demonstrations of these traits, researchers are beginning to unravel the neurocognitive substrates that subserve them. Although practitioners of magic were on the scene way before scientists started to study the limitations of the human ability to deal with multiple concurrent signals in a variety of practical tasks, Posner’s model of attention nicely illustrates how magic tricks might work. Since
its inception in the early 1970s, Posner’s model has been revised and refined, but still retains its original tenor, namely that attention comprises a system of three disparate control networks. Experimental findings suggest that these attentional subsystems can modulate cognition, emotion, thought, and action. Furthermore, these networks can influence early stages of neural processing concerning both the location and time of sensory information as well as relate to meta-cognition and alter certain kinds of memory.
SUMMARY
Compelling evidence suggests that different attentional networks exist in the human brain. The exact nature of these networks and the degree to which they are independent is still unclear. Although other important models promote different views concerning the functions and mechanisms of attention, Posner’s three-network account is an influential model of attention that fits nicely with a magician’s intuition.
Individuals outside the magic fraternity often fail to appreciate that although performers may recast their tricks to gel with contemporary culture and employ modern technology, these variations are largely cosmetic and rely on age-old principles, mostly grounded in the psychology of attention and deception. The basic principles of conjuring comprise the subtleties of attentional misdirection, the understanding of human perception, including the understanding of visual and psychological illusion, and good showmanship. Practitioners of magic, like skilled therapists, often use their utterances and gesticulations to create false images and specious expectations in the minds of their spectators. Whereas clinicians may only infrequently resort to trickery, conjurors thrive and constantly seek novel ways to perfect the art of deception, for the audience’s entertainment pleasure as well as for their own fame and fortune. While medical practitioners may play by the rules, for magicians there are no rules—almost everything is allowed, including the unthinkable, in order to win the “Trick the Audience” game.
With more research tools becoming progressively available, understanding of attention is likely to yield innovations in education, the treatment of pathological conditions, rehabilitation, cognitive training, and . . . the magical arts. It will also provide insights into cultural and individual differences and further integrate the psychological and brain sciences.
While most research has been conducted with normal or pathological individuals in the context of typical, waking attention, carefully designed experimentation in the plane of atypical attention may further accelerate this process in the quest to elucidate human attention.
As researchers begin to pay attention to magic and how it teaches us about human behavior and brain function, practitioners of the world’s second oldest profession may benefit from scientific insights into attention to improve,
polish, and invent new powerful effects. Whether the best magician among the scientists or the best scientist among the magicians, I pay attention to magic and to the magic of attention. Poof!
GLOSSARY
Mental chronometry: Reaction time studies, such as those early experiments conducted by Donders in 1868, where researchers try to “time the mind” and attempt to describe the processes going on by fragmenting cognitive processing into separate stages.
Change blindness: While changes to a scene typically produce a detectable motion signal, when a change coincides with another event that disrupts the motion signal, observers are often blind to surprisingly large changes. Recent experiments show change blindness during events such as saccades, flashed blank screens, blinks, and real-world occlusions.
Emotion regulation: The reduction, increase, or sustaining of an emotional response (e.g., fear, anger or pleasure) based on the actions of the self or others.
Self-regulation: The ability to manipulate one’s own emotions, thoughts or actions upon direction from the self or another person. Emotion regulation can be a form of self-regulation but it could also be induced by actions of others.
Stroop effect: The Stroop conflict task requires proficient readers to name the ink color of a displayed word. Individuals are usually slower and less accurate indicating the ink color of an incompatible color word (e.g., responding “blue” when the word “RED” is inked in blue) than identifying the ink color of a congruent color name (e.g., “RED” inked in red). This difference in performance constitutes the Stroop conflict and is one of the most robust and well-studied phenomena in attention research.
Effortful control: The ability to inhibit, activate or sustain a response, which includes the capacity to inhibit a dominant response in order to perform a subdominant response. In temperament research, individual differences in effortful control are measured as a factor score that combines scales dealing with attention and the ability to regulate behavior on command.
Inhibitory control: The reduction in the probability, speed, or vigor of the normal response to a stimulus based on instruction from the self or others. It is often measured by scale scores on a questionnaire or by a task that requires one to withhold or delay a response.
Hypnosis: Attentive receptive concentration whereby certain individuals can change the way they experience themselves and the environment and often display heightened compliance with suggestion.
Posthypnotic suggestion: A condition during common wakefulness (after termination of the hypnotic experience) wherein, usually upon a prearranged cue, a subject readily complies with a suggestion made during the hypnotic episode.
Functional magnetic resonance imaging (fMRI): A noninvasive technique that permits imaging of the living brain and provides findings that relate neural to cognitive activity by measuring small changes in the magnetic properties of blood.
Event-related potentials (ERP): A noninvasive electrophysiological technique based on scalp electrode recordings of evoked-response potentials.
Top-down effect: Controlling, regulating, or overriding a stimulus-driven or other bottom-up process by such factors as attention or expectation.
Magnetoencephalography (MEG): A technique similar to event-related potentials, which detects the changing magnetic fields associated with brain activity.
Vigilance tasks: A set of tasks requiring sustained attention, typically requiring participants to monitor displays over extended periods of time for the occasional occurrence of critical events (signals). Signals are low-probability events requiring action, embedded in the context of recurrent nonsignal events which require no overt response.
Alerting tasks: A set of tasks requiring participants to prepare for the imminent appearance of a target at a known location. For example, a visual cue may alert the participant that a subsequent target will soon appear at a known location.
Positron emission tomography (PET): A technique using positron-emitting radioactive tracers that are attached to molecules that enter biological pathways of interest to study the relationship between energy consumption and neural activity.
Transcranial magnetic stimulation (TMS): A technique used to induce transient interruption of normal activity in a relatively restricted area of the brain by rapidly changing a strong magnetic field near the brain area of interest.
Mentalism: The simulation of psychic powers for the purpose of entertainment, usually without explicitly claiming to possess such powers. Whereas at least some members of the larger fraternity of magical arts draw a distinction between magic and mentalism, the lay public often views mentalists as anywhere from pseudo-psychics all the way to genuine exemplars of the paranormal. Unfortunately, performers of mentalism typically acquiesce when others attribute paranormal powers to them. Regretfully, only a few mentalists judiciously represent their performances for what they really are—entertaining tricks based in deception—and dutifully steer clear of claims of the paranormal.
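The glossary entry for the Stroop effect defines it as a difference in performance between incompatible and congruent color words. A minimal sketch of that difference for reaction times follows; the function name and the sample values are hypothetical and serve only to make the definition concrete.

```python
from statistics import mean

def stroop_interference(congruent_rts, incongruent_rts):
    """Return mean incongruent RT minus mean congruent RT, in ms.

    Both arguments are lists of ink-color naming times from correct
    trials. The inputs used below are invented for illustration.
    """
    if not congruent_rts or not incongruent_rts:
        raise ValueError("both conditions need at least one trial")
    return mean(incongruent_rts) - mean(congruent_rts)

# "RED" in red ink (congruent) versus "RED" in blue ink (incongruent).
interference = stroop_interference([600, 620], [700, 720])
print(interference)  # a larger value means more Stroop interference
```

On this reading, the reduced or removed interference reported after posthypnotic suggestion would appear as a score that shrinks toward zero. Accuracy differences, which the definition also mentions, would need a separate measure.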
REFERENCES
Hyman, R. (1989). The psychology of deception. Annual Review of Psychology, 40, 133–154.
Lamont, P., & Wiseman, R. (1999). Magic in theory. Hertfordshire: University of Hertfordshire Press.
Posner, M. I., & Rothbart, M. (2007). Research on attention networks as a model for the integration of psychological science. Annual Review of Psychology, 58, 1–23.
Raz, A., & Buhle, J. (2006). Typologies of attentional networks. Nature Reviews Neuroscience, 7, 367–379.
Schiffman, N. (1997). Abracadabra! Secret methods magicians and others use to deceive their audience. Amherst, NY: Prometheus Books.
Sorensen, J. (2007). A cognitive theory of magic. Lanham: AltaMira.
Chapter 18
Attentional Mechanisms
YALCHIN ABDULLAEV AND MICHAEL I. POSNER
Neuroscience contributions to the mechanisms of attention can be examined at the systems, cellular, synaptic, or genetic levels. These studies use the methods discussed in Part I, Foundations, of this Handbook. The integration of appropriate constraints from each of these levels is an important requirement for a full understanding of cognitive processes such as attention. This chapter examines the use of various methods and the attentional typology outlined in Chapter 17. We first summarize the modern history of attention studies and then examine studies at the systems, cellular, synaptic, and genetic levels. Our emphasis is on the executive attention system because it is central to human behavior and it plays a large role in the control or regulation of thoughts and feelings.
The field of attention is one of the oldest in psychology. At the turn of the twentieth century, Titchener (1909) called attention “the heart of the psychological enterprise.” Attention is relatively easy to define subjectively. The classical definition of William James, for example, was, “Everyone knows what attention is. It is the taking possession by the mind, in clear and vivid form, of one out of what seem several simultaneously possible objects or trains of thought” (1890, p. 403). However, this subjective definition does not provide hints that might lead to an understanding of mechanisms of attention that can illuminate its physical basis in terms of underlying physiological (I) process nor clarify its normal development (II) and pathologies. For these goals, it is useful to think about attention as an organ system with its own anatomy and physiology that develops in early life under the control of genes and experience. This is the focus of this chapter.
The modern history of attention as an organ starts with the important studies of Moruzzi and Magoun (1949) on the reticular activating system of the brain. About the same time, Hebb (1949) called attention to the importance of networks of neural areas (cell assemblies) linked in real time (phase sequences) in building conscious representation of stimulus input (see Posner & Rothbart, 2007b, for a review of Hebb’s contribution). Hebb proposed that cell assemblies link widespread systems of neurons from multiple brain areas. Neuroimaging has revealed brain networks involved in many cognitive and emotional tasks (Posner & Raichle, 1994). Some of these are indicated in Table 18.1 with references that discuss them in more detail. In all of these cases, a small number of widely dispersed brain areas are active. It has been argued that each node of these networks carries out its own operation (Posner & Raichle, 1994), but there is still much discussion of the role of each brain area. Hebb also proposed phase sequences that coordinated the activity of cell assemblies in real time. Currently, neuroscientists explore the coordination of remote brain areas by common oscillations (Knight, 2007). Thus, at both the cellular and the network levels, electrical recording within cells (intracellular recording) and from electrodes placed on the scalp (electroencephalography or EEG) have revealed the coordination of widespread neural areas in real time.
In this chapter, we first trace the modern history of studies of attention. We begin with studies arising shortly after World War II and consider each subsequent decade. We emphasize links between attention and underlying brain mechanisms, including studies of patients with brain lesions, recording of electrical activity noninvasively or by use of implanted intracerebral electrodes, and efforts to understand the genes related to attention. We then examine studies at the neurosystems level using methods of imaging and lesions to trace critical brain areas that are the sources of attentional networks. Next, we consider studies at the cellular or synaptic level. Evolutionary studies provide needed links between human and nonhuman primates (Chapter 3 by Snowdon & Cronin in this book). Recordings from depth electrodes can penetrate the microstructure involved in computations within neural areas. Finally, we consider the contribution from genetic studies that trace the role of genes and environment in shaping the development of attentional networks.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
TABLE 18.1 Networks Studied by Neuroimaging and Their References
Arithmetic: Dehaene, S. (1997). The number sense. Oxford, UK: Oxford University Press (Figure 8.5).
Autobiographical Memory: Fink, G. R., Markowitsch, H. J., Reinkemeier, H., Bruckbauer, T., Kessler, J., & Heiss, W. D. (1996). Cerebral representation of one’s own past: Neural networks involved in autobiographical memory. Journal of Neuroscience, 16, 4275–4282.
Faces: Haxby, J. V. (2004). Analysis of topographically organized patterns of response in fMRI data: Distributed representation of objects in the ventral temporal cortex. In N. Kanwisher and J. Duncan (Eds.), Functional neuroimaging of visual cognition: Attention and performance XX (pp. 83–97). Oxford, UK: Oxford University Press.
Fear: Ochsner, K. N., Ludlow, D. H., Knierim, K., Hanelin, J., Ramachandran, T., Glover, G. C., & Mackey, S. C. (2006). Neural correlates of individual differences in pain-related fear and anxiety. Pain, 129, 69–77.
Reading and Listening: Posner, M. I., & Raichle, M. E. (1997). Images of mind (2nd ed.). New York: Scientific American Library.
Reward: Knutson, B., Fong, G. W., Bennett, S. M., Adams, C. M., & Hommer, D. (2003). A region of mesial prefrontal cortex tracks monetarily rewarding outcomes: Characterization with rapid event-related fMRI. NeuroImage, 18, 263–272.
Self Reference: Johnson, S. C., Schmitz, T. W., Kawahara-Baccus, T. N., Rowley, H. A., Alexander, A. L., Lee, J. H., & Davidson, R. J. (2005). The cerebral response during subjective choice with and without self-reference. Journal of Cognitive Neuroscience, 17, 1897–1906.
INTRODUCTION
1950s
D. O. Hebb (1949) argued that each stimulus has two effects. One of these involved the reticular activating system and worked to keep the cortex tuned in the waking state, whereas the other used the great sensory pathways and provided information about the identity and location of the stimulating event. In the early 1950s, Colin Cherry (1953) initiated an epic series of experiments designed to examine how subjects selected stimuli that were presented simultaneously to each ear. Rapid presentation of pairs of digits, one to each ear, led people to recall all digits presented to the right ear first, followed by all presented to the left. Broadbent (1958) summarized these and other results by suggesting that a peripheral short-term memory system buffers sensory input prior to a filter, which selects a channel of entry (in this case an ear) and sends information to a limited capacity perceptual system. A second line of attention research, which emerged from studies conducted during World War II, involved the study of sustained attention during vigilance tasks (Mackworth & Mackworth, 1956). During continuous tasks, subjects tended to miss more signals as the task continued. Changes in the EEG suggested that there was an increase in a sleep-like state.
1960s
One of the significant developments of the 1960s involved the ability to average electrical signals from the scalp to create the event-related potential (ERP), a series of electrical events time locked to the stimulus. The technique was applied to the study of attention. Sutton, Braren, Zubin, and John (1965) reported that surprising or unexpected cognitive events of the type that might be closely inspected produced a strong positive wave in the scalp potential called the P300 (meaning a positive deflection of electrical activity at about 300 ms after stimulus onset). This component has played and continues to play an important role in attention research (Donchin & Cohen, 1967; Rugg & Coles, 1995). At about the same time, Gray Walter reported that the brain produced a marked DC shift during the period following a warning and prior to a target. This was called the contingent negative variation and was viewed as a sign that alerting was taking place (Walter, Cooper, Aldridge, McCallum, & Winter, 1964). Reaction time improved markedly over the first 500 ms following the warning. Often errors increased with warning interval, producing a tradeoff between speed and accuracy. This finding suggested that warning effects did not improve the accrual of information but instead made it faster to attend to the input and thus sped the response (Posner, 1978).
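The averaging that produces an ERP, described above for the 1960s work, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function names (`average_epochs`, `simulate_recording`), the arbitrary sample units, the noise level, and the triangular "evoked" deflection are all invented for the example and are not taken from any of the cited studies.

```python
import random


def average_epochs(signal, onsets, pre, post):
    """Average windows of `signal` time locked to each stimulus onset.

    Each window runs from `pre` samples before an onset to `post` samples
    after it; onsets whose window would run off the recording are skipped.
    """
    epochs = [
        signal[t - pre:t + post]
        for t in onsets
        if t - pre >= 0 and t + post <= len(signal)
    ]
    if not epochs:
        raise ValueError("no complete epochs to average")
    n = len(epochs)
    # Point-by-point mean across trials: activity that is not time locked
    # to the stimulus tends to cancel, while the time-locked response remains.
    return [sum(epoch[i] for epoch in epochs) / n for i in range(pre + post)]


def simulate_recording(n_trials=200, gap=150, seed=0):
    """Hypothetical recording: Gaussian noise plus a small bump after each onset."""
    rng = random.Random(seed)
    onsets = [50 + gap * k for k in range(n_trials)]
    signal = [rng.gauss(0.0, 5.0) for _ in range(50 + gap * n_trials)]
    for t in onsets:
        for i in range(20, 41):
            # Triangular positive deflection peaking 30 samples after onset.
            signal[t + i] += 4.0 * (1 - abs(i - 30) / 10)
    return signal, onsets


signal, onsets = simulate_recording()
erp = average_epochs(signal, onsets, pre=20, post=80)
peak_index = max(range(len(erp)), key=erp.__getitem__)
print("peak at", peak_index - 20, "samples after stimulus onset")
```

In this made-up recording the deflection is smaller than the noise on any single trial, which is why it only becomes visible after averaging across many time-locked trials.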
1970s
The work of Hubel and Wiesel (1968) using microelectrodes to probe the structure of the visual system began in the early 1960s. However, before this method could be applied to attention, it was necessary to adapt the microelectrode technique to alert animals. This was accomplished in the early 1970s by Evarts (1968) and applied by Mountcastle (1978) and Wurtz, Goldberg, and Robinson (1980) to examine mechanisms of visual attention in the superior colliculus and parietal lobe. Their findings suggested the importance of both of these areas to a shift of visual attention. It had been known for many years that patients with lesions of the right parietal lobe could suffer from a profound neglect of space opposite the lesion. The findings of attention-related cells in the posterior parietal lobe of alert monkeys suggested that these cells might be responsible for the clinical syndrome. In the 1970s and 1980s, recording neuronal activity during cognitive tasks was accomplished in human neurosurgery patients with diagnostic and/or therapeutic depth intracerebral electrodes (Bechtereva, 1978; Bechtereva & Abdullaev, 2000; Bechtereva, Medvedev, Abdullaev, Melnichuk, & Gurchin, 1989; Engel, Moll, Fried, & Ojemann, 2005; Ojemann, Creutzfeldt, Lettich, & Haglund, 1988). This opened new ways of studying the regional cellular mechanisms of attention and other cognitive functions. An impressive result from the microelectrode work was that the time course of parietal cell activity seemed to follow a visual stimulus by 80 to 100 ms. Beginning in the 1970s, Hillyard (Hink, Van Voorhis, Hillyard, & Smith, 1977) and other investigators explored the use of scalp electrodes to examine time differences in the brain activity between attended and unattended visual locations. They found that the N1 and P2 components of the visual ERP showed changes due to attention starting at about 100 ms after input.
These findings showed likely convergence of the latency of psychological processes as measured by ERPs in human subjects and cellular processes measured in alert monkeys. This important development in mental chronometry suggested that scalp recordings could accurately reflect the underlying timing aspects of brain activity.
1980s
Posner (1980) studied the use of a cue in an otherwise empty visual field as a way of moving attention to a target. The subject’s task was to press a button as soon as a visual stimulus appeared in either the left or right visual field; a brief left or right cue preceded the target stimulus, directing attention to the cued location. Electrodes near the eyes were used to ensure there were no eye movements. Because only one response was required, there was
no way to prepare the response differently depending on the cue, making it clear that whatever changes were induced by the cue were covert and not due to motor adjustment of the eyes or hand. It was found that covert shifts could enhance the speed of responding to the target even in a nearly empty field. Within half a second, one could shift attention to a visual event and, when it indicated a likely target at another location, move attention to enhance processing at the new location. Shulman, Remington, and McClean (1979) showed that response times to probes at intermediate locations were enhanced at intermediate times as though attention actually moved through the space. It was possible to prepare to move the eyes to one location while moving attention covertly in the opposite direction (Posner, 1980). Whether attention moves through the intermediate space and how free covert attention is from the eye movement systems are still disputed matters, suggesting the limitation of purely behavioral studies (LaBerge, 1995; Rizzolatti, Riggio, Dascola, & Umilta, 1987). At the time, it was also hard to understand how a movement of attention could possibly be executed by neurons. Subsequently, it was shown that the population vector of a set of neurons in the motor system of a monkey could carry out what would appear behaviorally, as a mental rotation (Georgopulos, Lurito, Petrides, Schwartz, & Massey, 1989). After that discovery, a covert shift of attention mediated by a population of neurons did not seem too far-fetched. It had been reported that patients with lesions of the parietal lobe could make same-different judgments concerning objects that they were unable to report consciously (Volpe, LeDoux, & Gazzaniga, 1979). It was possible to follow this result in more analytic cognitive studies. 
What did a right parietal lesion do that made access to material on the left side difficult or impossible for consciousness and yet still left the information available for other judgments? (See also Volume II, Chapter 66 by Taub & Uswatte). This puzzle was partially answered by the systematic study of patients with different lesion locations in the parietal lobe, the pulvinar, and the colliculus. These lesions all tended to show neglect of the side of space opposite the lesion. But in a detailed cognitive analysis, it was clear that they differed in showing deficits in specific mental operations involved in shifting attention (Posner, 1988). These studies supported a limited form of brain localization. The hypothesis was that different brain areas executed individual mental operations or computations such as disengaging from the current focus of attention (parietal lobe), moving or changing the index of attention (colliculus), and engaging the subsequent target (pulvinar). If this hypothesis were correct, it might explain why Lashley (1931) thought the whole brain was involved in mental tasks. While the
neuroimaging studies in the next decade raised serious questions about the details of the localizations of component operations, they tended to support the importance of widespread networks with nodes that carried out computations like these (Posner & Rothbart, 2007b).
1990s to Date
In the late 1980s, the Washington University School of Medicine was developing a positron emission tomography (PET) center led by Marc Raichle. Studies at this center helped establish neuroimaging as a means of exploring brain activity during cognitive functions in general and the study of attention in particular (Posner & Raichle, 1994, 1998). In general, these studies have shown that most cognitive tasks, including those that are designed to separate mechanisms of attention, have activated a small number of widely scattered neural areas. Some people have argued that these areas are specific for domains of function like language, face perception, or episodic memory (Kanwisher & Duncan, 2004). In the area of attention, the mental operations or computations carried out by a particular area are more frequently considered (Corbetta & Shulman, 2002). These two ideas are not mutually exclusive. It is certainly possible to talk about the set of areas that are involved in language and at the same time maintain that the areas carry out different computations within that domain. The finding from neuroimaging that cognitive tasks involve a number of different anatomical areas has led to an emphasis on tracing the time dynamics of these areas during tasks involving attention. Because shifts of attention can be so rapid, it is difficult to follow them with hemodynamic imaging. To fill this role, algorithms have been developed (Scherg & Berg, 1993) to relate the scalp distribution of brain electrical activity recorded from high density electrical or magnetic sensors on or near the skull to brain areas active during hemodynamic imaging (see Dale et al., 2000, for a review).
In some areas of attention, there has been extensive validation of these algorithms (Heinze et al., 1994), and they provide precise data on the sequence of activations during the selection of visual stimuli (see Hillyard, Di Russo, & Martinez, 2004, for a review). The combination of spatial localization from hemodynamic imaging and temporal precision from electrical or magnetic skull recordings has provided an approach to the networks underlying attention. To relate imaging to underlying neural systems, an important approach is to record extracellularly from implanted electrodes in humans or primates. Work in humans was of great importance because neuronal activity usually provides excellent spatial resolution with precise localization of the recorded activity (in the immediate proximity of the tip
of the recording electrode), and fairly good temporal resolution, usually within a tenth to a hundredth of a second when changes of firing rate are recorded over time. It also provides valuable neurophysiological information about excitation or inhibition of neuronal activity. Disadvantages include its invasiveness, which requires the use of patients who have some chronic brain disorder (epilepsy, Parkinsonism, and others), and the fact that it provides information from only the few recording sites in the brain that are used to guide therapy. This means that what is happening in many other brain regions is unknown. Subjects for these studies are neurosurgical patients with medically intractable forms of Parkinsonism, epilepsy, some other disorders involving brain tumors, or trauma, who have stereotactically implanted intracerebral electrodes for diagnostic and treatment purposes. Some institutions use chronically implanted electrodes that stay in the brain from a few weeks to a few months and allow neuronal activity to be studied after a recovery period in a laboratory setting close to that of normal cognitive studies (Bechtereva, 1978; Bechtereva & Abdullaev, 2000; Bechtereva et al., 1989; Abdullaev, Bechtereva, & Melnichuk, 1998). Others record neuronal activity in the operating room during open brain surgery (Ojemann et al., 1988). Changes of the cellular firing rate measured during performance of cognitive tasks provide evidence of the participation of the recorded cells in attention or another measured cognitive function, of its time course, and of whether the accompanying change in neuronal activity is excitatory or inhibitory. For example, cells recorded from the head of the caudate nucleus respond to lists of visual words with an increasing firing rate at about 200 to 300 ms after stimulus onset when the task is a high-level semantic one (deciding whether each noun is an abstract or concrete word; Figure 18.1, upper graph).
The same cells respond to the same words with more sustained inhibition when the task is to read the words aloud (Figure 18.1, middle graph), or when making an old/new discrimination of words memorized the day before from new words (Figure 18.1, lower graph).
SYSTEMS LEVEL
The methods of the late twentieth century provide an improved prospect of integrating psychological science around the ideas introduced by Hebb (1949; see Posner & Rothbart, 2007b, for an extension of this argument). Cell assemblies and phase sequences are names for aspects of neural networks. Now, thanks to work on the computational properties of neural networks (e.g., Rumelhart & McClelland, 1986), we are in a much better position to develop detailed theories integrating information from physiological, cognitive, and behavioral studies.
[Figure 18.1 graphic: (A) Semantic task, (B) Reading aloud task, (C) Memory retrieval task]
Figure 18.1 Anatomical and temporal precision of recording from depth electrodes. Note: The left side of Figure 18.1 shows the anatomical location of four recording sites (a, b, c, and d) in the head of the caudate nucleus (CN) in Parkinsonism patients. The right side of the figure shows recording of neuronal activity from indwelling electrodes in these human patients. The graphs demonstrate responses of neurons recorded from site a in CN during semantic categorization task (upper graph), control task of reading the same words (middle graph), and recognition memory task with similar words discriminating words memorized the day before from new words (lower graph). First vertical line marks onset of word presentation for 200 ms. The second vertical line marks the onset of response cue, allowing subject to say aloud yes or no (in semantic and memory tasks) or the presented word (in reading task). Vertical axis represents a deviation of discharge rate from the mean prestimulus level. Statistical significance is shown under each latency bin: each bin is labeled either by a dot or a bar; long, medium, and short bars correspond to P < .001, P < .01, and P < .05. From “Activity of Human Caudate Nucleus and Prefrontal Cortex in Cognitive Tasks,” by Y. Abdullaev, N. P. Bechtereva, and K. V. Melnichuk, 1998, Behavioural Brain Research, 97, pp. 159–177.
One of these developments, neuroimaging, allows us to examine neuronal activity in terms of localized changes in blood flow or metabolism by PET or changes in blood oxygenation by functional magnetic resonance imaging (fMRI) (Toga & Mazziotta, 1996). By using tracers that bind to different transmitters, PET can also be used to examine neurotransmitter receptor density in the brain (Fischman & Badgaiyan, 2006). By measuring electrical (EEG) and magnetic (MEG) signals outside the skull, the time course of activation of different brain areas localized by fMRI can be measured (Dale et al., 2000). Use of diffusion tensor imaging, a form of MRI that traces white matter tracts, can also image pathways of activation. In addition to the study of naturally occurring lesions, interrupting information flow by transcranial magnetic stimulation (TMS) can produce temporary functional lesions of pathways (see Posner, Sheese, Odludas, & Tang, 2006; Toga & Mazziotta, 1996, for a review of these and other methods). These methods provide a tool kit that can be used either alone or together to make human brain networks accessible for detailed physiological study.
Sites and Sources
Attention can influence processing in most areas of the brain. These areas are sites at which attention has an effect. However, the sources of these influences are much fewer. The sources are the brain networks from which attention
[Figure 18.2 graphic. Labels: Superior parietal lobe, Posterior area, Frontal eye field, Anterior cingulate gyrus, Temporoparietal junction, Frontal area, Thalamus, Pulvinar, Prefrontal cortex, Superior colliculus. Legend: Alerting, Orienting, Executive attention.]
Figure 18.2 Brain areas involved in various attention networks. Note: The executive network emphasized in this chapter is shown by large triangles and involves the anterior cingulate (frontal midline) and lateral prefrontal brain areas. The alerting network shown in squares involves a thalamic origin in the locus coeruleus and nodes in frontal and parietal areas. Phasic alertness involves mostly the right cerebral hemisphere while tonic alertness involves the left as well. Orienting shown in squares involves superior and inferior parietal areas as well as the frontal eye fields. Subcortical areas also involved in orienting include the superior colliculus and thalamus. Data from Fan, McCandliss, Fossella, Flombaum, and Posner (2005).
effects originate (see Raz, Chapter 17). The sources of three common networks underlying attention are shown in Figure 18.2. To distinguish the brain areas that are involved in orienting from the sites at which they operate, it is useful to separate the presentation of a cue indicating where a target will occur from the presentation of the target requiring a response (Corbetta & Shulman, 2002; Posner, 1980). This methodology has been used for behavioral studies with normal people (Posner, 1980), patients (Posner, 1988), and monkeys (Marrocco & Davidson, 1998), and in studies using scalp electrical recording (Hillyard et al., 2004) and event-related neuroimaging (Corbetta & Shulman, 2002). Studies using event-related fMRI have shown that following the presentation of the cue and before the target is presented, a network of brain areas becomes active (Corbetta & Shulman, 2002; Hillyard et al., 2004; Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999). These include the superior parietal lobe, temporal parietal junction, and frontal eye fields. There is widespread agreement about the identity of these areas (see orienting areas in Figure 18.2), but there remains a considerable amount of work to do in order to understand the function of each area. When a target is presented at the cued location, it is processed more efficiently than if no cue had been presented. The brain sites influenced by orienting are those normally
used to process the target. For example, in the visual system, orienting can influence sites of processing in the primary visual cortex or in a variety of extrastriate visual areas where the computations related to the target are performed. Orienting to target motion influences area MT (V5) while orienting to target color will influence area V4 (Corbetta, Miezin, Dobmeyer, Shulman, & Petersen, 1991). This principle of activation of brain areas also extends to higher-level visual input. For example, attention to faces modifies activity in the face-sensitive area of the fusiform gyrus (Wojciulik, Kanwisher, & Driver, 1998). The finding that attention can modify activity in primary visual areas has been particularly important because the microcircuitry of this brain area has been more extensively studied than others (Posner & Gilbert, 1999). When multiple targets are presented, they tend to suppress the normal level of activity that would have been produced if the targets were presented in isolation (Kastner et al., 1999). This finding has become the cornerstone of one of the most popular views of attention in which emphasis is placed on competition between potential targets within each relevant brain area (Desimone & Duncan, 1995). This view places less stress on top-down control or at least emphasizes that top-down control emerges from bottom-up competition. The biased competition theory is partly based on the work in visual search that has been important in cognitive studies (Treisman & Gelade, 1980). The cognitive studies stress the function of a top-down search of the visual field. The neuroscience data suggest that the objects in the search array exert a direct inhibitory effect on one another, which can be counteracted by orienting of attention. Although it is possible to attend to multiple visual locations when only a single attribute is important (e.g., green), it is not possible to report on multiple attributes from more than one target (Huang & Pashler, 2007).
The neuroscience and cognitive approaches to visual search are being combined, and this is an important vehicle for further integration.
Localization
Some have thought that the influence of imaging has been merely to tell us where in the brain things happened (Uttal, 2001). Certainly many, perhaps even most, imaging studies have been concerned with anatomical issues. As Figure 18.2 illustrates, several functions of attention have been shown to involve specific anatomical areas that carry out important functions. However, imaging also probes other neural networks that underlie all aspects of human thought, feelings, and behavior (Posner & Rothbart, 2007b). Networks have been studied in all the topics shown in Table 18.1. The full significance of imaging for (a) viewing brain networks,
TABLE 18.2 Sites of Attentional Effects for Each Network and the Dominant Neuromodulator
Network | Structures | Modulator
Orient | Superior parietal; Temporal parietal junction; Frontal eye fields; Superior colliculus | Acetylcholine
Alert | Locus coeruleus; Right frontal; Parietal cortex | Norepinephrine
Executive attention | Anterior cingulate; Lateral ventral prefrontal; Basal ganglia | Dopamine
(b) examining their computation in real time, (c) exploring how they are assembled in development, and (d) studying their plasticity following physical damage or training, are common themes in current research that are just beginning to reach their potential. Functional neuroimaging has allowed many cognitive tasks to be analyzed in terms of the brain areas they activate, and studies of attention have been among the most often examined in this way (Corbetta & Shulman, 2002; Driver, Eimer, & Macaluso, 2004; Posner & Fan, 2008). Imaging data have supported the presence of three networks related to different aspects of attention (Fan et al., 2005). These networks carry out the functions of alerting, orienting, and executive control (Posner & Fan, 2008). Figure 18.2 and Table 18.2 provide a summary of aspects of these networks. Figure 18.2 shows where key nodes of each network are located. Table 18.2 names these areas and provides information on the dominant neurotransmitter involved in each network. Next we discuss each of these networks briefly (see also Chapter 17 by Raz in this book). Alerting is defined as achieving and maintaining a state of high sensitivity to incoming stimuli; orienting is the selection of information from sensory input; and executive attention involves mechanisms for monitoring and resolving conflict among thoughts, feelings, and responses. The alerting system has been associated with thalamic as well as frontal and parietal regions of the cortex (Fan et al., 2005). A particularly effective way to vary alertness has been to use warning signals prior to targets. The influence of warning signals on the level of alertness is thought to be due to modulation of neural activity by the neurotransmitter norepinephrine (Marrocco & Davidson, 1998). Orienting involves aligning attention with a source of sensory signals. This may be overt, as when eye movements accompany movements of attention, or may occur covertly, without any eye movement.
The orienting system for visual
events has been associated with posterior brain areas, including the superior parietal lobe and temporal parietal junction, and in addition, the frontal eye fields (Corbetta & Shulman, 2002). Orienting can be manipulated by presenting a cue indicating where in space a target is likely to occur, thereby directing attention to the cued location (Posner, 1980). Event-related functional magnetic resonance imaging (fMRI) studies have suggested that the superior parietal lobe is associated with orienting following the presentation of a cue (Corbetta & Shulman, 2002). The superior parietal lobe in humans is closely related to the lateral intraparietal area (LIP) in monkeys, which is involved in the production of eye movements (Andersen, 1989). When a target occurs at an uncued location and attention has to be disengaged and moved to a new location, there is activity in the temporal parietal junction (Corbetta & Shulman, 2002). Lesions of the parietal lobe and superior temporal lobe have been consistently related to difficulties in orienting (Karnath, Ferber, & Himmelbach, 2001). Executive control of attention is often studied by tasks that involve conflict, such as various versions of the Stroop task. In the Stroop task, subjects must respond to the color of ink (e.g., red) while ignoring the color word name (e.g., blue; Bush, Luu, & Posner, 2000). Resolving conflict in the Stroop task activates midline frontal areas (anterior cingulate) and lateral prefrontal cortex (Botvinick, Braver, Barch, Carter, & Cohen, 2001; Fan, Flombaum, McCandliss, Thomas, & Posner, 2003). There is also evidence for the activation of this network in tasks involving conflict between a central target and surrounding flankers that may be congruent or incongruent with the target (Botvinick et al., 2001; Fan, Fossella, Summer, Wu, & Posner, 2003). 
Experimental tasks may also provide a means of fractionating the contributions of different areas within the executive attention network (MacDonald, Cohen, Stenger, & Carter, 2000). The role of the anterior cingulate cortex (ACC) in modulating sensory input has been demonstrated experimentally by showing enhanced connectivity between ACC and the sensory modality to which the person is asked to attend (Crottaz-Herbette & Menon, 2006). A similar finding showed that the more ventral part of the ACC involved in emotion regulation is coupled to the amygdala during processing of negative information (Etkin, Egner, Peraza, Kandel, & Hirsch, 2006). These findings support the general idea that ACC activity regulates other brain areas, and they support the distinction between the more dorsal ACC (the flat portion of the cingulate as shown in Figure 18.2), related to cognitive control, and the more ventral ACC (where the ACC bends in Figure 18.2), related to emotional control (Bush et al., 2000). There is also evidence that lateral prefrontal areas may be involved in the type of regulation involved in inhibiting responses. In many tasks that require the inhibition of responses, right
frontal activity has been shown to be differentially active on inhibitory (no-go) trials.
CELLULAR AND SYNAPTIC LEVEL
Evolutionary Perspectives
Studies have revealed a similarity between humans and other primates in alerting and orienting (Corbetta & Shulman, 2002; Marrocco & Davidson, 1998) both in behavior and in the brain areas involved (II-5). However, comparative anatomical studies point to important differences in the evolution of anterior cingulate connectivity between nonhuman primates and people. Anatomical studies show the great expansion of white matter, which has increased more in recent evolution than has the neocortex itself (Zilles, 2005). One type of projection cell, called the von Economo neuron, is found only in the anterior cingulate and a related area of the anterior insula (Allman, Watson, Tetreault, & Hakeem, 2005). It is thought that this neuron is important in communication between the cingulate and other brain areas. This neuron is not present at all in macaques and expands greatly in frequency between great apes and humans. The two brain areas in which von Economo neurons are found (cingulate and anterior insula) are also shown to be in close communication even during the resting state when no task is imposed (Dosenbach et al., 2007). Moreover, there is some evidence that the frequency of this type of neuron also increases in development between infancy and later childhood (Allman et al., 2005). These neurons and the rapid and efficient connectivity they provide may be a major reason why self-regulation in adult humans can be so much stronger than in other organisms, and the development of this system may relate to the achievements in self-regulation that occur in childhood (Posner & Rothbart, 2007a). In addition to functional connectivity, advances in MRI methods now allow noninvasive study of anatomical connections between brain regions in living human subjects that before could only be studied in cadavers.
One MRI approach, called diffusion tensor imaging (DTI), allows tracing of white matter tracts that connect neural areas (Conturo et al., 1999; Dougherty, Ben-Shachar, Bammer, Brewer, & Wandell, 2005; Jones, Horsfield, & Simmons, 1999). This form of imaging uses the diffusion of water molecules in particular directions due to the presence of myelinated fibers. This approach has already started providing important insights into differences in anatomical connectivity underlying reading difficulties and other high-level cognitive disorders. As methods develop to record from many neurons, it has been possible to examine connectivity at the cellular
level (Buschman & Miller, 2007; Saalmann, Pigarev, & Vidyasagar, 2007). These studies have shown that the cells of remote areas of networks are synchronized during attention-demanding tasks (Womelsdorf et al., 2007). They show how networks revealed by MRI can also be studied at the cellular level. Saalmann et al. (2007) showed that parietal activation preceded increased activity in visually selective areas. This finding fits well with fMRI findings suggesting that parietal areas can be part of the attention network that influences activity in visual areas (Corbetta & Shulman, 2002). How can the local actions at the various nodes of a network be coordinated in real time? Recent results from surface EEG recordings (Fan et al., 2007) and depth electrodes (Womelsdorf et al., 2007) support the idea that oscillatory neuronal electrical activity within defined frequency ranges supports synchronized interactions between anatomically distinct regions during tasks (Lakatos, Karmos, Mehta, Ulbert, & Schroeder, in press). Fan et al. (2007) argue that different attentional networks may use different dominant frequencies. However, many different ranges of frequencies have been reported and there is not as yet a clear framework for predicting what frequencies will be involved. It is likely that precise coordination of neural areas by synchronized activity will be an important topic of concern during the next decade. Knight (2007) writes of the development of cellular studies of network coordination as a refutation of phrenology. This appears to be true, but the findings also support the network ideas that were articulated by Hebb and by imaging. According to this new view, there is localization of function within nodes of the network, but the network must work together to perform actual tasks.
This view supports efforts by neuroscientists to study the microcircuitry in neural areas related to attention in order to determine how their activity produces local calculation and supports interaction among nodes.
GENETIC LEVEL
At the turn of the century, the overall sequencing of the human genome had been accomplished (Venter et al., 2001). Although humans have a common genome, there are differences among individuals in many genes (polymorphisms). These differences make it possible to examine genes related to individual differences in behavior and in brain activity (Goldberg & Weinberger, 2004; Mattay & Goldberg, 2004). How is it that networks are assembled during the early life of the individual? Developmental psychologists have long been interested in the problem of how children come to
be able to regulate their own emotions and behavior. However, this work is often divorced from the study of brain mechanisms (Posner & Rothbart, 2007a). In this section, we address this question by reviewing how genes and experience contribute to the development of the executive attention system. This research has involved genotyping individuals and asking how differences among their genes relate to differences in their executive attention. Individual differences are invariably found in cognitive tasks involving attention. The Attention Network Test (ANT) was developed to examine individual differences in the efficiency of the brain networks of alerting, orienting, and executive attention discussed earlier (Fan, McCandliss, Sommer, Raz, & Posner, 2002; Rueda et al., 2004). The ANT uses differences in reaction time (RT) between conditions to measure the efficiency of each network. Each trial begins with a cue (or a blank interval, in the no-cue condition) that informs the participant either that a target will be occurring soon, or where it will occur, or both. The target always occurs either above or below fixation and consists of a central arrow, surrounded by flanking arrows that can either point in the same direction (congruent) or in the opposite direction (incongruent). Subtracting RTs for congruent from incongruent target trials provides a measure of conflict resolution and assesses the efficiency of the executive attention network. Subtracting RTs obtained in the double-cue condition from RT in the no-cue condition gives a measure of alerting due to the presence of a warning signal. Subtracting RTs to targets at the cued location (spatial cue condition) from trials using a central cue gives a measure of orienting because the spatial cue, but not the central cue, provides valid information on where a target will occur. Individual differences in these networks were shown to be reliable (Fan et al., 2002). Fossella et al. 
(2002) found that the individual differences between the networks were not correlated across individuals. Although this may not be generally true because the networks are frequently used together, it does provide some support for a degree of independence among individuals in the efficiency of the networks. The ability to measure differences in attention raises the question of the degree to which attention is heritable. To explore this issue, Fan, Wu, Fossella, and Posner (2001) used the ANT to assess attention in monozygotic and dizygotic same-sex twins. Strong heritability of the executive network (.89) was found, some heritability of the alerting network (.18), and no apparent heritability of the orienting network. These results supported a search for genes in executive attention. The association of the executive network with the neuromodulator dopamine (see Table 18.2) was used as a way of searching for candidate genes that might relate to the efficiency of the networks (Fossella et al., 2002). To do this,
200 persons performed the ANT and were genotyped to examine frequent polymorphisms in genes related to dopamine. Significant associations of two genes, the dopamine D4 receptor (DRD4) gene and the monoamine oxidase A (MAOA) gene, were found with executive attention. Persons with different alleles of these two genes were compared using neuroimaging while they performed the ANT (Fan, Fossella et al., 2003). Groups with different alleles of these genes showed differences in the ability to resolve conflict as measured by the ANT and also produced significantly different activations in the anterior cingulate, a major node of the executive attention network. These results confirmed the relation between genetic alleles and neural networks related to executive attention. Recent studies have extended these observations. In two different studies employing conflict-related tasks other than the ANT, alleles of the catechol-O-methyltransferase (COMT) gene were related to the ability to resolve conflict (Blasi et al., 2005; Diamond, Briand, Fossella, & Gehlbach, 2004). A study using the child ANT showed a significant relation between the DAT1 gene and executive attention as measured by the ANT (Rueda, Rothbart, McCandliss, Saccamanno, & Posner, 2005). Different alleles of a cholinergic gene, the alpha 4 subunit of the neural nicotinic cholinergic receptor (CHRNA4), were related to performance differences in the ability to orient attention during tasks involving visual search (Parasuraman, Greenwood, Kumar, & Fossella, 2005), confirming the link between orienting and the neuromodulator acetylcholine. There is also increasing evidence that the serotonin system plays a role in executive attention along with the dopamine system (Canli et al., 2005; Reuter, Ott, Vaitl, & Hennig, 2007). The relation of genetic factors to the functioning of the executive attention system does not mean that the system cannot be influenced by experience. 
Several training-oriented programs have been successful in improving attention in patients suffering from different pathologies. For example, the use of Attention Process Training (APT) has led to specific improvements in executive attention in patients with specific brain injury (Sohlberg, McLaughlin, Pavese, Heidrich, & Posner, 2000) as well as in children with Attention Deficit Hyperactivity Disorder (ADHD; Kerns, Esso, & Thompson, 1999). With normal adults, training with video games produced better performance on a range of visual attention tasks (Green & Bavelier, 2003). Genetic variation allows for additional influence from parenting and other experiences (Sheese, Voelker, Rothbart, & Posner, 2007). It was found that the 7 repeat allele of the DRD4 gene interacted with the quality of parenting to influence such temperamental variables in the child as activity level, sensation seeking, and impulsivity. Other research has shown similar findings for externalizing behavior of the
child, as rated by the parents in the Child Behavior Checklist (Bakermans-Kranenburg & van Ijzendoorn, 2006). There is evidence that the 7 repeat allele of the DRD4 gene is under positive selective pressure (Ding et al., 2002). The Ding et al. study used molecular genetics to show the complexity of deriving the 7 repeat allele from the dominant 4 repeat. In addition to mutation, some positive selective pressure would be needed to account for the frequency of the 7 repeat. The 7 repeat allele has been associated with risk taking and also Attention Deficit Disorder (Posner, Rothbart, & Sheese, 2007). Risk taking could well increase the possibility of reproductive success. The interaction between parenting and the 7 repeat might mean that a genetic allele increases the possibility that children will be influenced by their culture, for example, through parenting style. This idea could be important for understanding why the frequency of genetic alleles changed during human evolution. Human genetic evolution and cultural evolution may be interrelated if certain genetic variations make cultural influence more successful. Genes do not directly produce attention. They code for different proteins that influence the efficiency with which modulators such as dopamine are produced and/or bind to their receptors. These modulators are in turn related to individual differences in the efficiency of the attention networks. There is a great deal in common among humans in the anatomy of high-level networks. This must have a basis within the human genome. The same genes that are related to individual differences in attention are also likely to be important in the development of the attentional networks that are common to all humans. Some of these networks are also common to nonhumans. By examining these networks in animals, it should be possible to better understand the role of genes in shaping networks. 
Can animals perform the same tasks we have developed for humans? The answer is clearly yes (Chapter 3 by Snowdon & Cronin in this book). Monkeys have been trained to shift attention to cues and to carry out conflict tasks like those in the ANT. Rodents have also been trained in attention shifting tasks (Beane & Marrocco, 2004). These tasks make it possible to examine the role that genes play in carrying out the same attentional operations as have been studied in humans. It has also been reported that areas of the frontal midline corresponding to the anterior cingulate are activated in the mouse during trace but not delayed conditioning (Han, O’Tuathaigh, & Koch, 2004). Since trace and delayed conditioning are both very simple tasks and the two are quite similar, they could be used to measure operation of rodent brain areas that may be related to executive attention in humans. We need to develop methods of manipulating relevant genes in specific anatomical locations that are important
nodes of a particular network. Usually genes are expressed at multiple locations, so that changes (e.g., in knockout studies) are not specific to one brain area. Subtractive genomics is a method currently being developed to manipulate genes at specific anatomical locations (Dumas et al., 2005). This method is now being employed to manipulate the DRD4 gene within the midfrontal cortex of the mouse. It should become possible to determine the specific operations performed by genes at particular locations in attentional networks. In the future, this kind of genetic analysis of network development will create a productive link between genes and the development of the networks involved in self-regulation (Posner & Rothbart, 2007b).
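As a concrete illustration of the ANT subtractions described earlier in this section, the following minimal Python sketch computes the three network scores from per-condition mean reaction times. The condition labels, the function name `ant_network_scores`, and the reaction times are assumptions invented for illustration; they are not data or code from Fan et al. (2002) or any other study cited here.

```python
# Hypothetical mean reaction times (ms) on correct trials for one participant.
# The values are invented for illustration only.
mean_rt = {
    "no_cue": 560.0,
    "double_cue": 520.0,
    "center_cue": 530.0,
    "spatial_cue": 490.0,
    "congruent": 500.0,
    "incongruent": 590.0,
}


def ant_network_scores(rt):
    """Return the three ANT difference scores (ms) from condition means.

    alerting  = no-cue RT      - double-cue RT
    orienting = center-cue RT  - spatial-cue RT
    executive = incongruent RT - congruent RT
    """
    return {
        "alerting": rt["no_cue"] - rt["double_cue"],
        "orienting": rt["center_cue"] - rt["spatial_cue"],
        "executive": rt["incongruent"] - rt["congruent"],
    }


scores = ant_network_scores(mean_rt)
for network, score in scores.items():
    print(f"{network}: {score:.1f} ms")
```

In this sketch a larger alerting or orienting score means a larger benefit from the cue, whereas a larger executive score means a larger cost of conflict, that is, less efficient conflict resolution.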
SUMMARY
Neuroscience and psychology have been converging on the idea that cognitive tasks, including attention, are carried out by networks of neural areas extending over much of the brain but involving quite localized computation of the particular subroutines or mental operations involved in the task (Knight, 2007; Posner & Rothbart, 2007b). Consistent with this view, the functions of alerting, orienting, and executive control have been shown to involve separate but partially overlapping networks. These findings make it important to have strategies for examining such questions as: (a) how are computations performed within each node of a network, (b) how do the nodes of a network communicate the results of their computations in real time, and (c) how do genes and experience combine to shape the efficiency of networks. Progress on each of these fronts is summarized in this chapter and some of the remaining questions are discussed.
REFERENCES
Abdullaev, Y., Bechtereva, N. P., & Melnichuk, K. V. (1998). Activity of human caudate nucleus and prefrontal cortex in cognitive tasks. Behavioural Brain Research, 97, 159–177. Allman, J. M., Watson, K. K., Tetreault, N. A., & Hakeem, A. Y. (2005). Intuition and autism: A possible role for Von Economo neurons. Trends in Cognitive Sciences, 9(8), 367–373. Andersen, R. A. (1989). Visual eye movement functions of the posterior parietal cortex. Annual Review of Neuroscience, 12, 377–403. Bakermans-Kranenburg, M. J., & van Ijzendoorn, M. H. (2006). Gene-environment interaction of the dopamine D4 receptor (DRD4) and observed maternal insensitivity predicting externalizing behavior in preschoolers. Developmental Psychobiology, 48, 406–409. Beane, M., & Marrocco, R. (2004). Cholinergic and noradrenergic inputs to the posterior parietal cortex modulate the components of exogenous attention. In M. I. Posner (Ed.), Cognitive neuroscience of attention (pp. 313–325). New York: Guilford Press.
Bechtereva, N. P. (1978). The neurophysiological aspects of human mental activity. New York: Oxford University Press. Bechtereva, N. P., & Abdullaev, Y. (2000). Depth electrodes in clinical neurophysiology: Neuronal activity and human cognitive function. International Journal of Psychophysiology, 37, 11–29. Bechtereva, N. P., Medvedev, S. V., Abdullaev, Y., Melnichuk, K. V., & Gurchin, F. A. (1989). Psychophysiological micromapping of the human brain. International Journal of Psychophysiology, 8, 107–135. Blasi, G., Mattay, V. S., Bertolino, A., Elvevåg, B., Callicott, J. H., Das, S., et al. (2005). Effect of catechol-O-methyltransferase val 158 met genotype on attentional control. Journal of Neuroscience, 25(20), 5038–5045. Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108, 624–652. Broadbent, D. E. (1958). Perception and communication. New York: Pergamon Press. Buschman, T. J., & Miller, E. K. (2007, March 30). Top-down versus bottom-up control of attention in the prefrontal and parietal cortex. Science, 315, 1860–1862. Bush, G., Luu, P., & Posner, M. I. (2000). Cognitive and emotional influences in the anterior cingulate cortex. Trends in Cognitive Sciences, 4, 215–222. Canli, T., Omura, K., Haas, B. W., Fallgatter, A., Todd, R., Constable, R. T., et al. (2005). Beyond affect: A role for genetic variation of the serotonin transporter in neural activation during a cognitive attention task. Proceedings of the National Academy of Sciences, USA, 102, 12224–12229. Cherry, E. C. (1953). Some experiments on the recognition of speech with one and two ears. Journal of the Acoustical Society of America, 25, 975–979. Conturo, T. E., Lori, N. F., Cull, T. S., Akbudak, E., Snyder, A. Z., Shimony, J. S., et al. (1999). Tracking neuronal fiber pathways in the living human brain. Proceedings of the National Academy of Sciences, USA, 96, 10422–10427. 
Corbetta, M., Miezin, F. M., Dobmeyer, S., Shulman, G. L., & Petersen, S. E. (1991). Selective and divided attention during visual discriminations of shape, color, and speed: Functional anatomy by positron emission tomography. Journal of Neuroscience, 11, 2383–2402. Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215. Crottaz-Herbette, S., & Menon, V. (2006). Where and when the anterior cingulate cortex modulates attentional response: Combined fMRI and ERP evidence. Journal of Cognitive Neuroscience, 18, 766–780. Dale, A. M., Liu, A. K., Fischl, B. R., Buckner, R., Belliveau, J. W., Lewine, J. D., et al. (2000). Dynamic statistical parameter mapping: Combining fMRI and MEG for high-resolution cortical activity. Neuron, 26, 55–67.
Dougherty, R. F., Ben-Shachar, M., Bammer, R., Brewer, A. A., & Wandell, B. A. (2005). Functional organization of human occipital-callosal fiber tracts. Proceedings of the National Academy of Sciences, USA, 102, 7350–7355. Driver, J., Eimer, M., & Macaluso, E. (2004). Neurobiology of human spatial attention: Modulation, generation, and integration. In N. Kanwisher & J. Duncan (Eds.), Attention and performance XX: Functional brain imaging of visual cognition (pp. 267–300). Oxford: Oxford University Press. Dumas, T., Hostick, U., Wu, H., Spaltenstein, J., Ghatak, C., Nguyen, J., et al. (2005). Maximizing the anatomical specificity of native neuronal promoters by a subtractive transgenic technique. Society for Neuroscience Abstracts. Engel, A. K., Moll, C. K., Fried, I., & Ojemann, G. A. (2005). Invasive recordings from the human brain: Clinical insights and beyond. Nature Reviews Neuroscience, 6, 35–47. Etkin, A., Egner, T., Peraza, D. M., Kandel, E. R., & Hirsch, J. (2006). Resolving emotional conflict: A role for the rostral anterior cingulate cortex in modulating activity in the amygdala. Neuron, 51, 871–882. Evarts, E. V. (1968). Relation of the pyramidal tract activity to force exerted during voluntary movement. Journal of Neurophysiology, 31, 14–27. Fan, J., Byrne, J., Worden, M. S., Guise, K. G., McCandliss, B. D., Fossella, J., et al. (2007). The relation of brain oscillations to attentional networks. Journal of Neuroscience, 27, 6197–6206. Fan, J., Flombaum, J. I., McCandliss, B. D., Thomas, K. M., & Posner, M. I. (2003). Cognitive and brain consequences of conflict. NeuroImage, 18, 42–57. Fan, J., Fossella, J. A., Sommer, T., Wu, Y., & Posner, M. I. (2003). Mapping the genetic variation of executive attention onto brain activity. Proceedings of the National Academy of Sciences, USA, 100, 7406–7411. Fan, J., McCandliss, B. D., Fossella, J., Flombaum, J. I., & Posner, M. I. (2005). The activation of attentional networks. NeuroImage, 26, 471–479. 
Fan, J., McCandliss, B. D., Sommer, T., Raz, A., & Posner, M. I. (2002). Testing the efficiency and independence of attentional networks. Journal of Cognitive Neuroscience, 14(3), 340–347. Fan, J., Wu, Y., Fossella, J., & Posner, M. I. (2001). Assessing the heritability of attentional networks. BMC Neuroscience, 2, 14. Fischman, A. J., & Badgaiyan, R. D. (2006). Neurotransmitter imaging. In M. Charron (Ed.), Pediatric PET (pp. 385–403). New York: Springer. Fossella, J., Sommer, T., Fan, J., Wu, Y., Swanson, J. M., Pfaff, D. W., et al. (2002). Assessing the molecular genetics of attention networks. BMC Neuroscience, 3, 14.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Georgopoulos, A. P., Lurito, J. T., Petrides, M., Schwartz, A. B., & Massey, J. T. (1989, January 13). Mental rotation of the neuronal population vector. Science, 243, 234–236.
Diamond, A., Briand, L., Fossella, J., & Gehlbach, L. (2004). Genetic and neurochemical modulation of prefrontal cognitive functions in children. American Journal of Psychiatry, 161, 125–132.
Goldberg, T. E., & Weinberger, D. R. (2004). Genes and the parsing of cognitive processes. Trends in Cognitive Sciences, 8, 325–335.
Ding, Y. C., Chi, H. C., Grady, D. L., Morishima, A., Kidd, J. R., Kidd, K. K., et al. (2002). Evidence of positive selection acting at the human dopamine receptor D4 gene locus. Proceedings of the National Academy of Sciences, USA, 99, 309–314. Donchin, E., & Cohen, L. (1967). Average evoked potentials and intermodal selective attention. Electroencephalography and Clinical Neurophysiology, 22, 537–546. Dosenbach, N. U. F., Fair, D. A., Miezin, F. M., Cohen, A. L., Wenger, K. K., Dosenbach, R. A. T., et al. (2007). Distinct brain networks for adaptive and stable task control in humans. Proceedings of the National Academy of Sciences, USA, 104, 11073–11078.
Green, C. S., & Bavelier, D. (2003). Action video game modifies visual selective attention. Nature, 423, 434–437. Han, C. J., O’Tuathaigh, C. M., & Koch, C. (2004). A practical assay for attention in mice. In M. I. Posner (Ed.), Cognitive neuroscience of attention (pp. 294–312). New York: Guilford Press. Hebb, D. O. (1949). Organization of behavior. New York: Wiley. Heinze, H. J., Mangun, G. R., Burchert, W., Hinrichs, H., Scholz, M., Münte, T. F., et al. (1994, December 8). Combined spatial and temporal imaging of brain. Nature, 372, 543–546. Hillyard, S. A., Di Russo, F., & Martinez, A. (2004). The imaging of visual attention. In N. Kanwisher & J. Duncan (Eds.), Functional neuroimaging of visual cognition: Attention and performance XX (pp. 381–390). Oxford: Oxford University Press.
Posner, M. I. (1980). Orienting of attention: The 7th Sir F.C. Bartlett Lecture. Quarterly Journal of Experimental Psychology, 32, 3–25.
Hink, R. F., Van Voorhis, S. T., Hillyard, S. A., & Smith, T. S. (1977). The division of attention and the human auditory evoked potential. Neuropsychologia, 15, 597–605.
Posner, M. I. (1988). Structures and functions of selective attention. In T. Boll & B. Bryant (Eds.), Master lectures in clinical neuropsychology and brain function: Research, measurement, and practice (pp. 171–202). Washington, DC: American Psychological Association.
Huang, L. Q., & Pashler, H. (2007). A Boolean map theory of visual attention. Psychological Review, 114(3), 599–631. Hubel, D., & Wiesel, T. N. (1968). Receptive field and functional architecture of the monkey striate cortex. Journal of Physiology (London), 195, 215–243. James, W. (1890). Principles of psychology. New York: Holt Rinehart and Winston. Jones, D. K., Horsfield, M. A., & Simmons, A. (1999). Optimal strategies for measuring diffusion in anisotropic systems by magnetic resonance imaging. Magnetic Resonance in Medicine, 42, 515–525. Kanwisher, N., & Duncan, J. (Eds.). (2004). Functional neuroimaging of visual cognition: Attention and performance XX. Oxford: Oxford University Press.
Posner, M. I., & Gilbert, C. D. (1999). Attention and primary visual cortex. Proceedings of the National Academy of Sciences, USA, 96, 2585–2587. Posner, M. I., & Raichle, M. E. (1994). Images of mind. New York: Scientific American. Posner, M. I., & Raichle, M. E. (Eds.). (1998). Overview: The neuroimaging of human brain function. Proceedings of the National Academy of Sciences, USA, 95, 763–764. Posner, M. I., & Rothbart, M. K. (2007a). Educating the human brain. Washington, DC: APA Books.
Karnath, H.-O., Ferber, S., & Himmelbach, M. (2001, May 3). Spatial awareness is a function of the temporal not the posterior parietal lobe. Nature, 411, 950–953.
Posner, M. I., & Rothbart, M. K. (2007b). Research on attention networks as a model for the integration of psychological science. Annual Review of Psychology, 58, 1–23.
Kastner, S., Pinsk, M. A., De Weerd, P., Desimone, R., & Ungerleider, L. G. (1999). Increased activity in human visual cortex during directed attention in the absence of visual stimulation. Neuron, 22, 751–761.
Posner, M. I., Rothbart, M. K., & Sheese, B. E. (2007). Attention genes. Developmental Science, 10, 24–29.
Kerns, K. A., Esso, K., & Thompson, J. (1999). Investigation of a direct intervention for improving attention in young children with ADHD. Developmental Neuropsychology, 16, 273–295. Knight, R. T. (2007, June 15). Neural networks debunk phrenology. Science, 316, 1578–1579.
Posner, M. I., Sheese, B. E., Odludas, Y., & Tang, Y. (2006). Analyzing and shaping human attentional networks. Neural Networks, 19, 1–8. Reuter, M., Ott, U., Vaitl, D., & Hennig, J. (2007). Impaired executive control associated with a variation in the tryptophan hydroxylase-2 gene. Journal of Cognitive Neuroscience, 19, 401–408.
LaBerge, D. (1995). Attentional processing. Cambridge, MA: Harvard University Press.
Rizzolatti, G., Riggio, L., Dascola, I., & Umilta, C. (1987). Reorienting attention across the horizontal and vertical meridians: Evidence in favor of the premotor theory of attention. Neuropsychologia, 25, 31–40.
Lakatos, P., Karmos, G., Mehta, A. D., Ulbert, I., & Schroeder, C. (in press). Entrainment of neuronal oscillations as a mechanism of attentional selection. Science.
Rueda, M. R., Fan, J., Halparin, J., Gruber, D., Lercari, L. P., McCandliss, B. D., et al. (2004). Development of attention during childhood. Neuropsychologia, 42, 1029–1040.
Lashley, K.S. (1931). Mass action in cerebral function. Science, 73, 245–254.
Rueda, M. R., Rothbart, M. K., McCandliss, B. D., Saccamanno, L., & Posner, M. I. (2005). Training, maturation and genetic influences on the development of executive attention. Proceedings of the National Academy of Sciences, USA, 102, 14931–14936.
MacDonald, A. W., Cohen, J. D., Stenger, V. A., & Carter, C. S. (2000, June 9). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science, 288, 1835–1838. Mackworth, J. F., & Mackworth, N. H. (1956). The overlapping of signals for decisions. American Journal of Psychology, 69, 26–47. Marrocco, R. T., & Davidson, M. C. (1998). Neurochemistry of attention. In R. Parasuraman (Ed.), The attentive brain (pp. 35–50). Cambridge, MA: MIT Press.
Rugg, M. D., & Coles, M. G. H. (1995). Electrophysiology of mind. Oxford: Oxford University Press. Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing. Cambridge, MA: MIT Press. Saalmann, Y. B., Pigarev, I. N., & Vidyasagar, T. R. (2007, June 15). Neural mechanisms of visual attention: How top-down feedback highlights relevant visual locations. Science, 316, 1612–1615.
Mattay, V. S., & Goldberg, T. E. (2004). Imaging genetic influences in human brain function. Current Opinion in Neurobiology, 14(2), 239–247.
Scherg, M., & Berg, P. (1993). Brain electrical source analysis, Version 2.0. Herndon, VA: NeuroScan.
Moruzzi, G., & Magoun, H. W. (1949). Brainstem reticular formation and activation of the EEG. Electroencephalography and Clinical Neurophysiology, 1, 455–473.
Sheese, B. E., Voelker, P. M., Rothbart, M. K., & Posner, M. I. (2007). Parenting quality interacts with genetic variation in dopamine receptor DRD4 to influence temperament in early childhood. Development and Psychopathology, 19, 1039–1046.
Mountcastle, V. B. (1978). The world around us: Neural command functions for selective attention. Neuroscience Research Progress Bulletin, 14, 1–47. Ojemann, G. A., Creutzfeldt, O., Lettich, E., & Haglund, M. M. (1988). Neuronal activity in human lateral temporal cortex related to short-term verbal memory, naming and reading. Brain, 111, 1383–1403.
Posner, M. I., & Fan, J. (2008). Attention as an organ system. In J. Pomerantz (Ed.), Topics in Integrative Neuroscience (pp. 31–61). New York: Cambridge University Press.
Shulman, G. L., Remington, R. W., & McClean, J. P. (1979). Moving attention through space. Journal of Experimental Psychology: Human Perception and Performance, 5, 522–526. Sohlberg, M. M., McLaughlin, K. A., Pavese, A., Heidrich, A., & Posner, M. I. (2000). Evaluation of attention process therapy training in persons with acquired brain injury. Journal of Clinical and Experimental Neuropsychology, 22, 656–676.
Parasuraman, R., Greenwood, P. M., Kumar, R., & Fossella, J. (2005). Beyond heritability: Neurotransmitter genes differentially modulate visuospatial attention and working memory. Psychological Science, 16, 200–207.
Sutton, S., Braren, M., Zubin, J., & John, E. R. (1965, November 26). Evoked potential correlates of stimulus uncertainty. Science, 150, 1187–1188.
Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
Titchener, E. B. (1909). Experimental psychology of the thought processes. New York: Macmillan.
Toga, A. W., & Mazziotta, J. C. (Eds.). (1996). Brain mapping: The methods. San Diego, CA: Academic Press. Treisman, A., & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97–136. Uttal, W. R. (2001). The new phrenology: The limits of localizing cognitive processes in the brain. Cambridge, MA: MIT Press. Venter, J. C., Adams, M. D., Myers, E. W., Li, P. W., Mural, R. J., Sutton, G. G., et al. (2001, February 16). The sequence of the human genome. Science, 291, 1304–1335. Volpe, B. T., LeDoux, J. E., & Gazzaniga, M. S. (1979). Information processing of visual stimuli in an extinguished visual field. Nature, 282, 1947–1952. Walter, W. G., Cooper, R., Aldridge, V. J., McCallum, W. C., & Winter, A. L. (1964, July 25). Contingent negative variation: An electrical sign of sensorimotor association and expectancy in the human brain. Nature, 203, 380–384. Wojciulik, E., Kanwisher, N., & Driver, J. (1998). Covert visual attention modulates face-specific activity in the human fusiform gyrus: fMRI study. Journal of Neurophysiology, 79, 1574–1578. Womelsdorf, T., Schoffelen, J.-M., Oostenveld, R., Singer, W., Desimone, R., Engel, A. K., et al. (2007, June 15). Modulation of neuronal interactions through neural synchronization. Science, 316, 1609–1612. Wurtz, R. H., Goldberg, E., & Robinson, D. L. (1980). Behavioral modulation of visual responses in monkey: Stimulus selection for attention and movement. Progress in Psychobiology and Physiological Psychology, 9, 43–83. Zilles, K. (2005). Evolution of the human brain and comparative cyto- and receptor architecture. In S. Dehaene, J.-R. Duhamel, M. D. Hauser, &
Chapter 19
Mental Imagery
STEPHEN M. KOSSLYN, GIORGIO GANIS, AND WILLIAM L. THOMPSON
Mental imagery has until recently fallen within the purview of philosophy and cognitive psychology. Although both enterprises have raised important questions, they have also encountered significant obstacles when trying to answer these questions. With the advent of cognitive neuroscience, many of these questions are now empirically tractable. Neuroimaging studies, combined with other methods (such as studies of brain-damaged patients and of the effects of transcranial magnetic stimulation) are revealing the ways in which imagery draws on mechanisms used in other activities, such as perception and motor control. Because of its close relation to these basic processes, imagery holds promise of becoming one of the best-understood “higher” cognitive functions.
MENTAL IMAGES IN THE PERCEIVING AND ACTING BRAIN
What shape are Mickey Mouse’s ears? Most people report that when they visualize the cartoon rodent’s head, they “see” that his ears are round. Such an experience is a hallmark that certain kinds of representations are being processed, namely those that underlie visual mental imagery. Mental imagery occurs when perceptual information is accessed from memory, giving rise to the experience of “seeing with the mind’s eye,” “hearing with the mind’s ear,” and so on. In contrast, perception occurs when information is registered directly from the senses. Mental images need not be simply the recall of previously perceived objects or events; they also can be created by combining and modifying stored perceptual information in novel ways. For example, Beethoven wrote whole symphonies entirely on the basis of his auditory mental imagery, well after he could no longer hear a sound. Similarly, you can probably visualize what it would be like to sit on the back of an elephant and see over the top of the animal’s head, even though you have never had the experience.
Imagery has played a central role in theories of mental function at least since the time of Plato. It has fallen in and out of fashion, in large part because it is inherently a private affair, by definition restricted to the confines of one’s mind. Thus, imagery has been difficult to study. In fact, in 1913 the founder of behaviorism (the school of psychology that focused solely on observable stimuli, responses, and the consequences of responses), John B. Watson, denied that mental images even existed. Instead, he suggested, thinking consists of subtle movements of the vocal apparatus (Watson, 1913). Even after the so-called cognitive revolution of the late 1950s, when the mind was likened to computer software, mental imagery carried a whiff of disrepute. At least in North America, behaviorism has cast a very long shadow. In spite of the fact that Alan Paivio (1971) and his colleagues were able to show that the use of imagery dramatically improves memory, many researchers were not convinced that imagery is a distinct form of thought. Watson’s position was filtered and refracted through the lens of computationally oriented cognitive science 60 years later by Zenon Pylyshyn. This theorist championed the view that mental images are not “images” at all, but rather rely on mental descriptions no different in kind from those that underlie language. According to Pylyshyn (1973; see also Pylyshyn, 2002, 2003a, 2003b), the pictorial aspects of imagery that are evident to conscious experience are entirely epiphenomenal, like the heat thrown off
Preparation of different parts of this chapter was supported by the National Science Foundation (Grant No. REC-0411725) and the National Institutes of Health (Grant No. 2 R 01 MH60734). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or the National Institutes of Health. Parts of this chapter were based on an earlier review, published by Kosslyn, Ganis, and Thompson (2001), adapted with permission of the publisher.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
by a light bulb when you read (which plays no role in the reading process). The emergence of cognitive neuroscience has opened a new chapter in the study of imagery. An enormous amount has been learned about the neural underpinnings of visual perception, memory, emotion, and motor control. Much of this information has come from the study of animal models. Unlike language and reasoning, these more basic functions have many common features among higher mammals—including humans. In addition, neuroimaging technologies, especially positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), offer new ways to test theories of imagery in humans. Researchers have taken advantage of these developments to show that mental imagery uses much of the same neural machinery as perception in the same modality and can engage mechanisms used in memory, emotion, and motor control. In this chapter, we draw on results from a variety of methods, including studies of the effects of selective brain damage on behavior, neuroimaging, and studies examining the effects of transcranial magnetic stimulation (TMS). Each method has its strengths and weaknesses, but they are complementary. Thus, for example, neuroimaging provides correlational (not causal) data (when engaged in a particular task, a particular set of brain areas is activated) but can monitor the entire brain; TMS can be used to establish the causal roles of processes supported by distinct brain areas (e.g., by showing that performance in a task that draws on a specific brain area is impaired when TMS is used to alter neural functioning in that area). TMS must be targeted to a specific location, and often TMS researchers rely on prior findings from fMRI and PET to guide them to stimulate relevant brain loci. Not only are the techniques complementary, but they also provide convergent evidence for specific inferences about the nature of cognition and its neural foundations. 
That is, if the same conclusion is reached using different methods, then it can be taken more seriously. In this chapter, we briefly review major classes of imagery research. We begin with visual mental imagery, and then turn to auditory imagery, and conclude by considering so-called motor imagery. Along the way, we review convergent evidence from many sources, and emphasize that imagery relies on mechanisms also used in perception, which can affect not only cognition but also action.
VISUAL MENTAL IMAGERY AND PERCEPTION
We begin with visual mental imagery, which is the most common and by far the most intensively studied modality.
In the following sections, we briefly review four main classes of research, which show that visual imagery (a) engages brain mechanisms also used in perception, (b) is carried out by a system of distinct processes, (c) engages even the earliest visual cortex (Areas 17 and 18) during some forms of imagery, and (d) can activate the autonomic and limbic systems during visualization of emotional material.
Shared Mechanisms in Imagery and Perception
Well over 100 years ago, researchers described brain-damaged patients who had lost the ability to form visual mental images after they became blind (for review, see Farah, 1984; see also Bartolomeo, 2002; Chatterjee & Southwood, 1995). Methods from cognitive psychology and neuropsychology have allowed researchers to characterize such deficits with increasing precision. For example, some patients have perceptual deficits in only one of the two major cortical visual functions supported by the two major visual pathways. One such pathway runs from the occipital lobe down to the inferior temporal lobes (the so-called “ventral stream,” or the “what” system, or the “object properties processing” pathway; see Chapter 11; Ungerleider & Mishkin, 1982); when it is damaged, the animal or person cannot easily recognize shapes or, in some cases, color. The other pathway runs from the occipital lobe to the posterior parietal lobes (the so-called “dorsal stream,” or the “where” system, or the “spatial properties processing” pathway); when it is damaged, the animal or person cannot easily register spatial properties, such as locations. For present purposes, a critical point is that the parallel deficits appear in imagery: damage to the ventral stream disrupts the ability to visualize shapes (as used, for example, to determine from memory whether George Washington had a beard), whereas damage to the dorsal stream disrupts the ability to visualize locations (as used, for example, to indicate the locations of furniture in a room when your eyes are closed; cf. 
Levine, Warach, & Farah, 1985). Very subtle deficits can occur in imagery that parallel the deficits found in perception. For example, some brain-damaged patients can no longer distinguish colors perceptually or in imagery (De Vreese, 1991), and others can no longer distinguish faces perceptually or in imagery (Young, Humphreys, Riddoch, Hellawell, & de Haan, 1994; for review, see Ganis, Thompson, Mast, & Kosslyn, 2003). However, although the deficits in imagery and perception often parallel each other (Ganis et al., 2003), this is not always the case. In a seminal literature review and analysis, Farah (1984) showed that some patients have selective problems in generating images (i.e., producing them on the basis of information stored in memory) even though they are able to recognize and identify perceptual stimuli.
In addition, patients have been reported who could visualize but had impaired perception (e.g., Bartolomeo, 2002; Behrmann, Winocur, & Moscovitch, 1992; Jankowiak, Kinsbourne, Shalev, & Bachman, 1992). In short, the results from research with brain-damaged patients suggest that visual mental imagery and visual perception share many common mechanisms, but do not draw on identical processes. Although shape, location, and surface characteristics may be represented and interpreted in comparable ways during both functions, the two differ in key ways: Imagery, unlike perception, does not require low-level organizational processing. And perception, unlike imagery, does not require us to activate information in memory when the stimulus is not present. For reviews of the relationship between imagery and memory, see Behrmann (2000) and Kosslyn, Thompson, and Ganis (2006). The results of neuroimaging studies that compare imagery and perception have dovetailed nicely with those from studies of brain-damaged patients. One study, for example, found that of all the brain areas activated during perception and during imagery, approximately two-thirds were activated in common (Kosslyn, Thompson, & Alpert, 1997). Another study that used more similar imagery and perception stimuli and tasks found that over 90% of the same parts of the brain are activated in common during visual mental imagery and perception (Ganis, Thompson, & Kosslyn, 2004). Presumably, lesions in the areas not activated in common produce the dissociations, when imagery or perception is disrupted independently, whereas lesions in the areas activated in common produce the more frequently reported parallel deficits in imagery and perception. 
Structure of Visual Mental Imagery Processing
The distinction between shape-based imagery (which relies on the ventral visual stream) and spatial imagery (which relies on the dorsal visual stream) is important not simply because it has allowed researchers to document parallels between imagery and perception; this distinction shows that imagery is not a single, undifferentiated ability, but rather relies on sets of distinct processes. Indeed, studies of deficits following brain damage have underscored the fact that imagery, like all other cognitive functions, is accomplished by a collection of abilities, each of which can be disrupted independently. For example, some patients can make imagery judgments about the shape or color of objects but have difficulty imagining an object rotating (for instance, when trying to decide whether the letter p would be another letter when rotated 180 degrees, or whether z would be another letter when rotated 90 degrees clockwise). Other patients have the reverse pattern of deficits. In addition, when participants perform different imagery tasks while
their brain activity is monitored, different patterns of activation are observed. For example, when participants mentally rotate patterns, their parietal lobes (often bilaterally) and right frontal lobes typically are strongly activated (e.g., Cohen et al., 1996; Jordan, Heinze, Lutz, Kranowski, & Jancke, 2001; Kosslyn, DiGirolamo, Thompson, & Alpert, 1998; Ng et al., 2001; Richter et al., 2000; Wraga, Shephard, Church, Inati, & Kosslyn, 2005). In contrast, if they are asked to visualize previously memorized patterns of stripes and judge which are longer, wider, and so on (all on the basis of their mental images, with eyes closed), these areas are not activated, but other areas in the occipital lobe and left association cortex are activated (Kosslyn et al., 1999; Thompson, Kosslyn, Sukel, & Alpert, 2001). Underscoring the fact that imagery is not a single ability, findings from neuroimaging studies have shown that different sets of areas are activated when different types of imagery tasks are used (Downing, Chan, Peelen, Dodds, & Kanwisher, 2006; Haxby et al., 2001; Kanwisher & Yovel, 2006; O’Craven & Kanwisher, 2000; Thirion et al., 2006). Brain activation during mental imagery may vary according to the type of object that is visualized. Using fMRI, O’Craven and Kanwisher (among others) found activation in the fusiform face area (FFA; see Kanwisher, McDermott, & Chun, 1997) when participants visualized faces; in contrast, when participants visualized indoor or outdoor scenes depicting a spatial layout, these researchers found activation in the parahippocampal place area (PPA). There was no hint of activation of the PPA during face imagery nor of the FFA during place imagery. These results are similar to what was observed when participants actually perceived faces and places. The findings document that imagery and perception share very specific, specialized mechanisms. 
Hasson, Harel, Levy, and Malach (2003) suggest that seven brain areas represent information in different categories; these areas are in the occipito-temporal cortex, near the early visual areas, and include face-, object-, and building-related areas. Moreover, functional neuroimaging evidence shows that during imagery, some of these category-selective areas are activated predominantly via inputs from prefrontal and parietal cortex, whereas in perception, these regions are activated predominantly bottom-up, on the basis of inputs from early visual areas (Mechelli, Price, Friston, & Ishai, 2004).
NATURE OF IMAGERY REPRESENTATION: IMAGERY AND THE EARLY VISUAL CORTEX
A large portion of research on the neural bases of imagery focuses on whether the early visual cortex is activated
during imagery (for a review, see Kosslyn & Thompson, 2003; Kosslyn, Thompson, & Ganis, 2006). The early visual cortex comprises Areas 17 and 18, the first two cortical areas to receive input from the eyes. Researchers have wanted to know whether visual imagery activates these early areas for three main reasons. First, these areas are known to be topographically organized; that is, they preserve (roughly) the local spatial geometry of the retina—and thus patterns of activation in them serve to depict shape. If these areas are activated during imagery, and such activation plays a functional role, this would be evidence that imagery relies on representations that depict information, not describe it. (In other words, this would be evidence that mental imagery relies on actual depictive images represented in the brain.) Such a finding would have implications for more general questions, such as whether there exists more than one language of thought—that is, whether all thought consists of symbolic (propositional) representations or whether at least some of the representations used in cognition are depictive (or spatially analogous to the objects they represent). Second, such findings cannot be explained by appeal to “tacit knowledge,” which Pylyshyn (1981, 2002, 2003a, 2003b) used to explain away the findings from earlier behavioral experiments that attempted to demonstrate that imagery relies on depictive representations. According to this view, participants in imagery experiments may have unconsciously tried to imitate what they thought they would have done in the corresponding perceptual situation (such as by taking more time to scan farther distances across an imaged scene). But such tacit knowledge, stored as descriptions, would not explain why the early visual cortex would be activated when participants had their eyes closed during imagery. 
Third, if imagery can alter the activation of the early visual cortex, this suggests that one’s knowledge and expectations can (at least under some circumstances) modulate what one actually sees during perception. And this finding would have clear-cut implications for the reliability of eyewitness testimony and the veracity of visual memory more generally. More than 50 neuroimaging studies have examined activation in early visual cortex (for reviews, see Kosslyn & Thompson, 2003; Thompson & Kosslyn, 2000). The studies used, in decreasing order of sensitivity, fMRI, PET, and single photon emission computed tomography (SPECT). According to Kosslyn and Thompson (2003), 19 fMRI, 8 PET, and 2 SPECT studies reported activation in early visual cortex, compared to 8 fMRI, 15 PET, and 7 SPECT studies that reported no such activation. (Note that, given the statistical thresholds involved, chance would not lead to half the studies finding this activation and half
not finding it; rather, the ratio might approximate 1 in 20 studies that would report activation, if chance alone were at work.) The following studies seem to provide strong support for the claim that the early visual cortex is activated during at least some forms of visual mental imagery. Kosslyn, Thompson, Kim, and Alpert (1995) asked participants to visualize line drawings of objects at different sizes (as if they fit into squares of different dimensions that were memorized before the PET scan) and used auditory cues to make specific judgements. Not only was Area 17 activated, compared to a control condition in which identical auditory cues were provided but no imagery was used, but also the specific locus of activation depended on the size of the visualized object. Even though their eyes were closed, the mere fact of visualizing an object at a larger size shifted the activation to more anterior parts of the calcarine sulcus (the major anatomical landmark of Area 17)—just as is found for larger objects in perception proper (e.g., Sereno et al., 1995). This result was replicated by Tootell, Hadjikani, Mendola, Marrett, and Dale (1998) using fMRI and a precise method to localize Area 17. There is no doubt that varying the size of objects in mental images shifts the locus of activation along Area 17 comparably to what occurs in perception. In addition, Klein, Paradis, Poline, Kosslyn, and Le Bihan (2000) used event-related fMRI to chart activation in Area 17 when visual mental images were formed. They found clear activation in every participant, with a clearcut temporal pattern; activation began about 2 seconds after an auditory cue, and peaked around 4 to 6 seconds later, before dropping off during the next 8 seconds or so. Moreover, Klein et al. 
(2004) found that the activation in visual cortex was related to the orientation of the imaged stimulus; depending on whether the bow-tie-shaped stimulus was vertical or horizontal, the pattern of activation appropriately matched the cortical representation of either the vertical or horizontal meridian. Moreover, also using fMRI, Slotnick, Thompson, and Kosslyn (2005) presented rotating and flickering checkerboard wedges to map the precise retinotopy of each participant. In a separate imagery condition, participants were asked to reproduce the flickering wedges in their mind’s eye. The imagery retinotopic maps were similar to the maps produced in the perception condition and often were more similar to perception than were the maps produced by an attention-based control condition. But is such activation playing a functional role in imagery? In another study, participants memorized four quadrants, each with black-and-white stripes, which varied in length, width, orientation, and separation (Figure 19.1, top panel), and later were asked to visualize them and make subtle shape comparisons, such as which set had longer or
wider stripes (Kosslyn et al., 1999). PET scanning revealed that Area 17 was activated during this task. Moreover, in another group of participants, repetitive TMS (rTMS) was applied to Area 17 prior to the same shape comparison tasks; rTMS causes neurons in the cortex beneath the magnetic coil to respond sluggishly to subsequent events (within a brief period of time). Following rTMS to the posterior occipital lobe, every participant required more time to make these judgments than when rTMS was applied so that it did not affect Area 17 (Figure 19.1, bottom panel). The magnitude of the decrement in performance was the same when participants had their eyes closed and visualized the stripes (imagery) as when they had their eyes open and made judgments based on visible stripes (perception). This makes sense if Area 17 was critical in both the imagery and perceptual versions of the task. (For more information on TMS applied to the visual system, see Kammer, Puls, Erb, & Grodd, 2005; Kammer, Puls, Strasburger, Hill, & Wichmann, 2005). Consistent with this finding, Farah, Soso, and Dasheiff (1992) reported that after one occipital lobe was surgically removed from a patient (as part of a medical treatment), the apparent size of the patient's mental images decreased by approximately half—as expected if each occipital lobe represents the contralateral part of space. Finally, in another PET study, participants closed their eyes and visualized named letters of the alphabet, in upper case form (Kosslyn, Thompson, Kim, Rauch, & Alpert, 1996). Four seconds after forming the image, they were asked to judge whether the letter had a specific characteristic (such as any curved lines); the response times and error rates were recorded at the same time that the participants’ brains were scanned. 
Not only were variations in the level of activation in Area 17 significantly correlated with the time participants required to make the judgments, but this correlation was present even after all other correlations between variations in regional cerebral blood flow and response time were statistically removed. In summary, these results indicate that: (a) activation in Area 17 is systematically related to spatial properties of the imaged object (specifically size and orientation); (b) if Area 17 is impaired, via TMS or removal of the occipital lobe in one hemisphere, so is the use of visual imagery; and (c) the activation in Area 17 is not likely to be an artifact of activation in other areas, which is merely incidentally sent (via neural connections) to Area 17. Given these positive results, why have so many studies failed to find activation in Area 17? Kosslyn and Thompson (2003) performed a meta-analysis of neuroimaging studies that pinpointed three factors that account for the variability among findings. First, not surprisingly, the sensitivity of the technique is important (note the proportion of fMRI studies that detected such activation versus those that did
[Figure 19.1, panel B: response time (ms), on an axis from 0 to 3,500, under sham and real rTMS, shown separately for the perception and imagery conditions, with data points labeled by participant (1 to 5).]
Figure 19.1 Stimulus display (top) and results (bottom). Note: (Top) Prior to the imagery condition, the participants memorized the stimulus display. They also learned which quadrants were labeled by the numbers 1, 2, 3, and 4. During the imagery task, the participants visualized the entire display, and then listened to the stimuli. Their task was to decide whether the stripes in the quadrant named first had a pattern that was greater on the named dimension (e.g., longer stripes) than the stripes in the quadrant named second; if so, they were to press the pedal under their left foot, if not, the pedal under their right foot. The participants were told that they should visualize the entire display, and “look” at the image in order to make the discrimination. Repetitive TMS to the medial occipital cortex was performed in a separate group of participants immediately prior to the same task. During real rTMS, the center of the coil targeted the tip of the calcarine fissure. During sham rTMS, the induced magnetic field did not enter the brain, although the touch on the scalp and the sound of the coil's being activated were comparable to those in the real rTMS condition. (Bottom) Response times in the imagery task for each individual participant (N = 5) after the rTMS are illustrated; as evident, performance degraded in the Real TMS condition, in both perception and imagery. From “The Role of Area 17 in Visual Imagery: Convergent Evidence from PET and rTMS,” by S. M. Kosslyn et al., 1999, Science, 284, pp. 167–168. Adapted with permission.
not, 19:8, compared to the corresponding proportion for the much less sensitive SPECT technique, 2:7). Second, the meta-analysis revealed that if a task requires participants to find a high-resolution detail in an image (such as by evaluating the shape of an animal’s ears or comparing two similar sets of stripes), activation in the early visual cortex is likely. Third, if a task requires a spatial judgment, activation is less likely. Many of the studies that did not report activation in the early visual cortex used spatial tasks (Mellet et al., 1996, 2000; Mellet, Tzourio, Denis, & Mazoyer, 1995). In contrast, spatial imagery tasks activated the parietal lobes, but not the early visual cortex (Thompson & Kosslyn, 2000). A second puzzle is why some brain-damaged patients continue to have some use of imagery in spite of the fact that the early visual cortex has been severely damaged (e.g., Bartolomeo, 2002; Chatterjee & Southwood, 1995). Probably the most straightforward account for this finding is that the early visual cortex is not necessary for all forms of visual imagery. Crick and Koch (1995) make a good case that the experience of visual perception does not arise from the early visual cortex, but rather from later areas that receive input from the earlier ones. The same is probably true in imagery. If so, then when later areas are activated in the absence of the appropriate immediate sensory input, one may experience visual imagery. However, such later areas do not make fine spatial variations accessible to later processes, and hence one apparently needs to reconstruct the local geometry in earlier areas (which have much smaller receptive fields, and hence higher resolution) if one must extract fine-grained details from the imaged object. 
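The study counts quoted above from Kosslyn and Thompson (2003) can be tabulated directly. The following minimal Python sketch is not part of the original meta-analysis; it simply computes, for each technique, the fraction of studies that reported activation in early visual cortex, and then asks how many positive reports chance alone would be expected to produce. It assumes that every study is an independent test with a false-positive rate of 0.05 (the "1 in 20" figure mentioned in the text); that assumption and all names in the code are illustrative.

```python
from math import comb

# Study counts as quoted in the text from Kosslyn and Thompson (2003):
# technique -> (studies reporting activation, studies reporting none)
counts = {"fMRI": (19, 8), "PET": (8, 15), "SPECT": (2, 7)}


def detection_rate(found, not_found):
    """Fraction of studies that reported early visual cortex activation."""
    return found / (found + not_found)


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


rates = {name: detection_rate(*c) for name, c in counts.items()}
total_found = sum(found for found, _ in counts.values())
total_studies = sum(found + not_found for found, not_found in counts.values())

# Assumption (illustrative): each study is an independent test
# with a 0.05 false-positive rate.
ALPHA = 0.05
expected_by_chance = ALPHA * total_studies
p_tally_by_chance = binomial_tail(total_found, total_studies, ALPHA)

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.2f} of studies reported activation")
    print(f"{total_found} of {total_studies} studies reported activation")
    print(f"expected by chance alone: about {expected_by_chance:.1f} studies")
    print(f"probability of at least {total_found} by chance: {p_tally_by_chance:.1e}")
```

Under these assumptions, chance alone would yield about 3 of the 59 studies reporting activation, far fewer than the 29 that did, and the per-technique fractions fall in the same order as the sensitivity ranking given in the text (fMRI, then PET, then SPECT).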
Visual Imagery and Emotion
If visual mental imagery engages many of the mechanisms used in perception, then we should not be surprised that imagery of emotional events activates the autonomic nervous system and (as also evident in single-cell recordings in humans) the amygdala. Thus, we would expect visualizing an object to have many of the same effects on the body as actually seeing the object. And in fact, Lang, Greenwald, Bradley, and Hamm (1993) showed that skin conductance increases, as do heart rate and breathing rate, when participants view pictures of threatening objects. The same result occurs when they merely visualize the objects. Kosslyn, Shin, et al. (1996) found that mental images of aversive stimuli activate the anterior insula, the major cortical site of feedback from the autonomic nervous system. In addition, Kreiman, Koch, and Fried (2000) recorded from single cells in the human brain (hippocampus, amygdala, entorhinal cortex, and parahippocampal gyrus) while participants were shown pictures or formed mental images of those same pictures. Some of the cells that responded
selectively when participants viewed specific visual stimuli (e.g., faces) also responded selectively when those same stimuli were visualized. Of particular interest, this pattern was seen in the amygdala, which is known to play a key role in certain emotions, especially fear and anger (LeDoux, 1995, 1996). Thus, visual mental imagery can engage neural structures also engaged in perception, and those neural structures in turn can engage both the autonomic and limbic systems.
AUDITORY IMAGERY
Although visual mental imagery has received the most attention from researchers, imagery is not limited to this modality. For instance, try to answer this question: Do the first three notes of the children’s song “Three Blind Mice” ascend or descend? Most people report that they “hear” the song in the process of deciding. Although the literature on the neural bases of auditory imagery is not as rich as that on visual imagery, progress has been made. For example, in one seminal study, Zatorre and Halpern (1993) studied brain-damaged patients to discover whether specific brain areas are critical for auditory imagery. They studied a group of patients who had had the left or right temporal lobe removed (for the treatment of otherwise intractable epilepsy) and compared them to similar control participants. In one condition, the participants heard a familiar song while also reading the lyrics, and judged which of two particular words had the higher pitch. In another condition, the participants saw the lyrics and made the same judgments, but did not actually hear the song, and thus had to rely on their auditory mental imagery. The patients with right-temporal lesions were impaired in both conditions, compared to both other groups. These findings demonstrate that at least some of the neural structures that play a key role in pitch discrimination in perception also play a comparable role in auditory mental imagery. Most research on auditory imagery has focused on imagery for music. For example, Zatorre, Halpern, Perry, Meyer, and Evans (1996) asked whether auditory imagery draws on the same mechanisms used in auditory perception. Their participants either listened to songs and judged the relative pitch of pairs of words, or imagined hearing songs and made the same judgments. No auditory stimulation was present during the baseline condition, which required the participants to judge the relative length of visually presented words. 
PET revealed that many of the same areas were in fact activated in common in the auditory tasks, including bilateral associative auditory cortex (BA 21/22, in spite of the fact that the left temporal lobe has often been identified with the perception of language and
the right with music or environmental sounds), the bilateral frontal cortex (BA45/9 and 10/47), the left parietal cortex (BA 40/7), and the supplementary motor cortex (BA 6). The bilateral activation in the associative auditory cortex observed in this study, in apparent contrast with the patient studies, may reflect the fact that these researchers used verbal melodies. In a subsequent study, Halpern and Zatorre (1999) asked musically trained participants to listen to the opening notes of familiar (nonverbal) melodies and then continue “hearing the melody with the mind’s ear.” Again using PET, they found activation in two regions of the right temporal lobe (the superior and inferior temporal cortex), which is consistent with their earlier study of brain-damaged patients; both of these areas are involved in storing and interpreting nonverbal sounds. Moreover, auditory imagery of a melody that required retrieval from memory also activated two right-hemisphere regions, in the frontal lobe and superior temporal gyrus (which is critical for auditory perception). Finally, the supplementary motor area (SMA) was also activated by auditory imagery, regardless of whether the melody was retrieved or simply rehearsed online. This is interesting because no overt behavior was required. Halpern and Zatorre infer that stored movements are used in this sort of imagery—which makes sense for verbal melodies, where one can subvocalize the tune as part of the process of retrieving the information. A more recent review finds that “neural activity in auditory cortex can occur in the absence of sound . . . and that this activity likely mediates
the phenomenological experience of imagining music” (Zatorre & Halpern, 2005, p. 9; Figure 19.2). Finally, Griffiths (2000) reports a novel study of patients who became deaf and then hallucinated hearing music. These patients were neither psychotic nor beset with an obvious neurological problem, such as epilepsy. Griffiths was able to perform PET while the patients had such hallucinations, and reports that the posterior temporal lobes, in the auditory cortex, were activated as well as several other areas (specifically, the right basal ganglia, the cerebellum, and the inferior frontal cortices). In short, auditory imagery appears to draw on most of the neural structures used in auditory perception. However, unlike in visual imagery, there is no discernable evidence that the first auditory cortical area to receive input from the ears, Area A1, is activated during auditory imagery.
MOTOR IMAGERY
At the outset, we asserted that mental imagery occurs when perceptual information is accessed from memory instead of arising from immediate sensory input. Given this characterization, how can we conceptualize motor imagery, which occurs when people imagine moving in some way? In our view, such imagery does not arise directly from producing the relevant outputs to the motor system, but rather is a result of activating the kinesthetic feedback sensations one would feel if one moved in a specific way. Does this mean that motor imagery and kinesthetic imagery are identical? No, because one can imagine a sensation (e.g., of a feather being used to tickle the back of your neck) without imagining a movement. Motor imagery occurs when a movement is mentally simulated, which leads to the kinesthetic sensations of making that movement. 
In this section, we review evidence that motor imagery engages neural mechanisms involved in physical movement, that motor imagery is one strategy that people use to transform objects in images, that primary motor cortex is involved in at least some cases of motor imagery, and that mechanisms involved in imitation may also play a role when people use motor imagery to practice an activity.
Motor Imagery and Physical Movement
Figure 19.2 A lateral view of the right hemisphere illustrates a hemodynamic increase (darker gray areas, as measured by fMRI) during auditory imagery. Note: Although participants receive no actual auditory input (the task is performed in a silent environment), activation occurs in the posterior superior temporal gyrus, a region of the auditory cortex. From “Mental Concerts: Musical Imagery and Auditory Cortex,” by R. J. Zatorre and A. R. Halpern, 2005, Neuron, 47, p. 10. Copyright 2005 Elsevier. Reprinted with permission.
Researchers have produced an impressive body of evidence that motor imagery can in fact simulate the corresponding actual behavior. For instance, when people are asked to imagine walking to a specific goal placed in front of them and to indicate when they would have arrived, their estimates of transit time are remarkably similar to the actual time they subsequently require to walk that distance (Decety & Jeannerod, 1995).
Many studies have now been carried out to investigate the neural bases of such motor imagery, and to distinguish motor imagery from purely visual imagery (Tomasino, Borroni, Isaja, & Rumiati, 2005; Wraga et al., 2005; Wraga, Thompson, Alpert, & Kosslyn, 2003). Although visual imagery may often accompany motor imagery, researchers have documented that motor imagery relies on distinct mechanisms. Specifically, many researchers have shown that the cortex used in movement control also plays a role in motor imagery. In a classic study, Georgopoulos, Lurito, Petrides, Schwartz, and Massey (1989) recorded activity in individual neurons in the motor strip of monkeys while the animals were planning to move a lever along a specific arc. They found that these neurons fired in a systematic sequence, depending on their orientation tuning. At first, only neurons tuned for orientations near the starting position of the lever fired, followed by those tuned for orientations slightly farther along the trajectory, and so on. All of this occurred before the animal actually began moving. These findings do not, however, show that the processing underlying motor imagery occurs in the motor strip itself; it is possible that the computation takes place elsewhere in the brain (e.g., the posterior parietal lobes), and that the results of such computation are simply being executed in the motor strip. A host of neuroimaging studies of mental rotation— that is, imagining incrementally changing an object’s orientation—have now been reported, all of which have shown that multiple brain areas are activated during mental rotation. For example, Richter et al. (2000) measured brain activation with fMRI while participants mentally rotated the three-dimensional multi-armed angular stimuli invented by Shepard and Metzler (1971; which look as if they had been constructed by gluing small cubes together to form the arms). 
Participants were shown pairs of such shapes and asked to report whether the figures in each pair were the same or mirror-reversed. Richter et al. (2000) report that the superior parietal lobules (in both hemispheres) were activated during this task, as well as the premotor cortex (in both hemispheres), the supplementary motor cortex, and also the left primary motor cortex. Other neuroimaging studies have provided strong support for the role of motor processes in mental transformations. For example, Parsons et al. (1995) showed participants a sequence of pictures of a hand that could be rotated to various degrees; the pictures were presented in the left visual field (so the image was registered first by the right hemisphere) or in the right visual field (so the image was registered first by the left hemisphere). The participants were to decide whether each picture was a left or right hand. The researchers expected the motor cortices to be activated in this task if participants imagined rotating their own hand into congruence with the stimulus.
And, in fact, not only was the supplementary motor cortex activated bilaterally, but also prefrontal and insular premotor areas were activated in the hemisphere contralateral to the hand (left or right) used as a stimulus—suggesting that participants did in fact imagine the appropriate movements. Many other areas, including in the frontal and parietal lobes, the basal ganglia and cerebellum, were active, as was Area 17. Some researchers (Decety, 1996; Jeannerod, 1994; Jeannerod & Decety, 1995) have suggested that people often transform images by imagining what they would see if the objects were manipulated in a specific way. One PET study (Kosslyn et al., 1998) directly compared rotation of hands versus inanimate objects, again using the three-dimensional multi-armed angular stimuli invented by Shepard and Metzler (1971). The participants compared pairs of drawings and decided whether they were identical or mirror images (using the task and stimuli from the original Shepard & Metzler study). In the experimental condition, the figures were presented at different relative orientations, and one had to be mentally rotated into congruence with the other; in the baseline condition, the figures were presented at the same orientation, and thus no mental rotation was necessary. The comparison of the two conditions revealed which areas were activated by mental rotation. In another condition, the same design was used for drawings of hands, but now the participants decided whether the two hands in a pair were both left or both right or whether one was a left hand and one a right hand. In this study, several motor areas were activated when participants mentally rotated hands, including the primary motor cortex (Area M1), the premotor cortex, and the posterior parietal lobe. None of the frontal motor areas were activated when objects were mentally rotated. However, Cohen et al. 
(1996) used fMRI to study mental rotation of exactly the same inanimate objects and found that the premotor cortex was activated in this task, but only in half the participants.
Alternative Mechanisms of Mental Transformation
The fact that only some participants in the Cohen et al. (1996) study had activation in frontal motor areas during mental rotation of inanimate objects suggests that there may be multiple strategies for performing such rotations. One strategy might involve imagining what you would see if you manipulated an object; another might involve imagining what you would see if someone else (or an external force, such as a motor) manipulated the object. To test this idea, Kosslyn, Thompson, Wraga, and Alpert (2001) asked participants to perform the same mental rotation task used by Cohen et al. (1996), but with a twist: Immediately prior
to the task, the participants either saw a wooden model of that type of stimulus (an exemplar not actually used in the task) being rotated by an electric motor or physically turned the stimulus themselves. They were told that during the task they should imagine the stimuli being rotated just as they had seen the model rotate at the outset. In this experiment, Area M1 was activated when participants mentally rotated stimuli by imagining themselves physically rotating the stimulus, and not when they imagined the electric motor rotating the stimulus (Figure 19.3). These results show that imagining oneself manipulating an object is one way in which mental transformation of objects in general (not just body parts) can take place. The results also show that humans can voluntarily adopt this strategy or use a strategy in which they imagine what would happen if an external force transformed an object.
Figure 19.3 Internal action minus external action. Note: At the outset, the participants learned to visualize the mental rotation either by external action (EA, which was rotation driven by an electric motor) or internal action (IA, which was rotation driven by manual turning of the figure). An axial PET slice, at 56 mm above the AC-PC line, demonstrates activation in Area M1 when data from the EA condition were subtracted from those in the IA condition. Depending on the strategy used, motor regions of the brain are recruited during mental rotation. The result also shows that the strategy used to accomplish a given task can vary according to previous training and can be adopted voluntarily. From “Imagining Rotation by Endogenous versus Exogenous Forces: Distinct Neural Mechanisms,” by S. M. Kosslyn, W. L. Thompson, M. Wraga, and N. M. Alpert, 2001, NeuroReport, 12, p. 2523. Copyright 2001 Lippincott, Williams & Wilkins. Reprinted with permission.
Role of the Primary Motor Cortex in Motor Imagery
Just as researchers have asked “how low” mental imagery activates the visual system, they have also asked “how low” it activates the motor system. Specifically, they have asked whether the primary motor cortex plays a functional role in allowing participants to manipulate objects in images. It is possible that the actual computation is taking place in another area that incidentally sends activation to primary motor cortex. To test this hypothesis, Ganis, Keenan, Kosslyn, and Pascual-Leone (2000) disrupted function in the left primary motor cortex by administering TMS while participants mentally rotated pictures of hands and feet (with the to-be-rotated stimulus appearing in the right visual field). The TMS was time-locked so that it disrupted neural processing only a specific amount of time after the stimulus appeared. Participants required more time to perform this task if a single magnetic pulse was delivered to the motor strip (roughly over the “hand area”) 650 ms after the stimuli were presented (but not at the other temporal delays tested); moreover, rotation of hands was impaired more than rotation of feet, as expected if this area is specialized for controlling the hand per se. Within the limits of the spatial resolution afforded by the TMS technique, these results suggest that activation in this area reflects processing used to perform the task. We cannot say, however, whether this area is the site of processing or merely relays information computed elsewhere in the brain. The finding that area M1 is recruited during the mental rotation of hands has been replicated and extended with similar stimuli and procedures (Tomasino et al., 2005).
Motor Imagery, Mental Practice, and Mirror Neurons
The fact that mental imagery can engage the motor system may help to explain how “mental practice” can improve actual performance in various domains. Mental practice occurs when people rehearse performing an activity in imagery, which often later improves their performance of the actual activity. Researchers have shown that mental practice can improve actual performance in activities ranging from athletics to patient rehabilitation (Butler & Page, 2006; Driskell, Copper, & Moran, 1994; MacIntyre, Moran, & Jennings, 2002; Malouin, Belleville, Richards, Desrosiers, & Doyon, 2004; Maring, 1990; Morganti et al., 2003; Weiss, Hansen, Rost, & Beyer, 1994). In this case, imagining making movements may not only exercise the relevant brain areas, but may build associations among processes implemented in different areas—which in turn facilitates complex performance. One striking aspect of mental practice is that it can be based on observing a model perform the activity. For example, a person can observe a golf coach performing a swing, and then can imagine herself performing that swing. How can observing someone else become translated into a motor program in one’s own head? This process apparently relies on mirror neurons, which originally were discovered in Area F5 of the monkey brain (itself part of premotor
cortex). Mirror neurons respond selectively not only when the animal performs specific actions with the hand and/or mouth, but also when the animal merely observes the same actions performed by another monkey (or human). Neuroimaging and TMS studies have provided evidence that we humans also have mirror neurons. In many studies, researchers have found that the human premotor cortex is activated when humans observe other people’s actions (e.g., Fadiga, Fogassi, Pavesi, & Rizzolatti, 1995; Gangitano, Mottaghy, & Pascual-Leone, 2001; Grafton, Arbib, Fadiga, & Rizzolatti, 1996; Hari et al., 1998; Rizzolatti et al., 1996; Rizzolatti, Luppino, & Matelli, 1998). The likely homologue of Area F5 in humans is Broca’s area (typically characterized as being involved in speech production), which has prompted some authors to theorize that the mirror neurons in humans may have a crucial role not only in imitation, but also in language acquisition. Mirror neurons may also play a role in motor imagery, consistent with the idea that people often transform images by imagining what they would see if the objects were manipulated in a specific way. Although mirror neurons have not been implicated, researchers have observed a similar cross-modality transfer when people see highly overlearned stimuli. In this case, the mere act of seeing the stimuli can activate the motor system. For example, James and Gauthier (2006) found that interacting with letters engages motor areas of the brain, in addition to a larger network. Seeing and acting apparently are tightly linked in the brain, and using imagery in mental practice can further strengthen or modify that link.
SUMMARY
The great behaviorist B. F. Skinner (1977, p. 6) wrote, “There is no evidence of the mental construction of images to be looked at or maps to be followed. The body responds to the world, at the point of contact; making copies would be a waste of time.” We hope that we have convinced the reader that the first part of this claim is incorrect; images are in fact internal representations. With the advent of neuroimaging, imagery may no longer be seen as an awkward holdover from a previous, less rigorous age, a topic unfit for polite company. Rather, researchers agree that most of the neural processes underlying like-modality perception are also used in imagery, and imagery in many ways can “stand in” for (re-present, if you will) a perceptual stimulus or situation. Imagery can not only engage the motor system, but also can engage the autonomic and limbic systems. Nevertheless, many questions remain. For example, under what circumstances is early sensory cortex always recruited during imagery? Why is early sensory cortex often recruited during visual mental imagery, but not during
auditory imagery? Why do people differ so much in their imagery abilities? Does genetics affect some aspects of imagery more than others? How does semantic content in images engage specific mechanisms? Unlike even 20 years ago, questions such as these can now be answered.
REFERENCES
Bartolomeo, P. (2002). The relationship between visual perception and visual mental imagery: A reappraisal of the neuropsychological evidence. Cortex, 38, 357–378.
Behrmann, M. (2000). The mind’s eye mapped onto the brain’s matter. Current Directions in Psychological Science, 9, 50–54.
Behrmann, M., Winocur, G., & Moscovitch, M. (1992, October 15). Dissociation between mental imagery and object recognition in a brain-damaged patient. Nature, 359, 636–637.
Butler, A. J., & Page, S. J. (2006). Mental practice with motor imagery: Evidence for motor recovery and cortical reorganization after stroke. Archives of Physical Medicine and Rehabilitation, 87, S2–S11.
Chatterjee, A., & Southwood, M. H. (1995). Cortical blindness and visual imagery. Neurology, 45, 2189–2195.
Cohen, M. S., Kosslyn, S. M., Breiter, H. C., DiGirolamo, G. J., Thompson, W. L., Bookheimer, S. Y., Belliveau, J. W., & Rosen, B. R. (1996). Changes in cortical activity during mental rotation: A mapping study using functional MRI. Brain, 119, 89–100.
Crick, F., & Koch, C. (1995, May 11). Are we aware of neural activity in primary visual cortex? Nature, 375, 121–123.
Decety, J. (1996). Neural representation for action. Reviews in the Neurosciences, 7, 285–297.
Decety, J., & Jeannerod, M. (1995). Mentally simulated movements in virtual reality: Does Fitts’s law hold in motor imagery? Behavioral Brain Research, 72(1/2), 127–134.
De Vreese, L. P. (1991). Two systems for colour-naming defects: Verbal disconnection vs colour imagery disorder. Neuropsychologia, 29, 1–18.
Downing, P. E., Chan, A. W., Peelen, M. V., Dodds, C. M., & Kanwisher, N. (2006). Domain specificity in visual cortex. Cerebral Cortex, 16, 1453–1461.
Driskell, J., Copper, C., & Moran, A. (1994). Does mental practice enhance performance? Journal of Applied Psychology, 79, 481–492.
Fadiga, L., Fogassi, L., Pavesi, G., & Rizzolatti, G. (1995). Motor facilitation during action observation: A magnetic stimulation study. Journal of Neurophysiology, 73, 2608–2611.
Farah, M. J. (1984). The neurological basis of mental imagery: A componential analysis. Cognition, 18, 245–272.
Farah, M. J., Soso, M. J., & Dasheiff, R. M. (1992). Visual angle of the mind’s eye before and after unilateral occipital lobectomy. Journal of Experimental Psychology: Human Perception and Performance, 18, 241–246.
Gangitano, M., Mottaghy, F. M., & Pascual-Leone, A. (2001). Phase-specific modulation of cortical motor output during movement observation. NeuroReport, 12, 1489–1492.
Ganis, G., Keenan, J. P., Kosslyn, S. M., & Pascual-Leone, A. (2000). Transcranial magnetic stimulation of primary motor cortex affects mental rotation. Cerebral Cortex, 10, 175–180.
Ganis, G., Thompson, W. L., & Kosslyn, S. M. (2004). Brain areas underlying visual mental imagery and visual perception: An fMRI study. Brain Research: Cognitive Brain Research, 20, 226–241.
Ganis, G., Thompson, W. L., Mast, F. W., & Kosslyn, S. M. (2003). Visual imagery in cerebral visual dysfunction. Neurologic Clinics, 21, 631–646.
Georgopoulos, A. P., Lurito, J. T., Petrides, M., Schwartz, A. B., & Massey, J. T. (1989, January 13). Mental rotation of the neuronal population vector. Science, 243, 234–236.
Kosslyn, S. M., Pascual-Leone, A., Felician, O., Camposano, S., Keenan, J. P., Thompson, W. L., Ganis, G., Sukel, K. E., & Alpert, N. M. (1999, April 2). The role of area 17 in visual imagery: Convergent evidence from PET and rTMS. Science, 284, 167–170.
Grafton, S. T., Arbib, M. A., Fadiga, L., & Rizzolatti, G. (1996). Localization of grasp representations in humans by positron emission tomography: Vol. 2. Observation compared with imagination. Experimental Brain Research, 112, 103–111.
Kosslyn, S. M., Shin, L. M., Thompson, W. L., McNally, R. J., Rauch, S. L., Pitman, R. K., & Alpert, N. M. (1996). Neural effects of visualizing and perceiving aversive stimuli: A PET investigation. NeuroReport, 7, 1569–1576.
Griffiths, T. D. (2000). Musical hallucinosis in acquired deafness: Phenomenology and brain substrate. Brain, 123, 2065–2076.
Kosslyn, S. M., & Thompson, W. L. (2003). When is early visual cortex activated during visual mental imagery? Psychological Bulletin, 129, 723–746.
Halpern, A. R., & Zatorre, R. J. (1999). When that tune runs through your head: A PET investigation of auditory imagery for familiar melodies. Cerebral Cortex, 9, 697–704.
Hari, R., Forss, N., Avikainen, S., Kirveskari, E., Salenius, S., & Rizzolatti, G. (1998). Activation of human primary motor cortex during action observation: A neuromagnetic study. Proceedings of the National Academy of Sciences, USA, 95, 15061–15065.
Hasson, U., Harel, M., Levy, I., & Malach, R. (2003). Large-scale mirror-symmetry organization of human occipito-temporal object areas. Neuron, 37, 1027–1041.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001, September 28). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.
Kosslyn, S. M., Thompson, W. L., & Alpert, N. M. (1997). Neural systems shared by visual imagery and visual perception: A positron emission tomography study. NeuroImage, 6, 320–334. Kosslyn, S. M., Thompson, W. L., & Ganis, G. (2006). The case for mental imagery. New York: Oxford University Press. Kosslyn, S. M., Thompson, W. L., Kim, I. J., & Alpert, N. M. (1995, November 30). Topographical representations of mental images in primary visual cortex. Nature, 378, 496–498. Kosslyn, S. M., Thompson, W. L., Kim, I. J., Rauch, S. L., & Alpert, N. M. (1996). Individual differences in cerebral blood flow in area 17 predict the time to evaluate visualized letters. Journal of Cognitive Neuroscience, 8, 78–82.
James, K. H., & Gauthier, I. (2006). Letter processing automatically recruits a sensory-motor brain network. Neuropsychologia, 44, 2937–2949.
Kosslyn, S. M., Thompson, W. L., Wraga, M., & Alpert, N. M. (2001). Imagining rotation by endogenous versus exogenous forces: Distinct neural mechanisms. NeuroReport, 12, 2519–2525.
Jankowiak, J., Kinsbourne, M., Shalev, R. S., & Bachman, D. L. (1992). Preserved visual imagery and categorization in a case of associative visual agnosia. Journal of Cognitive Neuroscience, 4, 119–131.
Kreiman, G., Koch, C., & Fried, I. (2000, November 16). Imagery neurons in the human brain. Nature, 408, 357–361.
Jeannerod, M. (1994). The representing brain: Neural correlates of motor intention and imagery. Behavioral and Brain Sciences, 17, 187–245.
Lang, P. J., Greenwald, M. K., Bradley, M. M., & Hamm, A. O. (1993). Looking at pictures: Affective, facial, visceral, and behavioral reactions. Psychophysiology, 30, 261–273.
Jeannerod, M., & Decety, J. (1995). Mental motor imagery: A window into the representational stages of action. Current Opinion in Neurobiology, 5, 727–732.
LeDoux, J. E. (1995). Emotion: Clues from the brain. Annual Reviews of Psychology, 46, 209–235.
Jordan, K., Heinze, H. J., Lutz, K., Kanowski, M., & Jancke, L. (2001). Cortical activations during the mental rotation of different visual objects. NeuroImage, 13, 143–152. Kammer, T., Puls, K., Erb, M., & Grodd, W. (2005). Transcranial magnetic stimulation in the visual system: Pt. II. Characterization of induced phosphenes and scotomas. Experimental Brain Research, 160, 129–140. Kammer, T., Puls, K., Strasburger, H., Hill, N. J., & Wichmann, F. A. (2005). Transcranial magnetic stimulation in the visual system: Pt. I. The psychophysics of visual suppression. Experimental Brain Research, 160, 118–128. Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.
Kosslyn, S. M., Ganis, G., & Thompson, W. L. (2001). Neural foundations of imagery. Nature Reviews Neuroscience, 2, 635–642.
LeDoux, J. E. (1996). The emotional brain: The mysterious underpinnings of emotional life. New York: Simon & Schuster. Levine, D. N., Warach, J., & Farah, M. J. (1985). Two visual systems in mental imagery: Dissociation of ‘what’ and ‘where’ in imagery disorders due to bilateral posterior cerebral lesions. Neurology, 35, 1010–1018. MacIntyre, T., Moran, A., & Jennings, D. J. (2002). Are mental imagery abilities related to canoe-slalom performance? Perceptual & Motor Skills, 94, 1245–1250. Malouin, F., Belleville, S., Richards, C. L., Desrosiers, J., & Doyon, J. (2004). Working memory and mental practice outcomes after stroke. Archives of Physical Medicine and Rehabilitation, 85, 177–183. Maring, J. R. (1990). Effects of mental practice on rate of skill acquisition. Physical Therapy, 70, 165–172.
Kanwisher, N., & Yovel, G. (2006). The fusiform face area: A cortical region specialized for the perception of faces. Philosophical Transactions of the Royal Society of London, Series B, 361, 2109–2128.
Mechelli, A., Price, C. J., Friston, K. J., & Ishai, A. (2004). Where bottom-up meets top-down: Neuronal interactions during perception and imagery. Cerebral Cortex, 14, 1256–1265.
Klein, I., Dubois, J., Mangin, J. F., Kherif, F., Flandin, G., Poline, J. B., Denis, M., Kosslyn, S. M., & Le Bihan, D. (2004). Retinotopic organization of visual mental images as revealed by functional magnetic resonance imaging. Brain Research: Cognitive Brain Research, 22, 26–31.
Mellet, E., Briscogne, S., Tzourio-Mazoyer, N., Ghaëm, O., Petit, L., Zago, L., Etard, O., Berthoz, A., Mazoyer, B., & Denis, M. (2000). Neural correlates of topographic mental exploration: The impact of route versus survey perspective learning. NeuroImage, 12, 588–600.
Klein, I., Paradis, A.-L., Poline, J.-B., Kosslyn, S. M., & Le Bihan, D. (2000). Transient activity in human calcarine cortex during visual imagery. Journal of Cognitive Neuroscience, 12, 15–23.
Mellet, E., Tzourio, N., Crivello, F., Joliot, M., Denis, M., & Mazoyer, B. (1996). Functional anatomy of spatial mental imagery generated from verbal instructions. Journal of Neuroscience, 16, 6504–6512.
Kosslyn, S. M., DiGirolamo, G., Thompson, W. L., & Alpert, N. M. (1998). Mental rotation of objects versus hands: Neural mechanisms revealed by positron emission tomography. Psychophysiology, 35, 151–161.
Mellet, E., Tzourio, N., Denis, M., & Mazoyer, B. (1995). A positron emission tomography study of visual and mental spatial exploration. Journal of Cognitive Neuroscience, 4, 433–445.
Morganti, F., Gaggioli, A., Castelnuovo, G., Bulla, D., Vettorello, M., & Riva, G. (2003). The use of technology-supported mental imagery in neurological rehabilitation: A research protocol. Cyberpsychology and Behavior, 6, 421–427.
Ng, V. W., Bullmore, E. T., de Zubicaray, G. I., Cooper, A., Suckling, J., & Williams, S. C. R. (2001). Identifying rate-limiting nodes in large-scale cortical networks for visuospatial processing: An illustration using fMRI. Journal of Cognitive Neuroscience, 13, 537–545.
O’Craven, K. M., & Kanwisher, N. (2000). Mental imagery of faces and places activates corresponding stimulus-specific brain regions. Journal of Cognitive Neuroscience, 12, 1013–1023.
Paivio, A. (1971). Imagery and verbal processes. New York: Holt, Rinehart and Winston.
Parsons, L. M., Fox, P. T., Downs, J. H., Glass, T., Hirsch, T. B., Martin, C. C., Jerabek, P. A., & Lancaster, J. L. (1995, May 4). Use of implicit motor imagery for visual shape discrimination as revealed by PET. Nature, 375, 54–58.
Pylyshyn, Z. W. (1973). What the mind’s eye tells the mind’s brain: A critique of mental imagery. Psychological Bulletin, 80, 1–24.
Pylyshyn, Z. W. (1981). Psychological explanations and knowledge-dependent processes. Cognition, 10(1/3), 267–274.
Slotnick, S. D., Thompson, W. L., & Kosslyn, S. M. (2005). Visual mental imagery induces retinotopically organized activation of early visual areas. Cerebral Cortex, 15, 1570–1583.
Thirion, B., Duchesnay, E., Hubbard, E., Dubois, J., Poline, J.-B., Lebihan, D., & Dehaene, S. (2006). Inverse retinotopy: Inferring the visual content of images from brain activation patterns. NeuroImage, 33, 1104–1116.
Thompson, W. L., & Kosslyn, S. M. (2000). Neural systems activated during visual mental imagery. A review and meta-analyses. In A. W. Toga & J. C. Mazziotta (Eds.), Brain mapping: Pt. II. The systems (pp. 535–560). San Diego, CA: Academic Press.
Thompson, W. L., Kosslyn, S. M., Sukel, K. E., & Alpert, N. M. (2001). Mental imagery of high- and low-resolution gratings activates Area 17. NeuroImage, 14, 454–464.
Tomasino, B., Borroni, P., Isaja, A., & Rumiati, R. I. (2005). The role of the primary motor cortex in mental rotation: A TMS study. Cognitive Neuropsychology, 22, 348–363.
Tootell, R. B. H., Hadjikhani, N. K., Mendola, J. D., Marrett, S., & Dale, A. M. (1998). From retinotopy to recognition: fMRI in human visual cortex. Trends in Cognitive Sciences, 2, 174–183.
Pylyshyn, Z. W. (2002). Mental imagery. In search of a theory. Behavioral and Brain Sciences, 25, 157–237.
Ungerleider, L. G., & Mishkin, M. (1982). Two cortical visual systems. In D. J. Ingle, M. A. Goodale, & R. J. W. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press.
Pylyshyn, Z. W. (2003a). Return of the mental image: Are there really pictures in the head? Trends in Cognitive Sciences, 7, 113–118.
Watson, J. B. (1913). Psychology as the behaviorist views it. Psychological Review, 20, 158–177.
Pylyshyn, Z. W. (2003b). Seeing and visualizing: It’s not what you think. Cambridge, MA: MIT Press.
Weiss, T., Hansen, E., Rost, R., & Beyer, L. (1994). Mental practice of motor skills used in poststroke rehabilitation has own effects on central nervous activation. International Journal of Neuroscience, 78(3–4), 157–166.
Richter, W., Somorjai, R., Summers, R., Jarmasz, M., Tegeler, C., Ugurbil, K., Menon, R., Gati, J. S., Georgopoulos, A. P., & Kim, S.-G. (2000). Motor area activity during mental rotation studied by time-resolved single-trial fMRI. Journal of Cognitive Neuroscience, 12, 310–320.
Rizzolatti, G., Fadiga, L., Matelli, M., Bettinardi, V., Paulesu, E., Perani, D., & Fazio, F. (1996). Localization of grasp representations in humans by PET: Pt. 1. Observation versus execution. Experimental Brain Research, 111, 246–252.
Wraga, M., Shephard, J. M., Church, J. A., Inati, S., & Kosslyn, S. M. (2005). Imagined rotations of self versus objects: An fMRI study. Neuropsychologia, 43, 1351–1361. Wraga, M. J., Thompson, W. L., Alpert, N. M., & Kosslyn, S. M. (2003). Implicit transfer of motor strategies in mental rotation. Brain and Cognition, 52, 135–143.
Rizzolatti, G., Luppino, G., & Matelli, M. (1998). The organization of the cortical motor system: New concepts. Electroencephalography and Clinical Neurophysiology, 106, 283–296.
Young, A. W., Humphreys, G. W., Riddoch, M. J., Hellawell, D. J., & de Haan, E. H. (1994). Recognition impairments and face imagery. Neuropsychologia, 32, 693–702.
Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., Brady, T. J., Rosen, B. R., & Tootell, R. B. H. (1995, May 12). Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science, 268, 889–893.
Zatorre, R. J., & Halpern, A. R. (1993). Effect of unilateral temporal-lobe excision on perception and imagery of songs. Neuropsychologia, 31, 221–232.
Shepard, R. N., & Metzler, J. (1971, February 19). Mental rotation of three-dimensional objects. Science, 171, 701–703.
Skinner, B. F. (1977). Why I am not a cognitive psychologist. Behaviorism, 5, 1–10.
Zatorre, R. J., & Halpern, A. R. (2005). Mental concerts: Musical imagery and auditory cortex. Neuron, 47, 9–12. Zatorre, R. J., Halpern, A. R., Perry, D. W., Meyer, E., & Evans, A. C. (1996). Hearing in the mind’s ear: A PET investigation of musical imagery and perception. Journal of Cognitive Neuroscience, 8, 29–46.
Chapter 20
Categorization
MICHAEL L. MACK, JENNIFER J. RICHLER, THOMAS J. PALMERI, AND ISABEL GAUTHIER
The survival of most organisms demands that they discriminate predator from prey, edible from inedible, or family from foe. Organisms have to be able to recognize things as kinds of things, not as isolated instances, because what is learned about one thing should generalize to other things of the same kind. We call these kinds of things categories. Recognizing something in the world as a kind of thing is categorization. Organisms may also identify unique objects as individuals, but arguably this identification can be considered a fine-grained form of categorization because matching different views of the same object, or even the same object changing over time, requires labeling different experiences as belonging to the same category. Once a thing is categorized or identified, all of the knowledge we might have about that category can be brought to bear. What’s the most appropriate course of action? Flee? Eat it? Pick up and dial? Humans take categorization to dizzying degrees. First there is the mundane. We easily categorize chairs from tables, trees from shrubs, and birds from dogs. And there is the remarkable. Experts from various domains may easily discriminate subspecies of particular kinds of plants or animals, judge cancerous from noncancerous growths, or distinguish Porsche models just by the shape of the headlight. While this may seem impressive, remember that many everyday categorizations prove remarkable when you consider the processing demands involved. We easily identify the people we know at a glance. Yet structurally, people may be as similar to one another as different chimpanzees. For most people, all chimpanzees look the same but people look much more different. Right now you are engaging in another everyday categorization: With remarkable speed and ease, the letters and words in this sentence are categorized
as just the first step of comprehending (at least we hope) our written language. Face and letter perception are examples of domains in which most people have gained considerable expertise and are very important domains of study. This chapter mainly addresses how people categorize visual objects. People can also categorize things based on their sound, touch, taste, or smell. But outside of speech perception, the majority of categorization research has focused on the visual modality. More complex visual events can also be categorized, such as “a nod” or “a touchdown” or “an armstand back double somersault tuck,” but this has been for the most part studied separately from object categorization (e.g., Zacks, Speer, Swallow, Braver, & Reynolds, 2007). In keeping with the aims of this Handbook, in each section of this chapter we lay out a variety of fundamental behavioral manifestations of object categorization and review some of the key findings from neurophysiology, electrophysiology, neuropsychology, and functional brain imaging that have deepened our understanding of object categorization. We also look to computational cognitive neuroscience models grounded in neuroanatomy and neurophysiology. We begin our discussion with the issue of abstraction. By its very nature, categorization is abstraction. We live in a world of particular experiences. Yet recognizing an object as not simply an isolated perceptual experience but also as an instance of a kind of thing that has been experienced before—as a member of a category—is to abstract from the particular to the general. Does this ability to abstract from particular experience mean that what we know about an object category is itself an abstraction? At first blush, it may seem like the answer is obviously yes. How could we categorize objects abstractly if our knowledge about categories was not itself abstract? 
But as we will see, decades of behavioral research wrestled with this basic issue and recent neuroscientific evidence has shed important light on this question. We then turn to two parallel issues that have dominated much of the recent research on object categories: (a) The
This work was supported by a grant from the James S. McDonnell Foundation, grant HSD-DHBS05 from the National Science Foundation, and the Temporal Dynamics of Learning Center (NSF Science of Learning Center SBE-0542013). The order of the first two authors was decided by a flip of a coin.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
study of when different kinds of knowledge representations, abstract or not, come into play, and whether the variety of categorization behaviors we can observe is best explained by different learning and memory systems in the brain; and (b) whether objects from different domains may be categorized in different ways and by different brain systems. For instance, there may be specialized systems in the brain to process objects from especially important categories such as letters and faces. Whether and how we acquire such specialization through learning, or whether we have evolved systems for some special categories, has been a topic of debate.
ROLE OF ABSTRACTION IN CATEGORIZATION

Categorization is abstraction. To begin with, we never see the same object twice, even if it is the very same physical object. When an object is viewed from a different position or under different lighting, the projection of that object onto our retina will vary, often quite dramatically. What is remarkable is that, despite the visual signal being very different, we perceive the same object (Palmeri & Gauthier, 2004; Palmeri & Tarr, 2008). Moreover, physically different objects can be perceived as very different, yet even very young children know that they are the same thing—not the same object, but the same kind of thing (Quinn, 1999). Humans have developed complex systems that permit objects to be categorized at multiple levels of abstraction, from specific (e.g., “Gladys” or “American White Pelican”), to basic level categories (e.g., “chair” or “bird”), to extremely abstract superordinate categories (e.g., “living thing”; Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976).

Categorization is a form of abstraction, but does this necessarily imply that the mental representations and processes involved are inherently abstract? Early theories of object categorization took it as nearly an axiom that the goal of visual cognition was to create an abstract representation of the varying world. Early structural description theories of object recognition assumed that the goal of vision was to mentally reconstruct the abstract three-dimensional structure of objects (Marr & Nishihara, 1978). Recognition-by-components (Biederman, 1987) assumes that objects are mentally represented in terms of a small set of qualitative three-dimensional primitives known as geons (Figure 20.1). Geons are uniquely recovered by attending to various configurations of view-invariant properties in the two-dimensional retinal image. Objects are represented in terms of their geon components and their relative spatial
Figure 20.1 Illustration of object representations in image-based versus structural description models. Note: (Top) Image-based models (panel label: Image fragments). The object (lamp) is represented in terms of image-based fragments of intermediate complexity. (Bottom) Structural description models (panel label: Geons). The object (lamp) is represented in terms of geometric primitives (geons) and the spatial relations between them.
configuration. As a consequence of this reconstruction into a geon-defined structure, many sources of variability are eliminated entirely from a mental representation of an object. Different views of the same object and different exemplars within a category such as dog or lamp map onto the same object representation. Early concept models also assumed that our knowledge about object categories is abstract. Semantic network models (e.g., Collins & Quillian, 1969) conceptually organized one kind of thing with another kind of thing through propositional structures. Knowledge is stored efficiently, so that object properties that are true of a superordinate category are only stored at the most general level and only properties unique to subordinate categories or specific individuals are stored at lower levels of the conceptual hierarchy (E. Smith, Shoben, & Rips, 1974). According to this view, what we know about particular object categories is also abstracted away from our experience with objects. By such abstractionist views, categorization of an object requires applying logical rules to object properties (e.g., Bruner, Goodnow, & Austin, 1956; Johansen & Palmeri, 2002) or comparing an object to an abstract category prototype or schema (e.g., Lakoff, 1987). Category abstraction is achieved because our knowledge about categories is abstract. However, later work showed that we do not need viewpoint-invariant and instance-invariant representations in order to achieve categorization that appears invariant across viewing conditions and invariant across instances of a category. Careful experimentation revealed that object categorization can be systematically affected by the particular
viewpoints and category instances that have been experienced (see Palmeri & Gauthier, 2004; Palmeri & Tarr, 2008). While there are conditions under which humans readily recognize and categorize objects irrespective of viewpoint (Biederman & Gerhardstein, 1993; Tarr & Bülthoff, 1998), numerous studies have found that if observers learn to recognize novel objects from specific viewpoints, they are both faster and more accurate at recognizing these same objects from familiar viewpoints relative to unfamiliar viewpoints (Bülthoff & Edelman, 1992; Tarr & Bülthoff, 1995; Tarr & Pinker, 1989). Even the recognition of single geons, originally proposed to support view-invariant performance with more complex objects, is sensitive to changes in viewpoint (Tarr, Williams, Hayward, & Gauthier, 1998). Instead, human object recognition was proposed to rely on multiple views, where each view encodes the appearance of an object under specific viewing conditions, including viewpoint, pose, configuration, and lighting (Tarr, 1995; Tarr, Kersten, & Bülthoff, 1998) and a collection of such views constitutes the enduring visual representation of a given object. These ideas are instantiated in image-based models of object recognition (e.g., Edelman, 1997, 1999; Poggio & Edelman, 1990). Rather than assume that the goal of vision is to reconstruct the three-dimensional world, image-based models stress the importance of generalizing from past experiences to the present experience (Shepard, 1987, 1994). This is done by remembering past views of objects and generalizing based on similarity to those stored views. Such models account well for patterns of interpolation and extrapolation to new views. Furthermore, since physically similar objects in the world viewed under similar conditions will be similar to the same set of stored views, generalization to new objects can occur without any explicit representation of three-dimensional shape. 
For purposes of object recognition and categorization, representation of three-dimensional shape may not be necessary. Instead, such information may be stored in parts of the brain involved in acting on objects (Goodale & Milner, 1992). Similar computational principles are also at work in exemplar-based models of object categorization. The core principle of these models is that object categories are mentally represented in terms of the specific category exemplars that have been previously experienced (Kruschke, 1992; Medin & Schaffer, 1978; Nosofsky, 1986). Categorization is based on the relative similarity of an object to these stored exemplars. In that sense, you judge that a certain object is a cell phone because of its similarity to many other cell phones in memory. While no abstraction occurs, exemplar models can readily account for a range of prototypicality effects that might at first blush appear to demonstrate abstract prototype representations for categories (Busemeyer, Dewey, & Medin, 1984; Hintzman, 1986;
Figure 20.2 Top: Exemplar-based models of categorization assume that object categories are represented by storing category exemplars that were previously experienced. Middle: Prototype models assume category knowledge is based on a stored prototype abstracted from the experienced category examples. Bottom: Rule-based models represent category knowledge with logical rules along individual features. Note: (Top) Exemplars are represented as points in multidimensional psychological space, with similarity a function of distance in that space, and the generalization gradient around an object indicated by the graded shading around each exemplar. The exemplars on the left (darker circles) represent one category and the exemplars on the right (lighter circles) represent a different category. A probed object (question mark) is categorized based on the relative similarity to stored exemplars in each category. (Middle) A probed object (question mark) is categorized based on the relative similarity to the different category prototypes. (Bottom) These rules partition psychological space into different regions. A probed object (question mark) is categorized according to what region it lies within relative to the category rule. Panel labels: Exemplar-based, Prototype-based, Rule-based; axis label: Feature A (tick values 1 to 5).
Nosofsky, 1988; Shin & Nosofsky, 1992; see Figures 20.2 and 20.3). These models also account for a range of category exemplar effects (Nosofsky, Kruschke, & McKinley, 1992) and the time course of categorization (Lamberts, 2000; Nosofsky & Palmeri, 1997; Palmeri, 1997). Neurophysiological evidence supports many important assumptions underlying a host of image-based and exemplar-based models of object categorization (refer to Figure 20.4
Figure 20.3 Exemplar-based models can account for two important phenomena that on the surface seem to challenge exemplar-based models. Panels: (A) Prototypicality Effect (vertical axis: P(Category); items: Prototype, Member, Nonmember). (B) Recognition/Categorization Dissociation (Control Group and Amnesic Group; vertical axes: Accuracy (%), 0 to 100, for Recognition and for Categorization). Note: A: The top panel illustrates prototypicality effects. Category prototypes are usually categorized as well as and sometimes better than category exemplars, even if the category prototypes have never been seen before (far right graph). This result is typically viewed as strong evidence for prototype abstraction. How else could an object that’s never been seen before be classified as well as objects that have been trained on, unless that unseen prototype is in fact abstracted during learning and stored just like an experienced exemplar? But assuming that categorization is based on similarity to stored exemplars only, this prototypicality effect falls out quite naturally. The left and middle figures illustrate how category exemplars and the category prototype might be represented in a psychological space, with the prototype in the middle, the exemplars around the prototype, and distance between objects related to their psychological similarity; the cloud around each point represents the generalization gradients around each stored exemplar. As shown in the left figure, the prototype to be classified (indicated by ?) is similar to many exemplars, yielding a lot of evidence in favor of category membership. By contrast, as shown in the middle figure, an individual category exemplar to be classified (indicated by ?) may only be similar to a subset of exemplars, yielding smaller category evidence compared to that for an unseen prototype. B: The bottom panel illustrates dissociations between categorization and recognition memory. As discussed in the text, whereas amnesic individuals and controls show similar performance on categorization, amnesic individuals are significantly impaired at recognition memory (far right graphs). This behavioral dissociation suggests a functional dissociation between categorization and recognition. As in the top panel, individual exemplars are represented as points in a psychological space with the clouds around each point representing the generalization gradients. Following Nosofsky and Zaki (1998) we assume here that amnesic individuals have far poorer exemplar memories than controls, as indicated by the far more diffuse generalization gradients because of impaired memory for amnesic individuals. For categorization, all of the category exemplars are crowded together in the same general region of psychological space. Having finely tuned or diffuse exemplar memories has little impact on categorization because all of the category members are in the same part of the psychological space. However, having finely tuned or diffuse exemplar memories does have significant impact on recognition because the space of old and new patterns is distributed uniformly throughout psychological space; having more diffuse memories makes it far more difficult to discriminate between old objects that have been seen and stored, albeit poorly, from new objects.
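The exemplar account of the prototypicality effect described in the Figure 20.3 caption can be sketched numerically. The following is a minimal illustration, not the fitted model from any cited study: the exponential similarity function follows the general form used in exemplar models, but the two-dimensional layout of exemplars, the sensitivity value C, and the response criterion K are hypothetical values chosen only for illustration.

```python
import math


def similarity(x, y, c):
    """Similarity decays exponentially with Euclidean distance (sensitivity c)."""
    return math.exp(-c * math.dist(x, y))


def summed_similarity(probe, exemplars, c):
    """Category evidence: summed similarity of the probe to all stored exemplars."""
    return sum(similarity(probe, e, c) for e in exemplars)


def p_member(probe, exemplars, c, k):
    """Probability of endorsing category membership, given a criterion k (assumed form)."""
    s = summed_similarity(probe, exemplars, c)
    return s / (s + k)


# Hypothetical category: 8 stored exemplars on a unit circle around a prototype
# at the origin. The prototype itself is never stored.
N = 8
EXEMPLARS = [
    (math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N)) for i in range(N)
]
PROTOTYPE = (0.0, 0.0)
C = 0.5  # hypothetical sensitivity (broad generalization gradients)
K = 2.0  # hypothetical response criterion

proto_p = p_member(PROTOTYPE, EXEMPLARS, C, K)       # unseen prototype
old_p = p_member(EXEMPLARS[0], EXEMPLARS, C, K)      # a studied exemplar
far_p = p_member((4.0, 0.0), EXEMPLARS, C, K)        # a distant nonmember

print(f"prototype: {proto_p:.3f}  old exemplar: {old_p:.3f}  nonmember: {far_p:.3f}")
```

With these particular values the unseen prototype receives more summed similarity than any studied exemplar, because it is moderately close to every stored item whereas each exemplar is close to only some of them. The ordering depends on the parameters: with much narrower generalization gradients (larger C), a studied exemplar's perfect match to its own memory trace dominates and the advantage reverses.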
Figure 20.4 Stages of processing in object recognition and categorization according to a range of models. Diagram labels (top to bottom): Response Units (premotor/motor cortex); Decision Units (PFC); Category representations: Rule-Based Units (AC, PFC) and Exemplar Units (AIT, BG, Hipp.); Object representations: Viewpoint Units (AIT) and Object Feature Units (V4/LOC/IT), shape/form processing; Edge extraction (low-level visual areas V1, V2); Top-Down Control. Note: Low-level features such as edges are processed in early visual areas. Object representations are created by processing object feature units in V4, lateral occipital cortex (LOC), and/or inferotemporal cortex (IT) and by processing viewpoint in anterior inferotemporal cortex (AIT). Category representations arise from rule-based units in the anterior cingulate (AC) and prefrontal cortex (PFC), or from exemplar units in the anterior inferotemporal cortex (AIT), the basal ganglia (BG), and the hippocampus (Hipp). Information from the category representations is passed to decision units in PFC, which determines category membership and initiates the selection of the appropriate category response in the premotor cortex for execution by the motor cortex. Low-level processing, object representations, and category representations can all be modulated by factors such as attention and various task demands via top-down control.
for brain areas discussed in this section). The responses of inferotemporal (IT) neurons to objects depend on stimulus size and viewpoint (Perrett, Oram, & Ashbridge, 1998; K. Tanaka, 1996). Even accepted notions of retinal position invariance in IT (Tovee, Rolls, & Azzopardi, 1994) have been challenged (DiCarlo & Maunsell, 2003; Op de Beeck & Vogels, 2000). Surprisingly few neural responses in IT are invariant to position, size, or viewpoint (DiCarlo & Maunsell,
2003; Logothetis & Sheinberg, 1996; but also see Booth & Rolls, 1998). When trained on particular object views, monkeys recognize novel object views according to their similarity to experienced views, and neurons respond in a similarly graded fashion to particular trained views (Logothetis & Pauls, 1995; Logothetis, Pauls, Bülthoff, & Poggio, 1994; Logothetis, Pauls, & Poggio, 1995). Perrett et al. (1998) provided one suggestion for how object recognition could take the form of an accumulation of evidence across all neurons selective for aspects of a given object. By assuming a neural variant of stochastic accumulation of evidence models (Nosofsky & Palmeri, 1997; P. Smith & Ratcliff, 2004) and by assuming that the rate of accumulation depends on the similarity between visible features in the presented viewpoint and those to which individual neurons are tuned, systematic effects of object recognition time and accuracy with changes in viewpoint can be well accounted for. When monkeys are trained to categorize objects, their behavior is consistent with exemplar generalization and not with the abstraction of a prototype (Sigala, Gabbiani, & Logothetis, 2002). IT neurons will respond selectively to specific exemplars that have been studied, not to an average category prototype that was never studied (Freedman, Riesenhuber, Poggio, & Miller, 2003; Op de Beeck, Wagemans, & Vogels, 2001, 2008; Vogels, Biederman, Bar, & Lorincz, 2001). Furthermore, many exemplar models of object categorization assume that similarity between objects is heavily influenced by matches or mismatches along dimensions that are diagnostic of category membership (Gauthier & Palmeri, 2002; Kruschke, 1992; Lamberts, 2000; Nosofsky, 1984, 1986), and neural responses are modulated by dimensional diagnosticity in a similar manner (Sigala & Logothetis, 2002). 
While responses of IT neurons can be specific to particular exemplars that have been experienced, IT neurons do not seem to respond in a category-specific manner. Instead, category-specific, but not exemplar-specific, neural responses are observed in the prefrontal cortex (Freedman et al., 2003; Jiang et al., 2007; Rotshtein, Henson, Treves, Driver, & Dolan, 2005). These neurophysiological results may seem at odds with the apparent category specificity observed using functional magnetic resonance imaging (fMRI) and in the patterns of deficits in category-specific agnosia due to focal brain injury, which we discuss later in this chapter. One way to reconcile these results is to first consider the vast differences in spatial resolution between single-unit recordings and fMRI or brain lesions. Although individual neurons may respond in a way that highlights exemplar-specific (not category-specific) information, neighboring regions of the cortex may respond to similar objects or objects that are processed in a similar fashion. So objects in the same category may recruit the same area
of the cortex as measured by fMRI or may be impaired in a category-specific fashion by brain injury, yet the underlying neural activity may respect exemplar-specific and view-specific coding, not category-specific coding per se.

Some recent neurally plausible computational models have instantiated this division of labor between learned object representations in IT and learned category representations elsewhere in the brain. For example, the theoretical work of Riesenhuber and Poggio (2000) represents a recent instantiation of a tradition of image-based models of object recognition (Edelman, 1997; Poggio & Edelman, 1990). This model builds on classical models where complex cells are built from simple cells in early visual areas, extending this hierarchy of processing throughout the higher-level visual cortex to view-tuned and exemplar-tuned units. At each level of the hierarchy, these units have Gaussian-shaped receptive fields (radial-basis functions) that respond preferentially to a particular stimulus property, whether that be edges or junctions at the lowest level, or views or exemplars at the highest level. Category-specific units that can represent knowledge of the basic-level category of an object or the subordinate-level identity of an object are thought to reside in the prefrontal cortex. Other computational models have proposed a similar division of labor between exemplar-like object representations in IT and category representations elsewhere, implicating brain structures such as the basal ganglia as well as the prefrontal cortex in mapping object-specific representations to category-specific representations (Ashby, Ennis, & Spiering, 2007; but see Love & Gureckis, 2007).
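The idea of view-tuned units with Gaussian-shaped (radial-basis function) tuning can be illustrated with a short sketch. This is not the Riesenhuber and Poggio architecture itself: it models only a single stage, treats viewpoint as one rotation angle, and pools over view-tuned units by taking the maximum response. The trained views, the tuning width, and the pooling rule are all assumptions made for illustration.

```python
import math


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def view_unit_response(view_deg, preferred_deg, sigma_deg):
    """Gaussian (radial-basis) tuning around the unit's preferred view."""
    d = angular_difference(view_deg, preferred_deg)
    return math.exp(-(d ** 2) / (2.0 * sigma_deg ** 2))


def object_unit_response(view_deg, trained_views, sigma_deg):
    """Pool over view-tuned units by taking the strongest response (assumed rule)."""
    return max(view_unit_response(view_deg, p, sigma_deg) for p in trained_views)


TRAINED_VIEWS = [0.0, 60.0]  # hypothetical studied viewpoints
SIGMA = 30.0                 # hypothetical tuning width in degrees

for probe in (0.0, 30.0, 150.0):
    r = object_unit_response(probe, TRAINED_VIEWS, SIGMA)
    print(f"view {probe:5.1f} deg -> response {r:.3f}")
```

Under these assumptions the response is maximal at a trained view, intermediate for a view that falls between trained views, and weak for a view far from any trained view. This is the graded, similarity-based pattern of generalization that image-based accounts predict.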
The hierarchical object representations instantiated in such models make us reflect on one key difference between classic structural description and image-based theories: Under the cartoon view of the world, structural descriptions represent objects in terms of viewpoint-independent three-dimensional parts and their spatial relations (Biederman, 1987), and views represent objects in terms of holistic images of the entire object (Edelman, 1997). However, intuition and empirical evidence (e.g., Garner, 1974; Stankiewicz, 2002; Tversky, 1977) suggest that we often represent complex objects in a compositional manner—objects are decomposable into parts. In addition, most exemplar-based and related models of object categorization assume that objects have parts, features, or dimensions that can be selectively attended according to how diagnostic they are for categorization decisions. Is there a way to marry the best qualities of image-based theories with the compositional representations seen in structural-description theories? Some studies attempt to uncover image features that are most informative for classification, based on the mutual information (or mutual dependence) of features and specified categories (Schyns & Rodet, 1997; Ullman, Vidal-Naquet,
& Sali, 2002). Some of this work has found that features of “intermediate complexity” are best for basic-level classification (see Figure 20.1). For faces, the features that emerge from this analysis are those we would generally call the “parts of a face” such as the eyes or the nose, even though the features are not selected a priori to correspond to meaningful parts per se; and for cars, parts such as the wheels or the driver’s side window emerge. In this context, we mean “emerge” in the sense that these features are uncovered by a computational analysis of hundreds of images as they relate to categories of objects without any kind of intervention from a human teacher. It is tempting to speculate about the relationship between such “ad hoc” image-based features and the observed feature selectivity of neurons in IT (K. Tanaka, 1996, 2003). The best responses for individual IT neurons are elicited by somewhat odd patterns that do not correspond to what we might typically think of as distinct object parts. These appear to be ad hoc. And they appear to be of intermediate complexity. So representations of object parts, as well as objects themselves, seem to be tuned by specific experience with objects in the world; object parts are not general-purpose parts such as those instantiated in models like recognition-by-components.
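The mutual information between the presence of an image feature and category membership can be computed directly from a table of counts. The sketch below is a generic mutual information calculation, not the specific procedure of the studies cited above, and the three count tables are invented to illustrate why a fragment of intermediate complexity might be the most informative: a very simple fragment appears in almost every image regardless of category, while a very complex fragment appears in almost no images at all.

```python
import math


def mutual_information(joint_counts):
    """Mutual information in bits between feature value (rows) and class (columns).

    joint_counts[f][c] is the number of images with feature value f and class c.
    """
    total = sum(sum(row) for row in joint_counts)
    n_classes = len(joint_counts[0])
    p_feature = [sum(row) / total for row in joint_counts]
    p_class = [sum(row[c] for row in joint_counts) / total for c in range(n_classes)]
    mi = 0.0
    for f, row in enumerate(joint_counts):
        for c, n in enumerate(row):
            if n:  # empty cells contribute nothing
                p = n / total
                mi += p * math.log2(p / (p_feature[f] * p_class[c]))
    return mi


# Hypothetical counts over 100 non-face and 100 face images.
# Rows: fragment absent, fragment present. Columns: non-face, face.
TOO_SIMPLE = [[20, 10], [80, 90]]    # present almost everywhere
INTERMEDIATE = [[95, 30], [5, 70]]   # common in faces, rare elsewhere
TOO_COMPLEX = [[100, 90], [0, 10]]   # rarely present even in faces

for name, table in [
    ("too simple", TOO_SIMPLE),
    ("intermediate", INTERMEDIATE),
    ("too complex", TOO_COMPLEX),
]:
    print(f"{name:>12}: {mutual_information(table):.3f} bits")
```

With these invented counts the intermediate fragment carries the most information about the category, which mirrors the qualitative finding described in the text.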
MEMORY AND LEARNING SYSTEMS THAT SUPPORT CATEGORIZATION

The role of abstraction in categorization defined much of the early research and debates about categorization (Murphy, 2002). Initial accounts assumed that categories are represented by abstracting logical rules (Figure 20.2) that define the necessary and sufficient conditions for category membership (Bourne, 1970; Bruner et al., 1956; Levine, 1975; Trabasso & Bower, 1968). While rule-based accounts described well how people learned categories defined by explicit rules, natural categories were found to have a graded structure that suggested instead notions like “family resemblance” and “similarity” as core constructs (Barsalou, 1985; Rosch, 1973; Wittgenstein, 1953). It is easier to categorize a robin as a bird than a penguin as a bird, the argument goes, because a robin is more similar to the prototypical bird (Rosch & Mervis, 1975). Such results suggested that prototypes (Figure 20.2), not rules, define natural categories and that prototypes are learned by abstracting core properties of the category from experience with category members (Homa, Cross, Cornell, Goldman, & Schwartz, 1973; Posner & Keele, 1968; J. D. Smith & Minda, 1998). But as discussed earlier, later work showed that models assuming specific exemplar representations (Figure 20.2), instead of abstract prototype representations, can account well for prototype effects,
a whole host of other behavioral effects, and are consistent with a significant amount of the neurophysiological data (Figure 20.3). Arguably, most successful models of categorization have an exemplar model as a critical component (Erickson & Kruschke, 1998; Palmeri, 1997) or fall on a continuum between prototype abstraction models and pure exemplar models (Ashby & Waldron, 1999; Love, Medin, & Gureckis, 2004; Rosseel, 2002). Much of this early work was grounded in an assumption—a perfectly reasonable parsimonious assumption—that all kinds of categories are represented the same way at all stages of learning. Categories are represented by rules or prototypes or exemplars. More recent work has instead asked whether different kinds of category representations are used for learning different kinds of categories, under different kinds of conditions, and at different stages of learning. Some kinds of categories can be learned using rules, but others cannot. Perhaps people try to use rules when they first learn a category, but make use of other less explicit kinds of category knowledge with experience. The burgeoning interest in cognitive neuroscience over the past decade has led researchers quite naturally to ask how categories are represented in the brain. If categories can be represented in different ways at different points in learning under different conditions, it is likely that there are multiple memory and learning systems in the brain that support categorization. We should note that in this context we use the term system in the broadest possible sense: A system could reflect functionally independent kinds of representations and processes, or interacting systems, or different critical subcomponents of a single processing architecture (e.g., Palmeri & Flanery, 2002; Roediger, Buckner, & McDermott, 1999). 
Categorization and Rules

Despite the success of exemplar models of categorization, there have always been some lingering concerns about the processing and storage requirements that come with theories that demand individual memory traces of each and every experience with an object (e.g., Logan, 1988). One response to this criticism has been to view pure exemplar models as a sort of theoretical ideal point, whereas in reality categories may be represented by a subset of the space of experienced exemplars that produces a sufficient level of performance (e.g., Ashby & Waldron, 1999; Kruschke, 1992; Rosseel, 2002). But an alternative response has been to reconsider whether people might use simple rules to categorize objects. What possessed researchers to reconsider an idea that was largely abandoned decades earlier? To begin with, many subjects asked to learn novel categories will say they are forming rules, even if the rules they verbalize
do not account all that well for their own categorization behavior. In addition, it is clear that novices are often taught categories using rules. For example, field guides for identifying birds, butterflies, or mushrooms certainly include many pictures but they also include lists of critical features for distinguishing different species. In the case of mushrooms, these explicit rules can be particularly important because edible and poisonous mushrooms often look quite similar. One important factor driving this theoretical shift was the finding that when subjects were told to use a particular categorization rule, exemplar-based models could not account for the observed categorization behavior (e.g., Nosofsky, Clark, & Shin, 1989). The RULEX model (Nosofsky, Palmeri, & McKinley, 1994) posits that even when people are not given a rule or are not told to create a rule they form simple rules anyway when learning a category. What distinguishes RULEX from earlier rule-based models is that it is a rule-plus-exception model, hence the name RULEX: People form simple rules that may work pretty well and then store in memory any exceptions to those rules (see also Nosofsky & Palmeri, 1998; Palmeri & Nosofsky, 1995; Sakamoto & Love, 2004). RULEX accounts extremely well for a wide array of phenomena that are also consistent with prototype and exemplar models; and under some conditions individual subject behavior is more consistent with RULEX than exemplar or prototype models (Johansen & Palmeri, 2002; Nosofsky et al., 1994). RULEX was perhaps the first of a class of hybrid categorization models combining rules with other nonanalytic forms of category representations (Ashby, Alfonso-Reese, Turken, & Waldron, 1998; Erickson & Kruschke, 1998; Goodman, Tenenbaum, Feldman, & Griffiths, 2008; Nosofsky & Palmeri, 1998; Palmeri, 1997).
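The rule-plus-exception idea can be made concrete with a small sketch. This is a deliberately simplified, deterministic illustration and should not be read as the published RULEX model, which includes stochastic rule search and other mechanisms not shown here. The learner below picks the single binary feature that misclassifies the fewest training items, then memorizes the items that rule gets wrong. The function names and the training set are hypothetical.

```python
def learn_rule_plus_exceptions(items):
    """Find the best single-feature rule and store its misclassified items.

    items: list of (features, label), where features is a tuple of 0/1 values
    and label is 0 or 1. Returns (dimension, flip, exceptions).
    """
    n_dims = len(items[0][0])
    best = None
    for dim in range(n_dims):
        for flip in (False, True):  # flip=True means "feature value 0 -> label 1"
            errors = [(f, y) for f, y in items if (f[dim] ^ flip) != y]
            if best is None or len(errors) < len(best[2]):
                best = (dim, flip, errors)
    dim, flip, errors = best
    exceptions = {features: label for features, label in errors}
    return dim, flip, exceptions


def classify(features, model):
    """Check stored exceptions first; otherwise apply the simple rule."""
    dim, flip, exceptions = model
    if features in exceptions:
        return exceptions[features]
    return features[dim] ^ flip


# Hypothetical training set: no single feature separates the categories perfectly.
TRAINING = [
    ((1, 1, 1), 1), ((1, 0, 1), 1), ((1, 1, 0), 1), ((0, 1, 1), 1),
    ((0, 0, 0), 0), ((0, 1, 0), 0), ((0, 0, 1), 0), ((1, 0, 0), 0),
]

model = learn_rule_plus_exceptions(TRAINING)
print("rule dimension:", model[0], "exceptions stored:", len(model[2]))
```

In this example the rule on the first feature handles six of the eight training items, and the remaining two are stored as exceptions, so every training item is classified correctly with far less storage than a full exemplar memory would require.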
The success of a model like RULEX provides just one illustration of how difficult it can be to distinguish abstract rule-based from exemplar-based (or more generally similarity-based) models of categorization (see also Johansen & Palmeri, 2002; Nosofsky & Johansen, 2000). What are arguably polar extremes of the representational continuum can produce remarkably similar behavioral predictions. Researchers have more recently looked to cognitive neuroscience data for evidence for a rule-based mode of categorization. Motivated by hypotheses about the underlying neural systems supporting different kinds of categorization, Ashby, Maddox, and colleagues have conducted a series of behavioral experiments that attempt to selectively influence rule-based versus similarity-based categorization. For example, introducing certain kinds of secondary distractor tasks during category learning can selectively interfere with rule-based but not similarity-based categorization (Waldron & Ashby, 2001, but see Nosofsky & Kruschke, 2001), whereas delaying corrective feedback
can selectively interfere with similarity-based but not rule-based categorization (Maddox, Ashby, & Bohil, 2003). Neuropsychological evidence also suggests a role for rule-based categorization and provides clues as to the specific brain structures involved. For example, patients with prefrontal cortex lesions are impaired at the Wisconsin Card Sorting Test (WCST), a task that requires sorting cards according to logically defined rules (Milner, 1963; Robinson, Heaton, Lehan, & Stilson, 1980). Parkinson’s disease patients also seem to show selective impairment in rule-based but not similarity-based categorization (Ashby, Noble, Filoteo, Waldron, & Ell, 2003; Brown & Stubbs, 1988; Cools, van den Bercken, van Spaendonck, & Berger, 1984; Downes et al., 1989). Parkinson’s disease has been linked to basal ganglia damage, specifically in the head of the caudate nucleus, which has reciprocal connections to the prefrontal cortex. Additional evidence for a rule-based system comes from neuroimaging data in healthy adults. One early study contrasted similarity-based versus rule-based categorization strategies (Allen & Brooks, 1991) that seemed to recruit different networks of brain areas as revealed by PET (E. E. Smith, Patalano, & Jonides, 1998). fMRI during rule-based categorization reveals activation in the right dorsal-lateral prefrontal cortex (Konishi et al., 1998; Seger & Cincotta, 2005) and the head of the right caudate nucleus (Konishi et al., 1998; see also Lombardi et al., 1999; Monchi, Petrides, Petre, Worsley, & Dagher, 2001; Seger & Cincotta, 2005).
A variety of computational cognitive neuroscience models have implicated an interactive role for the prefrontal cortex and the basal ganglia (specifically the caudate nucleus of the striatum) in important aspects of various cognitive tasks (Ashby et al., 1998; Frank & Claus, 2006; Houk & Wise, 1995), but these models differ in important details regarding whether the basal ganglia is the core locus of learning or plays a more modulatory role. Overall, the converging results from behavioral, neuropsychological, neuroimaging, and computational studies suggest the existence of a network of brain areas, including the prefrontal cortex and the caudate, that are critically involved in rule-based categorization (Ashby & O’Brien, 2005).

Categorization as a Skill

While some categorizations require explicit rules—and sometimes complex rules at that—other categorizations are made quickly and effortlessly, and perhaps without conscious intention. Such categorization has a qualitatively different flavor from rule use and can be considered something more like a habit or a skill that can be executed automatically. Palmeri (1997) explored how categorizations as skills can become automatized through an elaboration of Logan’s
(1988) instance theory of automaticity. Instance theory is a general theory of automaticity of cognitive skills that posits a shift from more algorithmic or rule-based processing early in learning to memory retrieval of specific experienced instances later in learning (for some fMRI evidence consistent with instance theory, see Dobbins, Schnyer, Verfaellie, & Schacter, 2004; see also Logan, 1990, 2002; Palmeri, Wong, & Gauthier, 2004). Palmeri (1997) conceptualized the development of automaticity as a race between a rule-based categorization process and an exemplar-based categorization process (Nosofsky & Palmeri, 1997). Early in learning, rules are executed faster than category exemplars can be retrieved. But as more and more exemplars are experienced and are stored as part of the category representation, the exemplar-based categorization process eventually wins the race. Categorization is automatic when it is based on exemplar retrieval instead of rule use. Ashby et al. (2007) proposed a computational cognitive neuroscience model called Subcortical Pathways Enable Expertise Development (SPEED) that shares some important computational principles with instance theory and exemplar-based models of categorization (Nosofsky & Palmeri, 1997; Palmeri, 1997). Like exemplar models, SPEED is a member of a family of computational theories called “nonparametric classifiers” (Ashby & Alfonso-Reese, 1995). These models are nonparametric in the sense of a contrast with so-called “parametric classifiers” like prototype theories that assume a specific (often normal) distribution of category members (Ashby, 1992). But SPEED specifically assumes a shift from category representations mediated by cortico-striatal loops to category representations mediated by direct cortico-cortical connections. 
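The race between rule use and exemplar retrieval described above can be illustrated with a small simulation. This is only a minimal sketch under stated assumptions, not the model fitted by Palmeri (1997): the Gaussian rule-finishing time, the Weibull retrieval-time distribution, their parameter values, and all function names are hypothetical choices made for illustration. The one property taken from the text is that each stored exemplar races independently, so the fastest retrieval tends to get quicker as more exemplars accumulate.

```python
import random


def rule_finishing_time(rng):
    """Time for the rule-based process to finish (arbitrary units).

    Assumption: Gaussian with mean 1.0 and SD 0.1.
    """
    return rng.gauss(1.0, 0.1)


def exemplar_finishing_time(n_exemplars, rng):
    """Fastest retrieval among n stored exemplars.

    Assumption: each retrieval time is an independent Weibull draw
    (scale 3.0, shape 2.0); the minimum of the draws wins the race.
    """
    return min(rng.weibullvariate(3.0, 2.0) for _ in range(n_exemplars))


def proportion_exemplar_wins(n_exemplars, n_trials=2000, seed=0):
    """Estimate how often exemplar retrieval beats the rule."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        if exemplar_finishing_time(n_exemplars, rng) < rule_finishing_time(rng):
            wins += 1
    return wins / n_trials


if __name__ == "__main__":
    # More stored exemplars -> exemplar retrieval wins more often.
    for n in (1, 5, 50):
        print(n, proportion_exemplar_wins(n))
```

With few stored exemplars the rule usually finishes first; with many, exemplar retrieval dominates, which is the sense in which categorization becomes automatic in this account.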
Cortico-striatal loops appear to play an important role in category learning (Ashby et al., 1998), even if more permanent long-term category knowledge may ultimately rely on direct cortical representations. Significant evidence suggests an important role for the basal ganglia, specifically the striatum, in categorization— at least for certain kinds of categorization and at certain points in learning (Shohamy, Myers, Kalanithi, & Gluck, 2008). Huntington’s disease (HD) and Parkinson’s disease (PD) are characterized by damage to the basal ganglia (for HD there is direct damage to the striatum whereas for PD there is damage to the substantia nigra that interacts critically with the striatum). HD and PD are classically characterized by their severe motor impairments, but it has long been known that these diseases also more generally impair motor skill learning and other procedural learning tasks (e.g., Mishkin, Malamut, & Bachevalier, 1984; Saint-Cyr, Taylor, & Lang, 1988). HD and PD also impair certain kinds of category learning as well, such as those involving a probabilistic association of cues to categories
(e.g., Knowlton, Mangels, & Squire, 1996; Knowlton, Squire, et al., 1996) and those involving an integration of information across multiple stimulus dimensions (e.g., Ashby et al., 2003; Filoteo, Maddox, & Davis, 2001; Maddox, Aparicio, Marchant, & Ivry, 2005; Maddox & Filoteo, 2001). These patterns of deficits in HD and PD implicate an important role of the striatum in novel category learning (Ashby & O’Brien, 2005). In addition to such neuropsychological studies, a body of fMRI data also implicates the basal ganglia, specifically the striatum, in these kinds of novel category learning tasks (Poldrack et al., 2001; Poldrack, Prabhakaran, Seger, & Gabrieli, 1999; Poldrack & Rodriguez, 2004; Seger & Cincotta, 2005).

Categorization and Episodic Memory

Having an episodic memory allows us to recognize when we have seen particular objects in particular situations. For example, in order to recognize that you have previously seen a yawning, orange cat sitting on a green bench in a grassy park, you must be able to access a coherent memory trace that includes all of the characteristics of this scene. The relationship, both computational and neuroanatomical, between the memories used to support explicit recognition of objects and the representations used to support object categorization has been vigorously debated. On the one hand, exemplar-based models propose that the same exemplar memories used to support categorization are used to support explicit recognition as well (e.g., Nosofsky, 1991, 1992). On the other hand, some have argued that while exemplar memories may be used to support some relatively ad hoc categories (Ashby & O’Brien, 2005), they play little or no role in most kinds of categorization (e.g., Ashby et al., 2007). 
The primary source of evidence against any close relationship between episodic memory and categorization and their underlying neural underpinnings comes from studies testing individuals with anterograde amnesia, a condition characterized by profound explicit memory deficits caused by damage to the hippocampus and neighboring medial temporal brain areas. Specifically, Knowlton and Squire (1993; Squire & Knowlton, 1995; see also Reed, Squire, Patalano, E. E. Smith, & Jonides, 1999) observed a behavioral dissociation between recognition and categorization, whereby individuals with anterograde amnesia who are significantly impaired at explicit recognition memory perform normally at categorization. According to Knowlton and Squire, this behavioral dissociation between categorization and recognition provided a direct falsification of exemplar-based models. But dissociations, and even double dissociations, are only weak evidence in favor of modular theories (Plaut, 1995; Shallice, 1988). A direct instantiation of an exemplar
model, whereby simulated individuals with amnesia have significantly degraded exemplar memories compared to simulated controls, predicts the very dissociation Knowlton and Squire claimed as a falsification of exemplar models (Nosofsky & Zaki, 1998; Palmeri & Flanery, 2002). Other research supporting a functional dissociation between categorization and recognition (Filoteo et al., 2001; Reed et al., 1999; J. D. Smith & Minda, 2001) suffers from a variety of theoretical, statistical, and methodological problems (Kinder & Shanks, 2001; Palmeri & Flanery, 1999, 2002; Zaki, 2005; Zaki & Nosofsky, 2001, 2004). Moreover, there is research showing that individuals with explicit memory deficits show impairments in categorization as well (Graham et al., 2006; Hopkins, Myers, Shohamy, Grossman, & Gluck, 2004; Zaki, Nosofsky, Ramercad, & Unverzagt, 2003; see also Meeter, Myers, Shohamy, Hopkins, & Gluck, 2006). The most widely studied cases of anterograde amnesia are caused by damage to the hippocampus and associated medial temporal lobe (MTL) structures (e.g., Squire, 2004). So debates about the relationship between categorization and episodic memory engender debates about the role of the hippocampus in categorization. According to some multiple memory systems theories, explicit episodic memory is supported by the hippocampus whereas categorization involves implicit procedural memory that is supported by the basal ganglia and cortex (Squire & Zola, 1996). Some computational cognitive neuroscience models eschew entirely any role for the hippocampus in categorization (e.g., Ashby et al., 1998, 2007; Ashby & O’Brien, 2005) or do not discuss whether the hippocampus has any role (e.g., Riesenhuber & Poggio, 1999, 2002). But evidence is building for a role of the hippocampus in categorization. As discussed previously, hippocampal damage in individuals with anterograde amnesia does lead to significant categorization deficits. 
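The claim that a single exemplar memory can produce a recognition-categorization dissociation can be made concrete with a toy calculation. The sketch below is an assumption-laden illustration, not the simulation reported by Nosofsky and Zaki (1998): the one-dimensional stimuli, the exponential similarity function, the use of a lowered sensitivity parameter `c` to stand in for degraded memory, and all names are hypothetical. Categorization is based on relative summed similarity to each category, and recognition on total summed similarity (familiarity).

```python
import math

# Hypothetical one-dimensional training exemplars for two categories.
CATEGORY_A = [1.0, 2.0, 3.0]
CATEGORY_B = [7.0, 8.0, 9.0]


def summed_similarity(probe, exemplars, c):
    """Summed exponential similarity of a probe to stored exemplars.

    Larger c means sharper, more discriminable memory traces.
    """
    return sum(math.exp(-c * abs(probe - x)) for x in exemplars)


def prob_category_a(probe, c):
    """Categorization: relative summed similarity to category A."""
    s_a = summed_similarity(probe, CATEGORY_A, c)
    s_b = summed_similarity(probe, CATEGORY_B, c)
    return s_a / (s_a + s_b)


def familiarity(probe, c):
    """Recognition signal: summed similarity to all stored exemplars."""
    return summed_similarity(probe, CATEGORY_A + CATEGORY_B, c)


def old_new_advantage(old_item, new_item, c):
    """Relative familiarity advantage of a studied over an unstudied item."""
    f_new = familiarity(new_item, c)
    return (familiarity(old_item, c) - f_new) / f_new


if __name__ == "__main__":
    # c = 2.0 plays the role of a control; c = 0.3 a degraded memory.
    for c in (2.0, 0.3):
        print(c, prob_category_a(2.5, c), old_new_advantage(2.0, 2.5, c))
```

In this toy setting, lowering `c` nearly abolishes the familiarity difference between a studied item and a similar unstudied item, while the new item is still assigned to the correct category well above chance, so one memory system yields an apparent dissociation.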
These results mirror other neuropsychological findings that suggest the hippocampus is involved in purportedly implicit forms of memory (e.g., Chun & Phelps, 1999). In addition, functional brain imaging provides evidence that the hippocampus is recruited during categorization. Reber, Gitelman, Parrish, and Mesulam (2003) found greater MTL activation when healthy adults learned categories intentionally compared to when they learned them implicitly. Poldrack et al. (1999, 2001; see also Foerde, Knowlton, & Poldrack, 2006; Seger & Cincotta, 2005) observed a trade-off between hippocampal and basal ganglia activation during novel category learning, suggesting an interaction in the computations performed by these two important neural systems (Poldrack & Rodriguez, 2004). The recruitment of the hippocampus appears to be more pronounced early in learning novel categories. One hypothesis is that the MTL helps set up the representations
of novel stimuli that are then used by other brain areas (such as the basal ganglia or prefrontal cortex) to assign those stimuli to categories. This role as a novel representational engine has been proposed in computational models (e.g., Gluck & Myers, 1993; Meeter, Myers, & Gluck, 2005). Specifically, SUSTAIN (Supervised and Unsupervised STratified Adaptive Incremental Network; Love et al., 2004) is a cognitive model of categorization that shares properties with various exemplar, prototype, and rule-based models, and has accounted for an array of fundamental categorization phenomena. More recently, the computational mechanisms within SUSTAIN have been grounded in a network of brain areas, with the hippocampus playing a critical role in encoding novel stimuli that cannot be accommodated by the current category representations (Love & Gureckis, 2007). Instead of linking specific brain areas with particular kinds of cognitive tasks, whether episodic memory or categorization or priming, it seems more fruitful to consider the computations performed by those brain areas in the service of complex tasks (Palmeri & Flanery, 2002; Turk-Browne, Yi, & Chun, 2006).
CATEGORY-SPECIFIC SYSTEMS FOR CATEGORIZATION

Some arguments for multiple systems for categorization are based on structural aspects of the categories to be learned (e.g., whether they permit single rules or not), aspects of the task (e.g., the timing and quality of feedback), and the amount of learning. In the following section, we introduce work from a different tradition that studies the organization of the neural substrates responsible for the perception of different object categories in the brain. In this work, claims of multiple categorization systems have also been made, specifically that some categories are special in that they engage specialized brain areas. Specialized systems dedicated to perception of specialized categories have been claimed for faces (Kanwisher, McDermott, & Chun, 1996, 1997), places (Epstein, Harris, Stanley, & Kanwisher, 1999), body parts (Downing, Jiang, Shuman, & Kanwisher, 2001), words (Cohen et al., 2000; Nobre, Allison, & McCarthy, 1994), letter strings (Polk et al., 2002), and even single letters (K. H. James, James, Jobard, Wong, & Gauthier, 2005). We provide an overview of the evidence that has led researchers to postulate category-specific perceptual systems and then discuss some alternative interpretations of these results. To the extent that categorization studies are performed with visual stimuli such as faces (e.g., Goldstone & Steyvers, 2001) or novel items that may be animal-like (Allen & Brooks, 1991; Reed et al., 1999) or not (Knowlton & Squire, 1993; Posner & Keele, 1968),
understanding the systems involved in their perception may be crucial. We often use face processing as the main example domain in what follows because it has been studied the most extensively. Studies of patients with brain damage resulting in deficits in the visual recognition of objects suggest that the visual system, at least on a fairly coarse scale, may be organized around categories. While most cases of brain damage to the visual cortex result in deficits with virtually any category tested, in relatively rare cases, category-specific deficits are observed. These patients have difficulty identifying visually presented objects from certain categories, despite good basic visual skills. For example, when shown a picture of a banana, a patient may be unable to say “banana” or retrieve semantic information about bananas, but they may be able to describe its shape and identify that the object is yellow. Category-specific agnosias have been found for biological objects (e.g., Hillis & Caramazza, 1991; McCarthy & Warrington, 1988; Warrington & Shallice, 1984) artifacts (e.g., Hillis & Caramazza, 1991; Warrington & McCarthy, 1983, 1987), faces (e.g., Farah, 1996; Farah, Levinson, & Klein, 1995; Henke, Schweinberger, Grigo, Klos, & Sommer, 1998), and words (e.g., Warrington & Shallice, 1980). One patient presented with deficits in recognizing any object or word, except for extremely well-preserved face recognition skills (Moscovitch, Winocur, & Behrmann, 1997). 
At the other end of the spatial scale, neurophysiology in the monkey reveals selectivity of single cells for particular objects, such as faces, in several regions of the temporal lobe (e.g., Baylis & Rolls, 1987; Desimone, Albright, Gross, & Bruce, 1984; Gross, Bender, & Rocha-Miranda, 1969) and elsewhere in the brain such as the amygdala (e.g., Rolls, 1992) and the frontal cortex (e.g., Wilson, Scalaidhe, & Goldman-Rakic, 1993), although the cells selective for any category are only a fraction, typically about 20%, of the population of neurons recorded from. Recent work, however, suggests that when using single cell recording within the face-selective patches localized with fMRI in the monkey brain, virtually all neurons are selective for faces (Tsao, Freiwald, Tootell, & Livingstone, 2006). Thus, neuropsychology and neurophysiology together suggest category-selective responses that are distributed over the ventral cortex, with at least some categories showing a high degree of spatial clustering. Much of our knowledge about the organization of the visual recognition system in the human brain comes from much less invasive work using brain imaging in normal subjects. For instance, scalp recordings reveal face-selective (e.g., Bentin, Allison, Puce, Perez, & McCarthy, 1996; Rossion et al., 2000) and letter-selective (e.g., Wong, Gauthier, Woroch, DeBuse, & Curran, 2005) potentials that peak about 170 ms after the presentation of the image.
But the evidence that has perhaps received the most attention comes from studies using fMRI, a technique with better spatial resolution than event-related potentials (ERP), and which reveals brain regions selectively engaged by faces (Gauthier, Tarr, Moylan, Anderson, & Gore, 2000; fusiform gyri, lateral occipital gyri, superior temporal sulcus; e.g., Kanwisher et al., 1997; Puce, Allison, Bentin, Gore, & McCarthy, 1998; see also Sergent, Ohta, MacDonald, & Zuck, 1994), animals (lateral fusiform, e.g., Chao, Haxby, & Martin, 1999; Martin, Wiggs, Ungerleider, & Haxby, 1996), tools (left premotor area, medial fusiform gyrus; e.g., Chao et al., 1999; Martin et al., 1996), words, letter strings, and single letters (left fusiform, left occipito-temporal junction, e.g., Cohen et al., 2000; Flowers et al., 2004; K. H. James et al., 2005; Polk et al., 2002; Puce, Allison, Asgari, Gore, & McCarthy, 1996). Categories that are even more rarely selectively impaired in brain damage also reveal similar specialization. For instance, a “place area” was discovered in the parahippocampal gyrus that responds strongly to scenes, buildings, and other spatial landmarks (Aguirre, Zarahn, & D’Esposito, 1998; Epstein et al., 1999; Epstein & Kanwisher, 1998). Regions of the lateral occipitotemporal cortex (Downing et al., 2001) and fusiform gyrus (Peelen, Wiggett, & Downing, 2006) were found to selectively respond to body parts and areas of the superior temporal sulcus respond selectively to biological motion (Grossman & Blake, 2002). The typical locus for some of these areas is shown in Figure 20.5. There are several possible explanations for the apparent category specialization in the brain. One option is to take the compartmentalization observed in fMRI maps at face value and conclude that there may be separate modules responsible for processing different object categories. 
In this context, modularity does not simply refer to an anatomically distinct neural area, but instead invokes a Fodorian (Fodor, 1983) sense of modules as specialized, encapsulated mental subsystems that handle specific information; they are domain-specific entities that function independently of one another and of background beliefs of the subject. Modular claims are found throughout psychology and cognitive neuroscience and it is rare that they do not lead to heated debates. We briefly summarize some of the evidence that has led researchers to question the idea that category specialization in the ventral visual system represents modular organization.

Modular Accounts

Modular accounts of category specialization often suggest that evolutionary pressures caused the creation of specific modules for processing categories that are relevant to survival, like animals, plants, and conspecifics, more quickly
Figure 20.5 Typical location of some category-selective peak activations shown on a ventral view of brain. Note: An individual brain was segmented and then inflated so as to make the sulci (dark grey) as well as the gyri (light grey) visible. 1 = Right fusiform face area (Gauthier, Skudlarski, et al., 2000); 2 = Left fusiform face area (Gauthier, Skudlarski, et al., 2000); 3 = Right occipital face area (Gauthier, Skudlarski, et al., 2000); 4 = Visual word form area (K. H. James et al., 2005); 5 = Single letters (K. H. James et al., 2005); 6 = Letter strings (K. H. James et al., 2005); 7, 8, and 9 = Animals (Chao et al., 1999); 10 = Tools (Chao et al., 1999); 11 = Left parahippocampal place area (Epstein et al., 1999); 12 = Right parahippocampal area (Epstein et al., 1999); 13 = Fusiform body area (Peelen et al., 2006); 14 = Left extrastriate body area (Peelen et al., 2006); 15 = Right extrastriate body area (Peelen et al., 2006); 16 = Left biological motion area (Grossman & Blake, 2002); 17 = Right biological motion area (Grossman & Blake, 2002).
(Caramazza & Shelton, 1998): Is that animal a potential predator, a potential food source, or a potential mate? Is this plant poisonous, edible, or medicinal? Similarly, if you are walking alone at night, recognizing the face of the person coming toward you as either a friend or an enemy is a decision you would want to make rapidly and accurately. A specialized processing module for important categories of objects would confer survival advantage. It is clear, however, that for some domains of apparent modularity, such as reading, it begs reason to suggest that such specialization would be innate. Therefore, modules, if they exist, can be either innate or learned. Generally, modular accounts do not predict that there is a module in the brain for every object category we interact with. Instead, a few categories are thought to have a special status either because of evolutionary pressures or experience. For instance, there is a double dissociation between living and nonliving things, with some patients showing an impairment for living but not nonliving things (e.g., Farah, McMullen, & Meyer, 1991; Hillis & Caramazza, 1991; McCarthy & Warrington, 1988; Sheridan & Humphreys,
1993) and other patients showing the opposite deficit (e.g., Hillis & Caramazza, 1991; Sacchett & Humphreys, 1992; Warrington & McCarthy, 1983, 1987). However, the cases of deficits recognizing living things far outnumber the reported cases of deficits recognizing nonliving things. This suggests that it is the processing of living things that is specialized, or at least more localized (Caramazza & Shelton, 1998). A similar double dissociation has been observed with faces and objects, where patients with either acquired or congenital deficits with a condition known as prosopagnosia are impaired at recognizing faces, although recognition of other objects is relatively unimpaired (e.g., Duchaine, 2000; Farah, 1996; Farah et al., 1995). In very rare cases, when object recognition is impaired, face recognition can be spared (Moscovitch et al., 1997; Rumiati & Humphreys, 1997). Though rare, the existence of patients who show a selective impairment in a domain that is more frequently preserved is crucial to the modularity argument: Their existence refutes the idea that one domain (e.g., face perception) may simply be more difficult than another domain (e.g., object perception).

Distributed Representations

Modular explanations of the mind and the brain capture the imagination and capture the attention of the press. The apparent discoveries of brain modules responsible for recognizing body parts (Downing et al., 2001), intelligence (Duncan et al., 2000), and moral reasoning (Greene, Sommerville, Nystrom, Darley, & Cohen, 2001) are covered by the press in much the same way as the discovery of a new dinosaur skeleton, a new planet, or a new bird species. 
Yet, neuropsychologists have long recognized significant challenges for inferring modularity from patterns of behavioral deficits caused by brain damage: Deficits result from large lesions that vary considerably between patients and the behavioral dissociations are rarely all that “clean.” For example, in the case of the living/nonliving dissociation, the majority of patients present deficits that cross the living/nonliving boundaries (Bukach, Bub, Masson, & Lindsay, 2004; Warrington & McCarthy, 1987; Warrington & Shallice, 1984). Similarly, prosopagnosic patients, whether acquired by brain damage or through congenital defect, often present with problems in nonface perception (Behrmann, Avidan, Marotta, & Kimchi, 2005; Gauthier, Behrmann, & Tarr, 1999). A common interpretation of this pattern of results is that the lesions in most patients extend beyond the boundaries of a single module (e.g., Farah, 1990). And even if this is correct, it is clear that dissociations may be caused by a different modular organization from what might be apparent at first blush. For example, the living/nonliving dissociation
may actually represent modular organization along visual features versus functional features (Farah & McClelland, 1991; Warrington & Shallice, 1984). But another interpretation of double dissociations based on rare patients is that these rare patients are simply outliers who are not representative of the underlying population of brain structures. Unfortunately, brain insults happen on a daily basis. Yet, category-specific deficits occur in just a tiny fraction of cases. Simulated brain damage in neural networks that have no modular organization whatsoever can yield a small number of cases that appear to suggest modularity (Plaut, 1995). If modules exist, then we should expect double dissociations. But double dissociations are not sufficient to prove the existence of modules (Shallice, 1988). This makes it necessary to use converging evidence from many techniques to help interpret patterns of deficits. Category representations can be fairly distributed and overlapping in the brain yet brain damage can produce, in some rare cases, quite selective deficits that suggest modularity. There is now considerable evidence that the representations of different categories are distributed and overlapping. In a classic study, Haxby et al. (2001) found that objects from different categories elicit replicable (and partly overlapping) patterns of activation across the entire ventral temporal cortex, rather than selective activation in a localized region. Subjects in the scanner were shown images of objects from various categories such as faces, houses, bottles, cats, and shoes. The pattern of activity for these categories over thousands of voxels was found to replicate between two halves of the data set, demonstrating how one could decode what a subject is seeing from the brain activity alone. This demonstration led many scientists to consider the importance of more distributed patterns of cortical activity. 
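The split-half logic behind this kind of decoding can be sketched in a few lines. The example below is a simplified illustration on synthetic data, not the analysis pipeline of Haxby et al. (2001): the simulated voxel patterns, the noise level, the number of voxels, and the function names are all assumptions. Each category's pattern from one half of the data is assigned to whichever category's pattern from the other half it correlates with most strongly.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def simulate_halves(categories, n_voxels=500, noise_sd=1.0, seed=0):
    """Synthetic data: one stable pattern per category plus independent
    noise in each half of the (hypothetical) scanning session."""
    rng = random.Random(seed)
    half1, half2 = {}, {}
    for name in categories:
        signal = [rng.gauss(0.0, 1.0) for _ in range(n_voxels)]
        half1[name] = [s + rng.gauss(0.0, noise_sd) for s in signal]
        half2[name] = [s + rng.gauss(0.0, noise_sd) for s in signal]
    return half1, half2


def decode(half1, half2):
    """Label each half-1 pattern with the best-correlated half-2 category."""
    predictions = {}
    for name, pattern in half1.items():
        predictions[name] = max(
            half2, key=lambda other: pearson(pattern, half2[other])
        )
    return predictions


if __name__ == "__main__":
    cats = ["faces", "houses", "bottles", "cats", "shoes"]
    h1, h2 = simulate_halves(cats)
    print(decode(h1, h2))
```

Because the decision uses the whole pattern rather than a single peak, a classifier of this kind can succeed even when no individual voxel is strongly selective for one category.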
Some of the most exciting methods for analyzing fMRI data were inspired by that work (Kamitani & Tong, 2005; Norman, Polyn, Detre, & Haxby, 2006). Nonetheless, other researchers still emphasize the significance of the maximal response elicited in a specific brain area rather than the distributed pattern (Op de Beeck, 2008; Spiridon & Kanwisher, 2002). While the finding of distributed and partly overlapping maps for different categories is generally accepted, what remains vigorously debated is whether all categories are represented in this manner or if some special categories, such as faces, are much more localized (Hanson & Halchenko, 2008; Spiridon & Kanwisher, 2002). That category representations are distributed within the visual system may seem even less surprising when considering evidence that categories are in fact distributed over the whole brain. For instance, according to Barsalou’s (1999) perceptual symbol systems theory (Barsalou, 2008; Martin, 2007), concepts are represented in the collection
of modal systems for perception and action, rather than amodal symbols. Concepts, even abstract concepts, are thought to recruit a distributed representation across the brain because information from different sensory modalities is stored in modality-specific systems. When participants engage in a verbal conceptual task with words from different categories (e.g., animals and tools), the resulting activation is highly similar to the patterns evoked by the presence of physical objects from different categories (Chao et al., 1999). Modality-specific information associated with a concept appears to be automatically engaged, regardless of the task. Such findings are relevant to the interpretation of studies where objects from different categories are contrasted. Not only do objects from the same category look alike, but they are likely associated with similar semantic knowledge. These associations influence the pattern of brain activity observed in response to the presentation of the object. This was demonstrated in a study where arbitrary semantic information was associated with novel objects through a short training task, and where these features appeared to be engaged automatically upon object perception (T. W. James & Gauthier, 2003). Outside of the scanner, objects were first associated with verbal labels describing auditory features (e.g., “whistles,” “hisses”) or motion features (e.g., “hops,” “crawls”). Later in the scanner, subjects performed visual matching judgments on pairs of objects. Strikingly, modality-specific cortices (the auditory cortex and an area that responds to biological motion) were engaged automatically based on prior associations that were completely irrelevant to the visual matching task. If these effects can emerge after a short training procedure, there could be a challenge in interpreting patterns of selectivity to visually presented familiar objects for which subjects have acquired a lifetime of associations. 
Cats and faces and bottles have different shapes and they are also associated with different semantic information, making it difficult to know whether the distributed object maps in the visual system are maps of shape per se or maps of other dimensions (Op de Beeck, 2008).

Experience and Expertise

Another alternative to a modular account for how different categories are represented in the brain is that the observed cortical representation of categories represents the interaction between processing biases in the cortex and the varied task demands associated with the objects. One specific account, the process map hypothesis (Gauthier, 2000), argues that category-selectivity reflects the automatization of strategies that are learned during experience with a category. Automatic strategies associated with category membership could produce patterns of category-selectivity
in the brain even if there were no maps of object shape or of object categories. This could happen if the ventral temporal cortex shows organization that reflects intersecting gradients in processing. For example, a gradient of eccentricity exists over the topographic extent of the visual cortex in the temporal lobe, and a continuum from local parts to holistic representations has been proposed (Hasson, Levy, Behrmann, Hendler, & Malach, 2001; Lerner, Hendler, Ben-Bashat, Harel, & Malach, 2001). Whatever the nature of the underlying dimensions relevant to processing (and they are largely unknown), the general idea is that any point in such a map would be unique and best suited to learn a specific visual categorization task. For instance, faces have to be identified at the subordinate level, and for that purpose, metric relations between parts (also called configural information) appear to be particularly useful (Tanaka & Sengco, 1997; Young, Hellawell, & Hay, 1987). Training at the subordinate level encourages participants to use a more “holistic” strategy (Diamond & Carey, 1986), in which participants find it more difficult to ignore task-irrelevant parts of the object (Young et al., 1987). The process-map hypothesis suggests that faces come to engage the fusiform face area (FFA) because it is best suited for holistic processing, the default mode of processing for faces, and predicts that other objects recognized using the same strategy, regardless of their shape, should also engage the same area. This prediction was first tested in a perceptual expertise training study with a set of artificial stimuli called “Greebles.” Greebles were designed to replicate some critical aspect of faces, such as the fact that they share a small number of parts in a common configuration (Figure 20.6). The training was modeled after the constraints of face recognition and other types of real-world expertise. 
That is, subjects learned to categorize Greebles in families and to name individual Greebles and to discriminate them from other visually similar Greebles, as we do every day with faces. Training continued until subjects were as quick to categorize Greebles at the individual level as they were at categorizing them at a more abstract “family” level. Fast individuation is a hallmark of expertise in real-world domains (e.g., Tanaka & Taylor, 1991). Behavioral studies of Greeble training showed that these objects were processed more like faces following training. In particular, Greeble experts processed Greebles more holistically, finding it difficult to selectively attend to part of these objects (Gauthier & Tarr, 1997; Gauthier, Williams, Tarr, & Tanaka, 1998). A comparison of brain activity before and after Greeble training revealed an increase of activity for upright Greebles in face-selective areas in the occipital lobe (what is now called occipital face area or OFA), the mid-fusiform face area (FFA; Figure 20.6) and a face-selective region of the anterior
Figure 20.6 (Figure C.28 in color section) A: Examples of the Greeble objects used in the Gauthier and Tarr (1997), Gauthier et al. (1998), Gauthier and Tarr (2002), and Gauthier, Behrmann, et al., (1999) expertise studies. B: Average fMRI results before and after Greeble expertise training. Note: (A) Greeble objects share a general configuration of parts, and the set is organized hierarchically with two genders (defined by all parts pointing up versus down) and several families (defined by body shapes). Training required subjects to learn to discriminate Greebles of the same gender and family (red arrow) as fast as they could discriminate two objects from different families (yellow arrow). (B) The highlighted region is centered on the FFA. Red and yellow areas responded more to upright than upside-down stimuli, while blue to purple areas responded more to upside-down images. Upright faces elicit more activity in this area than upside-down faces. However, the same effect is only observed for Greebles after expertise training with upright Greebles. From Gauthier, Tarr, Anderson, Skudlarski, & Gore (1999). Adapted with permission.
temporal lobe (Gauthier, Tarr, Moylan, Anderson, & Gore, 2000). Later work showed that behavioral increases in configural processing were correlated with changes of activity in the FFA across subjects (Gauthier & Tarr, 2002). The Greeble work suggests that changes in the way that a category is processed with the acquisition of perceptual expertise are critical in recruiting specific areas of the ventral temporal cortex for its processing. The recruitment of the FFA in expert perception has been confirmed in studies of real-world expertise with cars or birds, where the degree of FFA activity in response to images of cars, for example, shows a very strong correlation with a behavioral measure of expertise over several independent experiments (Gauthier, Skudlarski, Gore, & Anderson, 2000; Gauthier, Curby, Skudlarski, & Epstein,
2005; Xu, 2005). As might be predicted based on such results, individuals with autism, who show abnormalities in face processing that can be apparent early in development (e.g., Klin & Jones, 2008), show reduced selectivity to faces in the fusiform gyrus (e.g., Hubl et al., 2003; Pierce, Muller, Ambrose, Allen, & Courchesne, 2001; Schultz et al., 2000). Consistent with the idea that this hypoactivity is due to a lack of expertise, a boy with autism who acquired perceptual expertise with Digimon cartoon characters showed specialization for Digimon but not faces in the fusiform gyrus (Grelotti et al., 2005). Finally, consistent with an expertise account of face-selective effects, the N170 face-selective ERP component is larger in amplitude for various nonface homogeneous objects in expert observers (Busey & Vanderkolk, 2005; Gauthier, Curran, Curby, & Collins, 2003; Rossion, Gauthier, Goffaux, Tarr, & Crommelinck, 2002; J. W. Tanaka & Curran, 2001). However, extensive practice with a category does not always recruit face-selective areas. A handful of fMRI training studies with object categories have been conducted and have led to inconsistent results in terms of the specific regions engaged. On close examination of the particulars of these studies, this inconsistency may not be surprising, given that the studies varied greatly on several dimensions, including object geometry, amount of training, and the specific training task practiced by subjects (Jiang et al., 2007; Moore, Cohen, & Ranganath, 2006; Op de Beeck, Baker, DiCarlo, & Kanwisher, 2006; Xue & Poldrack, 2007; Yue, Tjan, & Biederman, 2006). Despite these differences, one region, the lateral occipital complex, is a more consistent locus of change across studies, suggesting that it may be more sensitive to exposure to a category than to the specific constraints of the training.
Human ERPs and recordings in monkeys reveal that responses to objects can change in the ventral occipital cortex due to mere exposure (Peissig, Singer, Kawasaki, & Sheinberg, 2007; Scott, Tanaka, Sheinberg, & Curran, 2006). In contrast, the FFA may be more important when experts process objects holistically, a strategy that was only assessed directly in the Greeble training study. The adoption of a holistic strategy by subjects was suggested in one study (Moore et al., 2006) where training led to an inversion effect (inversion disrupts holistic processing with faces; Tanaka & Sengco, 1997; Young et al., 1987), and in that study a small training effect was obtained in the FFA. Clearly, there are domains of expertise with visual categories, such as print, that do not rely on configural perception and lead to specialization outside of the face-selective system (McCandliss, Cohen, & Dehaene, 2003). Thus, exposure to objects may be enough to produce some changes in the visual system (Freedman, Riesenhuber, Poggio, &
Miller, 2006), but there may also be a record of the manner in which experience with a category is acquired, in terms of the perceptual strategy and neural substrates that come to be automatically engaged by category members. Our ability to interpret patterns of differences across training studies is seriously limited by the fact that fMRI training studies almost never compare two types of training with the same object category. Wong (2007) trained two groups of subjects with the same set of objects. One group learned to individuate objects as in Greeble training, while the other group was given equal exposure to objects but learned to classify them rapidly at the basic level. Only the individuation group demonstrated a switch to configural processing and an increase of activity near the FFA, with the behavioral and neural changes correlated across subjects. In contrast, rapid basic-level processing led to changes in more lateral areas of the occipito-temporal cortex, near the standard visual word form area. This work is unique in contrasting different types of experiences for the same category, as the majority of fMRI studies contrast different object categories, leading to effects that can be interpreted as indicating that the pattern of selectivity in ventral temporal cortex codes for variations in the shape of objects. Although there is no question that objects with similar shapes tend to recruit similar neural substrates in the same subject, which part of the neural network is recruited for objects with a given geometry in a given individual may be to some extent determined by experience processing objects from that category. Computational modeling supports the claim that the FFA is a subordinate-level, fine-grained visual discrimination area, whose main feature is performing transformations that magnify differences between highly similar visual items (Joyce & Cottrell, 2004).
Tong, Joyce, and Cottrell (2007) first trained neural networks to discriminate several basic-level categories (e.g., cups, Greebles, and cans). “Expert” networks were additionally trained to discriminate items within one of these categories at the subordinate level. In the second phase, the learned weights from the first phase of training were saved, and both the basic-level and expert-level networks were trained on new subordinate-level discriminations. Results showed that although in the first phase basic-level discriminations were learned more quickly than subordinate-level discriminations, once the “expert” network was trained, learning new subordinate-level discriminations occurred more rapidly for the expert network than the basic-level network. This suggests that a neural network trained to perform subordinate-level discriminations on one class of objects shows an advantage in learning a new class at the subordinate level—because of extensive early experience with faces,
the FFA becomes a skilled subordinate-level classifier for faces that is later recruited by other domains of visual expertise.
So far we have only considered the case of expertise for objects in homogeneous categories such as faces, cars, and birds, where the goal is rapid individuation. Recent work has also explored expertise for letters and words. In contrast to faces, birds, and cars, which are typically individuated at the subordinate level by experts, for letters the goal of experts is basic-level categorization (an A is an A regardless of changes in font or style; Wong & Gauthier, in press). However, to facilitate reading, one wants to rapidly perceive a sequence of items to make a word. This is made easier by regularity in font style: it is easier to READ THIS than it is to rEaD tHiS (Sanocki, 1987, 1988). Furthermore, this effect is not limited to Roman characters: Chinese readers are faster to serially scan a matrix of Chinese characters for targets when the characters are all in the same font, whereas subjects who do not read Chinese do not show this sensitivity to style (Gauthier, Wong, Hayward, & Cheung, 2006). Such sensitivity to font is one example of a perceptual strategy that is more useful for letter perception than for the processing of most other categories. Neurally, several brain regions have been implicated in letter and word expertise. The visual word form area (VWFA; Cohen et al., 2000) responds more to words and pseudowords than to nonpronounceable consonant strings. Surprisingly, this area does not show visual selectivity for letters or letter strings; for instance, it is equally recruited by strings of Chinese characters in non-Chinese readers (K. H. James et al., 2005). In contrast, visual selectivity for letter strings and single letters is obtained in other parts of the left fusiform gyrus (Flowers et al., 2004; K. H. James et al., 2005; Polk et al., 2002).
These findings are not restricted to one particular character set, because Chinese-character and Roman-character selective areas overlapped in Chinese-English bilinguals (Baker et al., 2007; Wong, Jobard, James, James, & Gauthier, submitted). The N170 ERP component is also obtained for words or letter strings (Bentin, Deouell, & Soroker, 1999) and for letters or other characters of expertise (Wong et al., 2005). Because of its selectivity for two very different types of expertise, the N170 may be a general marker of expert processes that can be localized in different brain areas. Scott et al. (2006) compared different types of training with bird categories, revealing that both basic- and subordinate-level training enhanced the early N170 component, but only subordinate-level training amplified a later N250 component. Further comparisons of training regimes in both ERP and fMRI could lead to a better understanding of the dynamics of perceptual expertise.
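To make the two-phase training logic described for Tong, Joyce, and Cottrell (2007) more concrete, the following Python sketch mimics the procedure on synthetic data. It is only an illustration under stated assumptions, not the authors' model: the tiny one-hidden-layer network, the random "categories," the learning rate, and all function names (make_net, train, transfer, run) are hypothetical choices introduced here. In phase 1, a "basic" network learns category labels while an "expert" network must also individuate the items of one category. In phase 2, each network keeps its hidden-layer weights, receives a fresh output layer, and is trained to individuate a novel category.

```python
import copy
import math
import random


def make_net(n_in, n_hid, n_out, rng):
    """One-hidden-layer network stored as two weight matrices (no biases)."""
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)],
        "W2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)],
    }


def forward(net, x):
    """Return hidden activations (tanh) and softmax output probabilities."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in net["W1"]]
    z = [sum(w * hi for w, hi in zip(row, h)) for row in net["W2"]]
    top = max(z)
    e = [math.exp(v - top) for v in z]
    total = sum(e)
    return h, [v / total for v in e]


def train(net, data, epochs, lr=0.1):
    """Per-item gradient descent on cross-entropy; returns mean loss per epoch."""
    curve = []
    for _ in range(epochs):
        loss = 0.0
        for x, y in data:
            h, p = forward(net, x)
            loss -= math.log(max(p[y], 1e-12))
            dz = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
            # The hidden-layer gradient uses W2 before W2 is updated.
            dh = [(1.0 - h[j] ** 2) * sum(dz[k] * net["W2"][k][j] for k in range(len(dz)))
                  for j in range(len(h))]
            for k, row in enumerate(net["W2"]):
                for j in range(len(row)):
                    row[j] -= lr * dz[k] * h[j]
            for j, row in enumerate(net["W1"]):
                for i in range(len(row)):
                    row[i] -= lr * dh[j] * x[i]
        curve.append(loss / len(data))
    return curve


def make_category(n_items, n_in, rng, spread=0.3):
    """Hypothetical category: a random prototype plus small per-item offsets."""
    proto = [rng.uniform(-1, 1) for _ in range(n_in)]
    return [[p + rng.gauss(0, spread) for p in proto] for _ in range(n_items)]


def transfer(net, n_out, rng):
    """Keep the learned hidden layer (W1) and attach a fresh output layer."""
    new = copy.deepcopy(net)
    n_hid = len(new["W1"])
    new["W2"] = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]
    return new


def run(seed=0, n_in=8, n_hid=6, n_cat=3, n_items=4, epochs=50):
    rng = random.Random(seed)
    cats = [make_category(n_items, n_in, rng) for _ in range(n_cat)]
    novel = make_category(n_items, n_in, rng)

    # Phase 1, basic network: every item is labeled with its category.
    basic_data = [(x, c) for c, items in enumerate(cats) for x in items]
    # Phase 1, expert network: category labels for categories 1 and up,
    # but one individual label per item for category 0.
    expert_data = [(x, c) if c else (x, n_cat + i)
                   for c, items in enumerate(cats) for i, x in enumerate(items)]
    basic = make_net(n_in, n_hid, n_cat, rng)
    expert = make_net(n_in, n_hid, n_cat + n_items, rng)
    train(basic, basic_data, epochs)
    train(expert, expert_data, epochs)

    # Phase 2: reuse each hidden layer and learn to individuate a new category.
    new_data = [(x, i) for i, x in enumerate(novel)]
    return {
        "basic": train(transfer(basic, n_items, rng), new_data, epochs),
        "expert": train(transfer(expert, n_items, rng), new_data, epochs),
    }


curves = run()
```

The returned curves hold the mean cross-entropy loss per epoch for each network during phase 2, so comparing them epoch by epoch shows which starting point learns the new subordinate-level task faster. Whether the expert network wins in this toy setting depends on the random data and hyperparameters; the sketch shows the procedure, not the published result.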
High-Resolution Imaging and Competition Studies
In recent years, two different lines of research have offered new data for interpreting category selectivity in the FFA. The first uses high-resolution imaging in an attempt to separate patterns of responses to faces and objects, while the second attempts to measure neural (and behavioral) competition that could result from functional overlap. Standard fMRI has a resolution around 3 mm³. At that resolution, each voxel (3D pixel) in the FFA yields a maximal response to faces and a nonzero response to nonface objects. Recent work using higher resolution imaging looked “inside the voxel” to reveal the functional organization of the FFA at a finer spatial scale (1 mm³; Grill-Spector, Sayres, & Ress, 2006); this represents a 27-fold increase in resolution. The results revealed that all voxels were maximally selective to faces, but highly face-selective voxels were intermingled with voxels that also showed comparable responses for at least some nonface category, such as animals or cars. The reproducibility of face-selectivity at a finer scale in the FFA is consistent with single-cell recordings in macaque monkeys, within face-selective regions identified by fMRI, where 97% of cells are found to be face-selective (Tsao et al., 2006). Analyses in a prior expertise study with car and bird experts had revealed that the single most face-selective FFA voxel at standard resolution showed a clear expertise effect (Gauthier, Skudlarski, et al., 2000), which suggests that expert object responses in the FFA would overlap with face-selectivity at high resolution, and perhaps even at the single-cell level. If a considerable number of neurons in the fusiform gyrus are selective for both faces and objects of expertise, interference between these two domains may be expected in some situations.
There could also be interference between face and object perception even if there were no shared neurons, as long as the two populations were strongly interconnected. In other words, instead of focusing on spatial overlap, one can address functional overlap: Is face perception functionally independent from the perception of nonface objects, especially for cases of expertise where a face-like configural strategy is recruited? In one study (Gauthier et al., 2003), subjects with a range of car expertise saw a sequence of faces alternating with cars. Each car or face was made out of two parts (top and bottom) and subjects selectively attended to the bottom of these images and made 1-back judgments for both categories; in this way, the degree of holistic processing could be measured for both categories. In this dual-task situation, car experts processed cars more holistically than car novices and processed faces less holistically in the context of cars: Simultaneous processing of faces and cars by car experts
appears to create a competition for common resources. This behavioral interference was correlated with the magnitude of the N170 face-selective ERP potential (see also Rossion, Kung, & Tarr, 2004; Rossion, Collins, Goffaux, & Curran, 2007). In more recent work, competition between car and face perception was also obtained in tasks where the cars were completely task-irrelevant (McKeeff, Tong, & Gauthier, 2007; Williams, 2007). Competition between face perception and objects of expertise suggests one or more functional bottlenecks in the brain for configural processing, and because the FFA responds to both faces and objects of expertise, it is tempting to assume that the FFA is one such bottleneck. This is difficult to verify with fMRI at standard resolution because the response to cars and faces cannot be separated, but this could be addressed in future work using high-resolution imaging.
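The 27-fold figure quoted for the high-resolution work can be checked with a line of arithmetic, assuming that the stated resolutions refer to isotropic voxels with edges of roughly 3 mm (standard) and 1 mm (high resolution). That reading is an interpretation of the text rather than something it states outright, and the symbols below are introduced here only for illustration:

```latex
\[
\frac{V_{\text{standard}}}{V_{\text{high-res}}}
  = \frac{(3\,\text{mm})^{3}}{(1\,\text{mm})^{3}}
  = \frac{27\,\text{mm}^{3}}{1\,\text{mm}^{3}}
  = 27
\]
```

Under this assumption, each standard voxel pools signal from a volume that about 27 high-resolution voxels would sample separately, which is why face and car responses that cannot be separated at standard resolution might be separable at the finer scale.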
SUMMARY
Understanding how objects are categorized is a complex challenge that requires bridging the study of visual perception and visual cognition and cannot be studied without also considering how objects are perceived, identified, and remembered (Palmeri & Tarr, 2008). To date, different aspects of this problem, such as the format of visual object representations and the principles that govern decisions about the categories to which these objects belong, have been explored in separate fields. But more than once, such as on the issue of abstraction or modularity, these independent lines of research have faced similar debates or reached similar conclusions (Palmeri & Gauthier, 2004). The advent of cognitive neuroscience, which provides evidence and constraints from techniques as diverse as psychophysics, brain imaging, neuropsychology, and neurophysiology, may help blur old boundaries between approaches to produce more complete models of object categorization.
REFERENCES
Aguirre, G. K., Zarahn, E., & D’Esposito, M. (1998). An area within human ventral cortex sensitive to building stimuli: Evidence and implications. Neuron, 21, 373–383.
Allen, S., & Brooks, L. (1991). Specializing the operation of an explicit rule. Journal of Experimental Psychology: General, 120, 3–19.
Ashby, F. G. (1992). Multidimensional models of perception and cognition. Hillsdale, NJ: Erlbaum.
Ashby, F. G., & Alfonso-Reese, L. (1995). Categorization as probability density estimation. Journal of Mathematical Psychology, 39, 216–233.
Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U., & Waldron, E. M. (1998). A formal neuropsychological theory of multiple systems in category learning. Psychological Review, 105, 442–481.
Ashby, F. G., Ennis, J. M., & Spiering, B. J. (2007). A neurobiological theory of automaticity in perceptual categorization. Psychological Review, 114, 632–656.
Caramazza, A., & Shelton, J. (1998). Domain-specific knowledge systems in the brain: The animate-inanimate distinction. Journal of Cognitive Neuroscience, 10, 1–34.
Ashby, F. G., Noble, S., Filoteo, J., Waldron, E., & Ell, S. (2003). Category learning deficits in Parkinson’s disease. Neuropsychology, 17, 115–124.
Chao, L. L., Haxby, J. V., & Martin, A. (1999). Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects. Nature Neuroscience, 2, 913–919.
Ashby, F. G., & O’Brien, J. B. (2005). Category learning and multiple memory systems. Trends in Cognitive Sciences, 9, 83–89.
Chun, M. M., & Phelps, E. A. (1999). Memory deficits for implicit contextual information in amnesic subjects with hippocampal damage. Nature Neuroscience, 2, 844–847.
Ashby, F. G., & Waldron, E. M. (1999). On the nature of implicit categorization. Psychonomic Bulletin and Review, 6, 363–378.
Baker, C. I., Liu, J., Wald, L. L., Kwong, K. K., Benner, T., & Kanwisher, N. (2007). Visual word processing and experiential origins of functional selectivity in human extrastriate cortex. Proceedings of the National Academy of Sciences, USA, 104, 9087–9092.
Barsalou, L. W. (1985). Ideals, central tendency, and frequency of instantiation as determinants of graded structure in categories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 629–654.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain Sciences, 22, 577–660.
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Baylis, G. C., & Rolls, E. T. (1987). Responses of neurons in the inferior temporal cortex in short term and serial recognition memory tasks. Experimental Brain Research, 65, 614–622.
Behrmann, M., Avidan, G., Marotta, J. J., & Kimchi, R. (2005). Detailed exploration of face-related processing in congenital prosopagnosia: Pt. 1. Behavioral findings. Journal of Cognitive Neuroscience, 17, 1130–1149.
Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996). Electrophysiological studies of face perception in humans. Journal of Cognitive Neuroscience, 8, 551–565.
Bentin, S., Deouell, L. Y., & Soroker, N. (1999). Selective visual streaming in face recognition: Evidence from developmental prosopagnosia. NeuroReport, 10, 823–827.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115–147.
Biederman, I., & Gerhardstein, P. C. (1993). Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance. Journal of Experimental Psychology: Human Perception and Performance, 19, 1162–1182.
Collins, A. M., & Quillian, M. R. (1969). Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 8, 240–248.
Cools, A., van den Bercken, J., van Spaendonck, K., & Berger, H. (1984). Cognitive and motor shifting aptitude disorder in Parkinson’s disease. Journal of Neurology, Neurosurgery, and Psychiatry, 47, 443–453.
Desimone, R., Albright, T., Gross, C., & Bruce, C. (1984). Stimulus-selective properties of inferior temporal neurons in the macaque. Journal of Neuroscience, 4, 2051–2062.
Diamond, R., & Carey, S. (1986). Why faces are and are not special: An effect of expertise. Journal of Experimental Psychology: General, 115, 107–117.
DiCarlo, J. J., & Maunsell, J. H. R. (2003). Anterior inferotemporal neurons of monkeys engaged in object recognition can be highly sensitive to object retinal position. Journal of Neurophysiology, 89, 3264–3278.
Dobbins, I. G., Schnyer, D. M., Verfaellie, M., & Schacter, D. L. (2004, March 18). Cortical activity reductions during repetition priming can result from rapid response learning. Nature, 428, 316–319.
Downes, J. J., Roberts, A. C., Sahakian, B. J., Evenden, J. L., Morris, R. G., & Robbins, T. W. (1989). Impaired extra-dimensional shift performance in medicated and unmedicated Parkinson’s disease: Evidence for a specific attentional dysfunction. Neuropsychologia, 27, 1329–1343.
Downing, P. E., Jiang, Y., Shuman, M., & Kanwisher, N. (2001, September 28). A cortical area selective for visual processing of the human body. Science, 293, 2470–2473.
Duchaine, B. C. (2000). Developmental prosopagnosia with normal configural processing. Cognitive Neuroscience and Neuropsychology, 11, 79–83.
Booth, M. C. A., & Rolls, E. T. (1998). View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cerebral Cortex, 8, 510–523.
Duncan, J., Seitz, R. J., Kolodny, J., Bor, D., Herzog, H., Ahmed, A., et al. (2000, July 21). A neural basis for general intelligence. Science, 289, 457–460.
Bourne, L. (1970). Knowing and using concepts. Psychological Review, 77, 546–556.
Edelman, S. (1997). Computational theories of object recognition. Trends in Cognitive Sciences, 1, 296–304.
Brown, S. W., & Stubbs, D. A. (1988). The psychophysics of retrospective and prospective timing. Perception, 17, 297–310.
Edelman, S. (1999). Representation and recognition in vision. Cambridge, MA: MIT Press.
Bruner, J., Goodnow, J., & Austin, A. (1956). A study of thinking. New York: Wiley.
Epstein, R., Harris, A., Stanley, D., & Kanwisher, N. (1999). The parahippocampal place area: Recognition, navigation, or encoding? Neuron, 23, 115–125.
Bukach, C. M., Bub, D. N., Masson, M. E., & Lindsay, D. S. (2004). Category specificity in normal episodic learning: Applications to object recognition and category-specific agnosia. Cognitive Psychology, 48, 1–46.
Cohen, L., Dehaene, S., Naccache, L., Lehericy, S., Dehaene-Lambertz, G., Henaff, M. A., et al. (2000). The visual word form area: Spatial and temporal characterization of an initial stage of reading in normal subjects and posterior split-brain patients. Brain, 123(Pt. 2), 291–307.
Epstein, R., & Kanwisher, N. (1998, April 9). A cortical representation of the local visual environment. Nature, 392, 598–601.
Bülthoff, H. H., & Edelman, S. (1992). Psychophysical support for a two-dimensional view interpolation theory of object recognition. Proceedings of the National Academy of Sciences, USA, 89, 60–64.
Erickson, M. A., & Kruschke, J. K. (1998). Rules and exemplars in category learning. Journal of Experimental Psychology: General, 127, 107–140.
Busemeyer, J., Dewey, G., & Medin, D. (1984). Evaluation of exemplar-based generalization and the abstraction of categorical information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 638–648.
Farah, M. J. (1990). Visual agnosia: Disorders of object recognition and what they tell us about normal vision. Cambridge, MA: MIT Press.
Busey, T. A., & Vanderkolk, J. R. (2005). Behavioral and electrophysiological evidence for configural processing in fingerprint experts. Vision Research, 45, 431–448.
Farah, M. J., Levinson, K. L., & Klein, K. L. (1995). Face perception and within-category discrimination in prosopagnosia. Neuropsychologia, 33, 661–674.
Farah, M. J. (1996). Is face recognition special? Evidence from neuropsychology. Behavioural Brain Research, 76(1–2), 181–189.
Farah, M. J., & McClelland, J. L. (1991). A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General, 120, 339–357.
Farah, M. J., McMullen, P. A., & Meyer, M. M. (1991). Can recognition of living things be selectively impaired? Neuropsychologia, 29, 185–193.
Filoteo, J. V., Maddox, W. T., & Davis, J. D. (2001). Quantitative modeling of category learning in amnesic patients. Journal of the International Neuropsychological Society, 7, 1–19.
Flowers, D. L., Jones, K., Noble, K., VanMeter, J., Zeffiro, T. A., Wood, F. B., et al. (2004). Extrastriate representation of letter recognition. Neuroimage, 21, 829–839.
Fodor, J. A. (1983). Modularity of mind. Cambridge, MA: MIT Press.
Foerde, K., Knowlton, B. J., & Poldrack, R. A. (2006). Modulation of competing memory systems by distraction. Proceedings of the National Academy of Sciences, USA, 103, 11778–11783.
Frank, M., & Claus, E. (2006). Anatomy of a decision: Striato-orbitofrontal interactions in reinforcement learning, decision making and reversal. Psychological Review, 113, 300–326.
Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2003). A comparison of primate prefrontal and temporal cortices during visual categorization. Journal of Neuroscience, 23, 5235–5246.
Freedman, D. J., Riesenhuber, M., Poggio, T., & Miller, E. K. (2006). Experience-dependent sharpening of visual shape selectivity in inferior temporal cortex. Cerebral Cortex, 16, 1631–1644.
Garner, W. R. (1974). The processing of information and structure. Potomac, MD: Erlbaum.
Gauthier, I., Wong, A. C., Hayward, W. G., & Cheung, O. S. (2006). Font tuning associated with expertise in letter perception. Perception, 35, 541–559.
Gluck, M. A., & Myers, C. E. (1993). Hippocampal mediation of stimulus representation: A computational theory. Hippocampus, 3, 491–516.
Goldstone, R. L., & Steyvers, M. (2001). The sensitization and differentiation of dimensions during category learning. Journal of Experimental Psychology: General, 130, 116–139.
Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15, 20–25.
Goodman, N. D., Tenenbaum, J. B., Feldman, J., & Griffiths, T. L. (2008). A rational analysis of rule-based concept learning. Cognitive Science, 32, 108–154.
Graham, K. S., Scahill, V. L., Hornberger, M., Barense, M. D., Lee, A. C. H., Bussey, T. J., et al. (2006). Abnormal categorization and perceptual learning in patients with hippocampal damage. Journal of Neuroscience, 26, 7547–7554.
Greene, J. D., Sommerville, R. B., Nystrom, L. E., Darley, J. M., & Cohen, J. D. (2001, September 14). An fMRI investigation of emotional engagement in moral judgment. Science, 293, 2105–2108.
Grelotti, D. J., Klin, A. J., Gauthier, I., Skudlarski, P., Cohen, D. J., Gore, J. C., et al. (2005). fMRI activation of the fusiform gyrus and amygdala to cartoon characters but not to faces in a boy with autism. Neuropsychologia, 43, 373–385.
Grill-Spector, K., Sayres, R., & Ress, D. (2006). High-resolution imaging reveals highly selective nonface clusters in the fusiform face area. Nature Neuroscience, 9, 1177–1185.
Gauthier, I. (2000). What constrains the organization of the ventral temporal cortex? Trends in Cognitive Sciences, 4, 1–2.
Gross, C., Bender, D., & Rocha-Miranda, C. (1969, December 5). Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science, 166, 1303–1306.
Gauthier, I., Behrmann, M., & Tarr, M. J. (1999). Can face recognition really be dissociated from object recognition? Journal of Cognitive Neuroscience, 11, 349–370.
Grossman, E. D., & Blake, R. (2002). Brain areas active during visual perception of biological motion. Neuron, 36, 1167–1175.
Gauthier, I., Curby, K. M., Skudlarski, P., & Epstein, R. A. (2005). Individual differences in FFA activity suggest independent processing at different spatial scales. Cognitive, Affective, and Behavioral Neuroscience, 5, 222–234.
Gauthier, I., Curran, T., Curby, K. M., & Collins, D. (2003). Perceptual interference supports a non-modular account of face processing. Nature Neuroscience, 6, 428–432.
Gauthier, I., & Palmeri, T. J. (2002). Visual neurons: Categorization-based selectivity. Current Biology, 12, R282–R284.
Gauthier, I., Skudlarski, P., Gore, J. C., & Anderson, A. W. (2000). Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience, 3, 191–197.
Gauthier, I., & Tarr, M. J. (1997). Becoming a “greeble” expert: Exploring mechanisms for face recognition. Vision Research, 37, 1673–1682.
Gauthier, I., & Tarr, M. J. (2002). Unraveling mechanisms for expert object recognition: Bridging brain activity and behavior. Journal of Experimental Psychology: Human Perception and Performance, 28, 431–446.
Gauthier, I., Tarr, M. J., Anderson, A. W., Skudlarski, P., & Gore, J. C. (1999). Activation of the middle fusiform ‘face area’ increases with expertise in recognizing novel objects. Nature Neuroscience, 2, 568–573.
Gauthier, I., Tarr, M. J., Moylan, J., Anderson, A. W., & Gore, J. C. (2000). Does subordinate-level categorization engage the functionally defined fusiform face area? Cognitive Neuropsychology, 17(1/2/3), 143–163.
Gauthier, I., Williams, P., Tarr, M. J., & Tanaka, J. (1998). Training “greeble” experts: A framework for studying expert object recognition processes. Vision Research, 38(15/16), 2401–2428.
Hanson, J., & Halchenko, Y. (2008). Brain reading using full brain support vector machines for object recognition: There is no face identification area. Neural Computation, 20, 486–503.
Hasson, U., Levy, I., Behrmann, M., Hendler, T., & Malach, R. (2001). Eccentricity bias as an organizing principle for human high order object areas. Neuron, 34(3), 479–490.
Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P. (2001, September 28). Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science, 293, 2425–2430.
Henke, K., Schweinberger, S. R., Grigo, A., Klos, T., & Sommer, W. (1998). Specificity of face recognition: Recognition of exemplars of non-face objects in prosopagnosia. Cortex, 34, 289–296.
Hillis, A., & Caramazza, A. (1991). Category-specific naming and comprehension impairment: A double dissociation. Brain, 114, 2081–2094.
Hintzman, D. L. (1986). “Schema abstraction” in a multiple-trace memory model. Psychological Review, 93, 411–428.
Homa, D., Cross, J., Cornell, D., Goldman, D., & Schwartz, S. (1973). Prototype abstraction and classification of new instances as a function of number of instances defining the prototype (concept formation and learning). Journal of Experimental Psychology: Learning Memory and Cognition, 101, 116–122.
Hopkins, R. O., Myers, C. E., Shohamy, D., Grossman, S., & Gluck, M. A. (2004). Impaired probabilistic category learning in hypoxic subjects with hippocampal damage. Neuropsychologia, 42, 524–535.
Houk, J., & Wise, S. (1995). Distributed modular architectures linking basal ganglia, cerebellum, and cerebral cortex: Their role in planning and controlling action. Cerebral Cortex, 5, 95–110.
Hubl, D., Bolte, S., Feineis-Matthews, S., Lanfermann, H., Federspiel, A., Strik, W., et al. (2003). Functional imbalance of visual pathways indicates alternative face processing strategies in autism. Neurology, 61, 1232–1237.
James, K. H., James, T. W., Jobard, G., Wong, A. C.-N., & Gauthier, I. (2005). Letter processing in the visual system: A different activation pattern for single letters and strings. Cognitive, Affective, and Behavioral Neuroscience, 5, 452–466.
Logothetis, N. K., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552–563.
James, T. W., & Gauthier, I. (2003). Auditory and action semantic features activate sensory-specific perceptual brain regions. Current Biology, 13, 1792–1796.
Lombardi, W. J., Andreason, P. J., Sirocco, K. Y., Rio, D. E., Gross, R. E., Umhau, J. C., et al. (1999). Wisconsin Card Sorting Test performance following head injury: Dorsolateral fronto-striatal circuit activity predicts perseveration. Journal of Clinical and Experimental Neuropsychology, 21, 2–16.
Jiang, X., Bradley, E., Rini, R. A., Zeffiro, T., Vanmeter, J., & Riesenhuber, M. (2007). Categorization training results in shape- and category-selective human neural plasticity. Neuron, 53, 891–903.
Johansen, M. K., & Palmeri, T. J. (2002). Are there representational shifts during category learning? Cognitive Psychology, 45, 482–553.
Joyce, C., & Cottrell, G. W. (2004). Solving the visual expertise mystery in connectionist models of cognition and perception II. Paper presented at the proceedings of the eighth neural computation and psychology workshop.
Kamitani, Y., & Tong, F. (2005). Decoding the visual and subjective contents of the human brain. Nature Neuroscience, 8, 679–685.
Kanwisher, N., McDermott, J., & Chun, M. M. (1996). A module for the visual representation of faces. NeuroImage, 3(Suppl. 3), S361.
Kanwisher, N., McDermott, J., & Chun, M. M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.
Love, B., & Gureckis, T. (2007). Models in search of a brain. Cognitive, Affective, and Behavioral Neuroscience, 7(2), 90–108.
Love, B., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: A network model of category learning. Psychological Review, 111, 309–332.
Maddox, W. T., Aparicio, P., Marchant, N. L., & Ivry, R. B. (2005). Rule-based category learning is impaired in patients with Parkinson’s disease but not in patients with cerebellar disorders. Journal of Cognitive Neuroscience, 17, 707–723.
Maddox, W. T., Ashby, F. G., & Bohil, C. (2003). Delayed feedback effects on rule-based and information-integration category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 650–662.
Kinder, A., & Shanks, D. (2001). Amnesia and the declarative/nondeclarative distinction: A recurrent network model of classification, recognition, and repetition priming. Journal of Cognitive Neuroscience, 13, 1–22.
Maddox, W. T., & Filoteo, J. V. (2001). Striatal contributions to category learning: Quantitative modeling of simple linear and complex nonlinear rule learning in patients with Parkinson’s disease. Journal of the International Neuropsychological Society, 7, 710–727.
Klin, A., & Jones, W. (2008). Altered face scanning and impaired recognition of biological motion in a 15-month-old infant with autism. Developmental Science, 11, 40–46.
Marr, D., & Nishihara, H. K. (1978). Representation and recognition of the spatial organization of three-dimensional shapes. Proceedings of the Royal Society of London, B, 200, 269–294.
Knowlton, B. J., Mangels, J. A., & Squire, L. R. (1996, September 6). A neostriatal habit learning system in humans. Science, 273, 1399–1402.
Martin, A. (2007). The representation of object concepts in the brain. Annual Review of Psychology, 58, 25–45.
Knowlton, B. J., & Squire, L. R. (1993, December 10). The learning of categories: Parallel brain systems for item memory and category knowledge. Science, 262, 1747–1749.
Martin, A., Wiggs, C. L., Ungerleider, L. G., & Haxby, J. V. (1996, February 15). Neural correlates of category-specific knowledge. Nature, 379, 649–652.
Knowlton, B. J., Squire, L. R., Paulsen, J. S., Swerdlow, N. R., Swenson, M., & Butters, N. (1996). Dissociations within nondeclarative memory in Huntington’s disease. Neuropsychology, 10, 538–548.
McCandliss, B. D., Cohen, L., & Dehaene, S. (2003). The visual word form area: Expertise for reading in the fusiform gyrus. Trends in Cognitive Sciences, 7, 293–299.
Konishi, S., Nakajima, K., Uchida, I., Kameyama, M., Nakahara, K., Sekihara, K., et al. (1998). Transient activation of inferior prefrontal cortex during cognitive set shifting. Nature Neuroscience, 1, 80–84.
McCarthy, R., & Warrington, E. (1988, August 4). Evidence for modality-specific meaning systems in the brain. Nature, 334, 428–430.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
McKeeff, T., Tong, F., & Gauthier, I. (2007). Perceptual expertise with cars leads to greater perceptual interference with faces but not objects [abstract]. Journal of Vision, 7, 1032.
Lakoff, G. (1987). Women, fire, and dangerous things: What categories reveal about the mind. Chicago: University of Chicago Press.
Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238.
Lamberts, K. (2000). Information-accumulation theory of speeded categorization. Psychological Review, 107, 227–260.
Meeter, M., Myers, C. E., & Gluck, M. A. (2005). Integrating incremental learning and episodic memory models of the hippocampal region. Psychological Review, 112, 560–585.
Lerner, Y., Hendler, T., Ben-Bashat, D., Harel, M., & Malach, R. (2001). A hierarchical axis of object processing stages in the human visual cortex. Cerebral Cortex, 11, 287–297.
Levine, M. (1975). A cognitive theory of learning: Research on hypothesis testing. Hillsdale, NJ: Erlbaum.
Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492–527.
Logan, G. D. (1990). Repetition priming and automaticity: Common underlying mechanisms? Cognitive Psychology, 22, 1–35.
Logan, G. D. (2002). An instance theory of attention and memory. Psychological Review, 109, 376–400.
Logothetis, N. K., & Pauls, J. (1995). Psychophysical and physiological evidence for viewer-centered object representations in the primate. Cerebral Cortex, 5, 270–288.
Logothetis, N. K., Pauls, J., Bülthoff, H. H., & Poggio, T. (1994). View-dependent object recognition in monkeys. Current Biology, 4, 401–414.
Logothetis, N. K., & Sheinberg, D. L. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621.
Meeter, M., Myers, C. E., Shohamy, D., Hopkins, R. O., & Gluck, M. A. (2006). Strategies in probabilistic categorization: Results from a new way of analyzing performance. Learning and Memory, 13, 230–239.
Milner, B. (1963). Effects of different brain lesions on card sorting. Archives of Neurology, 9, 90–100.
Mishkin, M., Malamut, B., & Bachevalier, J. (1984). Memories and habits: Two neural systems. In G. Lynch, J. L. McGaugh, & N. M. Weinberger (Eds.), Neurobiology of learning and memory (pp. 65–77). New York: Guilford Press.
Monchi, O., Petrides, M., Petre, V., Worsley, K., & Dagher, A. (2001). Wisconsin card sorting revisited: Distinct neural circuits participating in different stages of the task identified by event-related functional magnetic resonance imaging. Journal of Neuroscience, 21, 7733–7741.
Moore, C. D., Cohen, M. X., & Ranganath, C. (2006). Neural mechanisms of expert skills in visual working memory. Journal of Neuroscience, 26, 11187–11196.
Moscovitch, M., Winocur, G., & Behrmann, M. (1997). What is special about face recognition? Nineteen experiments on a person with visual object agnosia and dyslexia but normal face recognition. Journal of Cognitive Neuroscience, 9, 555–604.
Murphy, G. (2002). The big book of concepts. Cambridge, MA: MIT Press.
Nobre, A. C., Allison, T., & McCarthy, G. (1994, November 17). Word recognition in the human inferior temporal lobe. Nature, 372, 260–263.
Norman, K. A., Polyn, S. M., Detre, G. J., & Haxby, J. V. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424–430.
Nosofsky, R. M. (1984). Choice, similarity, and the context theory of classification. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 104–114.
Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–61.
Nosofsky, R. M. (1988). Exemplar-based accounts of relations between classification, recognition, and typicality. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 700–708.
Nosofsky, R. M. (1991). Tests of an exemplar model for relating perceptual classification and recognition memory. Journal of Experimental Psychology: Human Perception and Performance, 17, 3–27.
Nosofsky, R. M. (1992). Exemplar-based approach to relating categorization, identification, and recognition. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition. Hillsdale, NJ: Erlbaum.
Op de Beeck, H. P., Wagemans, J., & Vogels, R. (2008). The representation of perceived shape similarity and its role for category learning in monkeys: A modeling study. Vision Research, 48, 598–610.
Palmeri, T. J. (1997). Exemplar similarity and the development of automaticity. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 324–354.
Palmeri, T. J., & Flanery, M. A. (1999). Learning about categories in the absence of training: Profound amnesia and the relationship between perceptual categorization and recognition memory. Psychological Science, 10, 526–530.
Palmeri, T. J., & Flanery, M. A. (2002). Memory systems and perceptual categorization. In B. H. Ross (Ed.), The psychology of learning and motivation (Vol. 41, pp. 141–187). San Diego: Elsevier.
Palmeri, T. J., & Gauthier, I. (2004). Visual object understanding. Nature Reviews: Neuroscience, 5, 291–303.
Palmeri, T. J., & Nosofsky, R. M. (1995). Recognition memory for exceptions to the category rule. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 548–568.
Palmeri, T. J., & Tarr, M. (2008). Visual object perception and long-term memory. In S. Luck & A. Hollingworth (Eds.), Visual memory (pp. 163–207). Oxford University Press.
Palmeri, T. J., Wong, A. C., & Gauthier, I. (2004). Computational approaches to the development of perceptual expertise. Trends in Cognitive Sciences, 8, 378–386.
Peelen, M. V., Wiggett, A., & Downing, P. (2006). Patterns of fMRI activity dissociate overlapping functional brain areas that respond to biological motion. Neuron, 16, 815–822.
Nosofsky, R. M., Clark, S. E., & Shin, H. J. (1989). Rules and exemplars in categorization, identification, and recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 282–304.
Peissig, J. J., Singer, J., Kawasaki, K., & Sheinberg, D. L. (2007). Effects of long-term object familiarity on event-related potentials in the monkey. Cerebral Cortex, 17, 1323–1334.
Nosofsky, R. M., & Johansen, M. K. (2000). Exemplar-based accounts of “multiple-system” phenomena in perceptual categorization. Psychonomic Bulletin and Review, 7, 375–402.
Perrett, D. I., Oram, M. W., & Ashbridge, E. (1998). Evidence accumulation in cell populations responsive to faces: An account of generalisation of recognition without mental transformations. Cognition, 67(1, 2), 111–145.
Nosofsky, R. M., & Kruschke, J. (2001). Single-system models and interference in category learning: Commentary on Waldron and Ashby. Psychonomic Bulletin and Review, 9, 175–180.
Pierce, K., Muller, R. A., Ambrose, J., Allen, G., & Courchesne, E. (2001). Face processing occurs outside the fusiform face area in autism: Evidence from functional MRI. Brain, 124, 2059–2073.
Nosofsky, R. M., Kruschke, J. K., & McKinley, S. C. (1992). Combining exemplar-based category representations and connectionist learning rules. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 211–233.
Plaut, D. C. (1995). Double dissociation without modularity: Evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology, 17, 291–321.
Nosofsky, R. M., & Palmeri, T. J. (1997). An exemplar-based random walk model of speeded classification. Psychological Review, 104, 266–300.
Nosofsky, R. M., & Palmeri, T. J. (1998). A rule-plus-exception model for classifying objects in continuous-dimension spaces. Psychonomic Bulletin and Review, 5, 345–369.
Nosofsky, R. M., Palmeri, T. J., & McKinley, S. C. (1994). Rule-plus-exception model of classification learning. Psychological Review, 101, 53–79.
Nosofsky, R. M., & Zaki, S. R. (1998). Dissociations between categorization and recognition in amnesic and normal individuals: An exemplar-based interpretation. Psychological Science, 9, 247–255.
Op de Beeck, H. P. (2008). Interpreting fMRI data: Maps, modules and dimensions. Nature Reviews Neuroscience, 9, 123–135.
Op de Beeck, H. P., Baker, C. I., DiCarlo, J. J., & Kanwisher, N. G. (2006). Discrimination training alters object representations in human extrastriate cortex. Journal of Neuroscience, 26, 13025–13036.
Poggio, T., & Edelman, S. (1990, January 18). A network that learns to recognize three-dimensional objects. Nature, 343, 263–266.
Poldrack, R. A., Clark, J., Pare-Blagoev, E. J., Shohamy, D., Creso Moyano, J., Myers, C., et al. (2001). Interactive memory systems in the human brain. Nature, 29, 546–550.
Poldrack, R. A., Prabhakaran, V., Seger, C. A., & Gabrieli, J. D. (1999). Striatal activation during acquisition of a cognitive skill. Neuropsychology, 13, 564–574.
Poldrack, R. A., & Rodriguez, P. (2004). How do memory systems interact? Evidence from human classification learning. Neurobiology of Learning and Memory, 82, 324–332.
Polk, T. A., Stallcup, M., Aguirre, G. K., Alsop, D. C., D’Esposito, M., Detre, J. A., et al. (2002). Neural specialization for letter recognition. Journal of Cognitive Neuroscience, 14, 145–159.
Posner, M. I., & Keele, S. W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353–363.
Op de Beeck, H. P., & Vogels, R. (2000). Spatial sensitivity of macaque inferior temporal neurons. Journal of Comparative Neurology, 426, 505–518.
Puce, A., Allison, T., Asgari, M., Gore, J. C., & McCarthy, G. (1996). Differential sensitivity of human visual cortex to faces, letterstrings, and textures: A functional magnetic resonance imaging study. Journal of Neuroscience, 16, 5205–5215.
Op de Beeck, H. P., Wagemans, J., & Vogels, R. (2001). Inferotemporal neurons represent low-dimensional configurations of parameterized shapes. Journal of Neuroscience, 4, 1244–1252.
Puce, A., Allison, T., Bentin, S., Gore, J. C., & McCarthy, G. (1998). Temporal cortex activation in humans viewing eye and mouth movements. Journal of Neuroscience, 18, 2188–2199.
Quinn, P. C. (1999). Development of recognition and categorization of objects and their spatial relations in young infants. In L. Balter & C. S. Tamis-LeMonda (Eds.), Child psychology: A handbook of contemporary issues. Philadelphia: Psychology Press/Taylor & Francis.
Reber, P., Gitelman, D., Parrish, T., & Mesulam, M. (2003). Dissociating explicit and implicit category knowledge with fMRI. Journal of Cognitive Neuroscience, 15, 574–583.
Reed, J. M., Squire, L. R., Patalano, A. L., Smith, E. E., & Jonides, J. (1999). Learning about categories that are defined by object-like stimuli despite impaired declarative memory. Behavioral Neuroscience, 113, 411–419.
Riesenhuber, M., & Poggio, T. (1999). Hierarchical models of object recognition in cortex. Nature Neuroscience, 2, 1019–1025.
Riesenhuber, M., & Poggio, T. (2000). Models of object recognition. Journal of Neuroscience, 3(Suppl), 1199–1204.
Riesenhuber, M., & Poggio, T. (2002). Neural mechanisms of object recognition. Current Opinion in Neurobiology, 12, 162–168.
Robinson, A., Heaton, R., Lehan, R., & Stilson, D. (1980). The utility of the Wisconsin Card Sorting Test in detecting and localizing frontal lobe lesions. Journal of Consulting and Clinical Psychology, 48, 605–614.
Roediger, H., Buckner, R., & McDermott, K. (1999). Components of processing. In J. K. Foster & M. Jelicic (Eds.), Memory: Systems, process, or function? (pp. 31–65). Oxford: Oxford University Press.
Rolls, E. (1992). Neurophysiology and functions of the primate amygdala. In J. Aggleton (Ed.), The amygdala (pp. 143–165). New York: Wiley-Liss.
Sakamoto, Y., & Love, B. (2004). Schematic influences on category learning and recognition memory. Journal of Experimental Psychology: General, 133, 534–553.
Sanocki, T. (1987). Visual knowledge underlying letter perception: Font-specific, schematic tuning. Journal of Experimental Psychology: Human Perception and Performance, 13, 267–278.
Sanocki, T. (1988). Font regularity constraints on the process of letter recognition. Journal of Experimental Psychology: Human Perception and Performance, 14, 472–480.
Schultz, R. T., Gauthier, I., Klin, A., Fulbright, R. K., Anderson, A. W., Volkmar, F., et al. (2000). Abnormal ventral temporal cortical activity during face discrimination among individuals with autism and Asperger syndrome. Archives of General Psychiatry, 37, 331–340.
Schyns, P. G., & Rodet, L. (1997). Categorization creates functional features. Journal of Experimental Psychology: Learning, Memory and Cognition, 23, 681–696.
Scott, L. S., Tanaka, J. W., Sheinberg, D. L., & Curran, T. (2006). A reevaluation of the electrophysiological correlates of expert object processing. Journal of Cognitive Neuroscience, 18, 1453–1465.
Seger, C. A., & Cincotta, C. M. (2005). Dynamics of frontal, striatal, and hippocampal systems during rule learning. Cerebral Cortex, 16, 1546–1555.
Sergent, J., Ohta, S., MacDonald, B., & Zuck, E. (1994). Segregated processing of facial identity and emotion in the human brain: A PET study. Visual Cognition, 1(2/3), 349–369.
Rosch, E. (1973). On the internal structure of perceptual and semantic categories. In T. E. Moore (Ed.), Cognitive development and the acquisition of language (pp. 111–144). San Diego, CA: Academic Press.
Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, England: Cambridge University Press.
Rosch, E., & Mervis, C. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 573–605.
Shepard, R. N. (1987, September 11). Toward a universal law of generalization for psychological science. Science, 237, 1317–1323.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. (1976). Basic objects in natural categories. Cognitive Psychology, 8, 382–439.
Shepard, R. N. (1994). Perceptual-cognitive universals as reflections of the world. Psychonomic Bulletin and Review, 1, 2–28.
Rosseel, Y. (2002). Mixture models of categorization. Journal of Mathematical Psychology, 46, 178–210.
Rossion, B., Collins, D., Goffaux, V., & Curran, T. (2007). Long-term expertise with artificial objects increases visual competition with early face categorization processes. Journal of Cognitive Neuroscience, 19, 543–555.
Rossion, B., Gauthier, I., Goffaux, V., Tarr, M. J., & Crommelinck, M. (2002). Expertise training with novel objects leads to left lateralized face-like electrophysiological responses. Psychological Science, 13, 250–257.
Rossion, B., Gauthier, I., Tarr, M. J., Despland, P., Bruyer, R., Linotte, S., et al. (2000). The N170 occipito-temporal component is delayed and enhanced to inverted faces but not to inverted objects: An electrophysiological account of face-specific processes in the human brain. NeuroReport, 11, 69–74.
Sheridan, J., & Humphreys, G. (1993). A verbal-semantic category-specific recognition impairment. Cognitive Neuropsychology, 10, 143–184.
Shin, H. J., & Nosofsky, R. (1992). Similarity-scaling studies of dot-pattern classification and recognition. Journal of Experimental Psychology: General, 121, 278–304.
Shohamy, D., Myers, C., Kalanithi, J., & Gluck, M. (2008). Basal ganglia and dopamine contributions to probabilistic category learning. Neuroscience and Biobehavioral Reviews, 32(2), 219–236.
Sigala, N., Gabbiani, F., & Logothetis, N. K. (2002). Visual categorization and object representation in monkeys and humans. Journal of Cognitive Neuroscience, 14, 1–12.
Sigala, N., & Logothetis, N. K. (2002, January 17). Visual categorization shapes feature selectivity in the primate temporal cortex. Science, 415, 318–320.
Smith, E. E., Patalano, A. L., & Jonides, J. (1998). Alternative strategies of categorization. Cognition, 65, 167–196.
Rossion, B., Kung, C. C., & Tarr, M. J. (2004). Visual expertise with nonface objects leads to competition with the early perceptual processing of faces in the human occipitotemporal cortex. Proceedings of the National Academy of Sciences, USA, 101, 14521–14526.
Smith, E., Shoben, E., & Rips, L. (1974). Structure and process in semantic memory: A featural model for semantic decisions. Psychological Review, 81, 214–241.
Rotshtein, P., Henson, R., Treves, A., Driver, J., & Dolan, R. (2005). Morphing Marilyn into Maggie dissociates physical and identity face representations in the brain. Nature Neuroscience, 8, 107–113.
Smith, J. D., & Minda, J. P. (1998). Prototypes in the mist: The early epochs of category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 1411–1436.
Rumiati, R. I., & Humphreys, G. W. (1997). Visual object agnosia without alexia or prosopagnosia: Arguments for separate knowledge stores. Visual Cognition, 4, 207–217.
Smith, J. D., & Minda, J. P. (2001). Journey to the center of the category: The dissociation in amnesia between categorization and recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 984–1002.
Sacchett, C., & Humphreys, G. (1992). Calling a squirrel a squirrel but a canoe a wigwam: A category specific deficit for artefactual objects and body parts. Cognitive Neuropsychology, 9, 73–86.
Saint-Cyr, J. A., Taylor, A. E., & Lang, A. E. (1988). Procedural learning and neostriatal dysfunction in man. Brain, 111, 941–960.
Smith, P., & Ratcliff, R. (2004). Psychology and neurobiology of simple decisions. Trends in Neuroscience, 27, 161–168.
Spiridon, M., & Kanwisher, N. (2002). How distributed is visual category information in human occipito-temporal cortex? An fMRI study. Neuron, 35, 1157–1165.
Ullman, S., Vidal-Naquet, M., & Sali, E. (2002). Visual features of intermediate complexity and their use in classification. Journal of Neuroscience, 5, 682–687.
Squire, L. R. (2004). Memory systems of the brain: A brief history and current perspective. Neurobiology of Learning and Memory, 82, 171–177.
Vogels, R., Biederman, I., Bar, M., & Lorincz, A. (2001). Inferior temporal neurons show greater sensitivity to nonaccidental than to metric shape differences. Journal of Cognitive Neuroscience, 13, 444–453.
Squire, L. R., & Knowlton, B. (1995). Learning about categories in the absence of memory. Proceedings of the National Academy of Sciences, USA, 92, 12470–12474.
Squire, L. R., & Zola, S. M. (1996). Structure and function of declarative and nondeclarative memory systems. Proceedings of the National Academy of Sciences, USA, 93, 13515–13522.
Stankiewicz, B. J. (2002). Empirical evidence for independent dimensions in the visual representation of three-dimensional shape. Journal of Experimental Psychology: Human Perception and Performance, 28, 913–932.
Tanaka, J. W., & Curran, T. (2001). A neural basis for expert object recognition. Psychological Science, 12, 43–47.
Tanaka, J. W., & Sengco, J. A. (1997). Features and their configuration in face recognition. Memory and Cognition, 25, 583–592.
Tanaka, J. W., & Taylor, M. (1991). Object categories and expertise: Is the basic level in the eye of the beholder? Cognitive Psychology, 23, 457–482.
Tanaka, K. (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience, 19, 109–139.
Tanaka, K. (2003). Columns for complex visual object features in the inferotemporal cortex: Clustering of cells with similar but slightly different stimulus selectivities. Cerebral Cortex, 13, 90–99.
Tarr, M. J. (1995). Rotating objects to recognize them: A case study of the role of viewpoint dependency in the recognition of three-dimensional objects. Psychonomic Bulletin and Review, 2, 55–82.
Tarr, M. J., & Bülthoff, H. H. (1998). Image-based object recognition in man, monkey and machine. Cognition, 67(1–2), 1–20.
Tarr, M. J., & Bülthoff, H. H. (1995). Is human object recognition better described by geon-structural-descriptions or by multiple-views? Journal of Experimental Psychology: Human Perception and Performance, 21, 1494–1505.
Tarr, M. J., Kersten, D., & Bülthoff, H. H. (1998). Why the visual system might encode the effects of illumination. Vision Research, 38(15/16), 2259–2275.
Tarr, M. J., & Pinker, S. (1989). Mental rotation and orientation-dependence in shape recognition. Cognitive Psychology, 21, 233–282.
Tarr, M. J., Williams, P., Hayward, W. G., & Gauthier, I. (1998). Three-dimensional object recognition is viewpoint dependent. Journal of Neuroscience, 1, 275–277.
Waldron, E. M., & Ashby, F. G. (2001). The effects of concurrent task interference on category learning: Evidence for multiple category learning systems. Psychonomic Bulletin and Review, 8, 168–176.
Warrington, E., & McCarthy, R. (1983). Category specific access dysphasia. Brain, 106, 859–878.
Warrington, E., & McCarthy, R. (1987). Categories of knowledge: Further fractionations and an attempted integration. Brain, 110, 1273–1296.
Warrington, E., & Shallice, T. (1980). Word-form dyslexia. Brain, 103, 99–112.
Warrington, E., & Shallice, T. (1984). Category specific semantic impairments. Brain, 107, 829–854.
Williams, N. R. (2007). Competition between domains of expertise in a visual search task [abstract]. Journal of Vision, 7, 335.
Wilson, F., Ó Scalaidhe, S., & Goldman-Rakic, P. (1993, June 25). Dissociation of object and spatial processing domains in primate prefrontal cortex. Science, 260, 1955–1958.
Wittgenstein, L. (1953). Philosophical investigations. Oxford: Blackwell.
Wong, A. C.-N. (2007). The effect of different training experiences on object recognition in the visual system. Nashville, TN: Vanderbilt University.
Wong, A. C.-N., & Gauthier, I. (in press). An analysis of letter expertise in a levels-of-categorization framework. Visual Cognition.
Wong, A. C.-N., Gauthier, I., Woroch, B., DeBuse, C., & Curran, T. (2005). An early electrophysiological response associated with expertise in letter perception. Cognitive, Affective, and Behavioral Neuroscience, 5, 306–318.
Wong, A. C.-N., Jobard, G., James, K. H., James, T. W., & Gauthier, I. (submitted). Expertise with characters in alphabetic and non-alphabetic writing systems engage the same occipito-temporal area.
Xu, Y. (2005). Revisiting the role of the fusiform face area in visual expertise. Cerebral Cortex, 15, 1234–1242.
Xue, G., & Poldrack, R. (2007). The neural substrates of orthographic learning: Implication for the VWFA hypothesis. Journal of Cognitive Neuroscience, 19, 1643–1655.
Yue, X., Tjan, B. S., & Biederman, I. (2006). What makes faces special? Vision Research, 46, 3802–3811.
Young, A. W., Hellawell, D., & Hay, D. (1987). Configural information in face perception. Perception, 10, 747–759.
Tong, M., Joyce, C., & Cottrell, G. (2007). Why is the fusiform face area recruited for novel categories of expertise? A neurocomputational investigation. Brain Research, 1202, 14–24.
Zacks, J. M., Speer, N. K., Swallow, K. M., Braver, T. S., & Reynolds, J. R. (2007). Event perception: A mind-brain perspective. Psychological Bulletin, 133, 273–293.
Tovee, M. J., Rolls, E. T., & Azzopardi, P. (1994). Translation invariance in the responses to faces of single neurons in the temporal visual cortical areas of the alert macaque. Journal of Neurophysiology, 72, 1049–1060.
Zaki, S. R. (2005). Is categorization really intact in amnesia? A meta-analysis. Psychonomic Bulletin and Review, 11, 1048–1054.
Trabasso, T., & Bower, G. (1968). Attention in learning: Theory and research. New York: Wiley.
Tsao, D. Y., Freiwald, W. A., Tootell, R., & Livingstone, M. (2006, February 3). A cortical region consisting entirely of face-selective cells. Science, 311, 670–674.
Turk-Browne, N., Yi, D.-J., & Chun, M. M. (2006). Linking implicit and explicit memory: Common encoding factors and shared representations. Neuron, 49, 917–927.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327–352.
Zaki, S. R., & Nosofsky, R. M. (2001). A single-system interpretation of dissociations between recognition and categorization in a task involving object-like stimuli. Cognitive, Affective and Behavioral Neuroscience, 1, 344–359.
Zaki, S. R., & Nosofsky, R. M. (2004). False prototype enhancement effects in dot pattern categorization. Memory and Cognition, 32, 390–398.
Zaki, S. R., Nosofsky, R. M., Ramercad, R., & Unverzagt, F. (2003). Categorization and recognition performance in probable Alzheimer’s disease: Evidence for single-system models. Journal of the International Neuropsychological Society, 9, 394–406.
Chapter 21
Cognitive Neuroscience of Thinking
VINOD GOEL
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
The study of thinking in psychology is distributed over three largely independent branches: problem solving, reasoning, and judgment and decision making. These domains are delineated by the type of tasks they study and the underlying formal apparatus they appeal to in their explanatory framework. The problem-solving literature (Newell & Simon, 1972) studies tasks such as cryptarithmetic, theorem proving, Tower of Hanoi, and also more open-ended, real-world problems such as planning, design, and even scientific induction, among others. The basic theoretical framework is one of search through a problem space using the formal apparatus of production rules (and more generally, recursive function theory). The reasoning literature (Evans, Newstead, & Byrne, 1993) is largely focused on deductive inference tasks and draws on the formal apparatus of deductive inference. The judgment and decision-making literature (Kahneman, Slovic, & Tversky, 1988) uses such tasks as the base rate fallacy, conjunction fallacy, and so on, and draws on the formal apparatus of probability theory. The goal of these psychological enterprises is to articulate the underlying cognitive mechanisms of thinking. Unfortunately, there is little or no communication across the subdomains.
Neuropsychology has focused on measuring the impact of brain injury and disease on various aspects of thinking, using IQ and memory tests, along with numerous specifically developed tasks (Lezak, 1995). Some of these tasks have directly targeted complex cognitive processes. For example, card sorting (Milner, 1963), word similarity, proverbs (Rylander, 1939), and word definition tasks have been used to measure abstraction and generalization ability. Nonsense drawing (Smith & Milner, 1988) and word generation tasks have been used to measure nonverbal and verbal fluency, respectively. Shell games have been used to measure rule/pattern induction (McCarthy & Warrington, 1990). Choice reaction time studies have been used to measure the use of advance information (Alivisatos & Milner, 1989). The Tower of London has been used to measure look ahead/anticipatory abilities (Shallice, 1982) and cognitive estimation has been used to measure judgment (Shallice & Evans, 1978). Other studies have been designed to evaluate processes that may modulate complex cognitive processes. For example, Stroop-type tasks have been used to measure selective attention (Perret, 1974), while boring/monotonous tasks have been used to measure sustained attention (Wilkins, Shallice, & McCarthy, 1987). Maze tracing has been used to measure instruction following (Corkin, 1965). Drawing tasks have been used to measure perseveration (Goldberg & Bilder, 1987). The A-not-B (Diamond, 1990) and the Antisaccade tasks (Roberts, Hager, & Heron, 1994) have been used to study inhibitory mechanisms.
Traditionally, there has not been a large overlap between the neuropsychology instruments and the tasks used in cognitive psychology. For example, logical reasoning tasks and judgment and decision-making tasks have been largely absent from the lesion literature. In terms of problem-solving tasks, only the Towers of London and Hanoi tasks are extensively used. However, over the past 15 to 20 years, crossover fertilization between neuropsychology and cognitive psychology is beginning to alter the landscape. Neuroimaging and patient studies of reasoning (Acuna, Eliassen, Donoghue, & Sanes, 2002; Christoff et al., 2001; Goel, 2005; Goel, Buchel, Frith, & Dolan, 2000; Goel & Dolan, 2003; Goel, Gold, Kapur, & Houle, 1997; Goel, Shuren, Sheesley, & Grafman, 2004; Houde et al., 2000; Knauff, Fangmeier, Ruff, & Johnson-Laird, 2003; Monti, Osherson, Martinez, & Parsons, 2007; Noveck, Goel, & Smith, 2004; Osherson et al., 1998; Parsons & Osherson, 2001; Prado & Noveck, 2007), problem solving (Cardoso & Parks, 1998; Carlin et al., 2000; Colvin, Dunbar, & Grafman, 2001; Fincham, Carter, van Veen, Stenger, & Anderson, 2002; Goel, 2002; Goel & Grafman, 1995, 2000; Goel, Grafman, Tajik, Gana, & Danto, 1997; Goel, Pullara, & Grafman, 2001; Morris, Miotto, Feigenbaum, Bullock, & Polkey, 1997; Newman, Carpenter, Varma, & Just, 2003; Owen, Doyon, Petrides, & Evans, 1996; Rowe, Owen, Johnsrude, & Passingham, 2001), and decision making (Bechara, Damasio, Damasio, & Anderson, 1994; Bechara, Damasio, Tranel, & Damasio, 2005; De Neys & Goel, in press;
De Neys, Vartanian, & Goel, 2008; Fellows & Farah, 2003; Glimcher, 2002; Manes et al., 2002; Paulus et al., 2001; Sanfey, Rilling, Aronson, Nystrom, & Cohen, 2003; Tranel, Bechara, & Denburg, 2002; Wolford, Miller, & Gazzaniga, 2000) have generated considerable data on the neural systems underlying these processes. A direct review of this literature would consist of lists of neural activations associated with the different tasks in the various studies. While this may be of some pedagogical value, it is not clear that it would deepen or enrich our understanding of the underlying cognitive systems. Goel, (2007) provides such a summary for the deductive reasoning literature. My goals, in this review, are somewhat different. I want to (a) discuss neuropsychology’s contributions to the three domains within a common framework and (b) do so in a way that is relevant and informative to the development of cognitive theories of thinking. In terms of the former goal, I propose to organize the data generated from recent neuroimaging and lesion studies into the following three themes derived from the behavioral literature on reasoning, decision making, and problem solving: (a) heuristics versus formal/universal methods; (b) conflict detection and inhibition; and (c) ill-structured versus well-structured problems. These themes crop up in each of the three areas to varying degrees. In terms of the latter goal, I have previously argued (Goel, 2005) that the most immediate and valuable contribution that cognitive neuroscience can make to the understanding of cognitive processes is not in terms of listing a series of neural activations but in terms of identifying double dissociations (see Appendix A). Double dissociations are indicative of causal joints in the neural machinery (Shallice, 1988) that cognitive theories will ultimately need to respect. 
Putting these two aspects together, I argue that the neuropsychological evidence indicates dissociations respecting the three themes identified from the cognitive literature and suggests that these themes/issues pick out real causal joints in the neural machinery that cognitive theories will need to respect. As such, this review is selective and nonexhaustive.
HEURISTICS VERSUS FORMAL/UNIVERSAL PROCESSES

The distinction between heuristic and formal/universal processes is an important, common theme in the problem-solving, decision-making, and reasoning literatures. Within the modern cognitive literature, a useful place to begin tracing the distinction is with Simon’s notion of bounded rationality (Simon, 1981, 1983) and the incorporation of this idea into Newell and Simon’s (1972) models of human problem solving. The key idea was the introduction of the
notion of the problem space, a computational modeling space shaped by the constraints imposed by the structures of a time- and memory-bound serial information processing system and the task environment. The built-in strategies for searching this problem space include such content-free universal methods as Means Ends Analysis, Breadth First Search, Depth First Search, and so on. But the universal applicability of these methods comes at the cost of enormous computational resources. Given that the cognitive agent is a time-and-memory bound serial processor, it would often not be able to respond in real time, if it had to rely on formal, context independent processes. So the first line of defense for such a system is the deployment of task-specific knowledge to circumvent formal search procedures. Consider the following example: I arrive at the airport in Paris and need to make a telephone call before catching my connecting flight in an hour. I notice that the public telephones require a special calling card. The airport is a multistory building with shops on several floors. If I know nothing about France, I could start on the top floor at one end of the building, enter a store and ask for a telephone card. If I find one, I can terminate my search and make my phone call. If I don’t, I can proceed to the next store, continuing until I have visited every store on the floor (or found a telephone card). I could then go down to the next floor and continue in the same fashion. Following this breadth first (British Museum) search strategy, I will systematically visit each store and find a telephone card if one of them sells it. The search will terminate when I have found the telephone card or visited the last store. This may take several hours, and I may miss my connecting flight. However, I may have a specific piece of knowledge about France that may help me circumvent this search: Telephone cards are sold by Tabac Shops. 
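The contrast that this airport example draws, between an exhaustive content-free search and a shortcut based on one piece of task-specific knowledge, can be made concrete with a minimal sketch. The shop names, floor layout, and function names below are all hypothetical and are not taken from the original example; the sketch only counts how many shops each strategy has to visit.

```python
# Hypothetical airport: floors from top to bottom, each a list of
# (shop name, shop type, sells phone cards). All entries are invented.
AIRPORT = [
    [("Chocolats du Nord", "confectioner", False), ("Presse Plus", "newsagent", False)],
    [("Mode Express", "clothing", False), ("Tabac Central", "tabac", True)],
    [("Cafe Ciel", "cafe", False)],
]


def exhaustive_search(floors):
    """Content-free strategy: enter shops in a fixed order until a card is found."""
    visits = 0
    for floor in floors:
        for name, _kind, sells_card in floor:
            visits += 1
            if sells_card:
                return name, visits
    return None, visits


def knowledge_based_search(floors, shop_type="tabac"):
    """Task-specific strategy: consult the directory and go to one shop type.

    Assumes the belief "this shop type sells phone cards" is true; the
    function never checks it, just as the traveler would not.
    """
    directory = {kind: name for floor in floors for name, kind, _sells in floor}
    name = directory.get(shop_type)
    return name, (1 if name is not None else 0)


print(exhaustive_search(AIRPORT))
print(knowledge_based_search(AIRPORT))
print(knowledge_based_search(AIRPORT, "sock shop"))
```

With this invented layout the exhaustive strategy enters four shops before finding a card, while the knowledge-based strategy goes to one. Asking the second function for a shop type that is not in the directory returns nothing, which is one way to picture how situation specific such knowledge is.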
If I know this, I merely have to search the directory of shops, find the Tabac shop and go directly there, circumventing the search procedure. Notice this knowledge is very powerful, but very situation specific. It will not help me find a pair of socks in Paris or make a telephone call in Delhi. On this account, heuristics are situation specific, learned, and consciously applied procedures, not the type of things one makes a science out of. So not surprisingly, much of the subsequent research effort of this program was devoted to developing formal search algorithms and computational constraints on cognitive architecture (Newell, 1990) rather than heuristic procedures. The work of Tversky and Kahneman (1974) in the judgment and decision-making literature can be viewed as a development of the heuristic branch of the cognitive system. Perhaps their most important contribution was the identification of a number of content general “biases” or fallacies that we are all subject to, such as the conjunction fallacy and base rate fallacy (Gilovich & Kahneman, 2002;
Kahneman et al., 1988; Kahneman & Tversky, 1996). Here is an example of the latter (Kahneman & Tversky, 1973): A psychologist wrote thumbnail descriptions of a sample of 100 participants consisting of 15 engineers and 85 lawyers. The description that follows was chosen at random from the 100 available descriptions: Jack is a 45-year-old man. He is married and has four children. He is generally conservative, careful, and ambitious. He shows no interest in political and social issues and spends most of his free time on his many hobbies that include home carpentry, sailing, and mathematical puzzles.
Which one of the following statements is most likely?

1. Jack is an engineer.
2. Jack is a lawyer.
3. Equally likely that Jack is an engineer or lawyer.

In such problems, the correct normative answers are given by the base rates, so the response should be 2 in this example. However, the accompanying description engages heuristic processes that often override the normative response, so many subjects will respond 1 based on the description. Kahneman and Tversky identified several heuristics including representativeness, availability, and anchoring and adjustment to explain such fallacies. The one at work in this example is “representativeness.” The description is more representative of a typical engineer than a lawyer and this often overrides the base rate information. Similar phenomena have been identified in the logical reasoning literature under the guise of the belief-bias effect. Logical arguments with believable conclusions are accepted much more readily than arguments with unbelievable conclusions (Wilkins, 1928). For example, the following valid argument with a believable conclusion is accepted as valid 96% of the time:

No cigarettes are inexpensive.
Some addictive things are inexpensive.
∴ Some addictive things are not cigarettes.

By contrast, a logically identical argument, but with an unbelievable conclusion, is accepted as valid only 46% of the time (Evans, Barston, & Pollard, 1983):

No addictive things are inexpensive.
Some cigarettes are inexpensive.
∴ Some cigarettes are not addictive.

The explanation is that instead of engaging in formal logical analysis, subjects are falling victim to a believability
heuristic that leads them astray in this particular case (Evans, 2003; Evans & Over, 1996; Sloman, 1996). Although there are some interesting and important differences, in terms of ontological commitments, in the development of these ideas in the three literatures, they are beyond the scope of this review. The important point for our purposes is the distinction between using knowledge and beliefs to solve a problem versus using more general or “universal,” content-free procedures. This is an important theme in all three areas under consideration. The neuropsychological data provide compelling evidence for these two systems in terms of double dissociations respecting the distinction. In the balance of this section, I review studies where the experimental design involves the manipulation between heuristic and formal strategies. In the next section, studies involving explicit conflict between the two strategies are considered. Perhaps the most extensive evidence for anatomical dissociation along the heuristic and formal dimension comes from neuroimaging work on deductive reasoning. Goel, Buchel, et al. (2000) presented subjects with logical arguments containing familiar content (i.e., propositions that they would have beliefs about) such as:

All dogs are pets.
All poodles are dogs.
∴ All poodles are pets.

and logically identical arguments lacking any meaningful content (i.e., subjects can have no beliefs about the truth or falsity of these propositions) such as:

All P are B.
All C are P.
∴ All C are B.

These studies indicate that two distinct systems are involved in reasoning about familiar and unfamiliar material. More specifically, a left lateralized frontal-temporal conceptual/language system processes familiar, conceptually coherent material, corresponding to the heuristic system, while a bilateral parietal visuospatial system processes unfamiliar, nonconceptual material, corresponding to the formal/universal system (see Figure 21.1).
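What "content-free" means for arguments like these can be illustrated with a small sketch. It decides validity purely from the form of the statements, by trying every way of populating the three terms and looking for a counterexample, so it gives the same verdict whether the terms are read as dogs, pets, and poodles or as P, B, and C. The function names and the integer encoding of terms are assumptions made for this illustration; the sketch is not a model of the cognitive or neural processes discussed here, and it uses the modern reading in which "All X are Y" is true when there are no X.

```python
from itertools import product

# A "kind of individual" is a triple of booleans saying whether it belongs
# to term 0, term 1, and term 2. A world is any set of kinds that exist.
KINDS = list(product([False, True], repeat=3))


def all_are(x, y):
    """'All x are y': every existing kind that is x is also y."""
    return lambda world: all(kind[y] for kind in world if kind[x])


def some_are(x, y):
    """'Some x are y': at least one existing kind is both x and y."""
    return lambda world: any(kind[x] and kind[y] for kind in world)


def no_are(x, y):
    """'No x are y': no existing kind is both x and y."""
    return lambda world: not any(kind[x] and kind[y] for kind in world)


def some_are_not(x, y):
    """'Some x are not y': at least one existing kind is x but not y."""
    return lambda world: any(kind[x] and not kind[y] for kind in world)


def is_valid(premises, conclusion):
    """Valid means no world makes every premise true and the conclusion false."""
    for present in product([False, True], repeat=len(KINDS)):
        world = [kind for kind, here in zip(KINDS, present) if here]
        if all(p(world) for p in premises) and not conclusion(world):
            return False
    return True


C, P, B = 0, 1, 2  # arbitrary labels; could equally be poodles, dogs, pets

# All P are B. All C are P. Therefore all C are B.
print(is_valid([all_are(P, B), all_are(C, P)], all_are(C, B)))

# All P are B. All C are B. Therefore all C are P. (a different form)
print(is_valid([all_are(P, B), all_are(C, B)], all_are(C, P)))
```

The first call reports the form used in both example arguments as valid, and the second reports a superficially similar form as invalid. Nothing in the procedure consults what the terms mean, which is the sense in which a formal method ignores believability.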
The involvement of the left frontal-temporal system in reasoning about familiar or meaningful content has also been demonstrated in neurological patients with focal unilateral lesions to the prefrontal cortex (parietal lobes intact) using the Wason card selection task (see Figure 21.2; Goel, Shuren, et al., 2004). These patients performed as well as normal controls on the arbitrary version of the task, but unlike the normal controls they failed to benefit from the presence of familiar content in the meaningful version of the task. In fact, consistent with the neuroimaging data, the latter result was driven by the exceptionally poor performance of patients with left frontal lobe lesions. Patients with lesions to the right prefrontal cortex performed as well as normal controls.

Figure 21.1 Dissociation of systems involved in heuristic and formal reasoning processes. Panels: (A) Reasoning with familiar material; (B) Reasoning with unfamiliar material. Note: A: Reasoning about familiar material (All apples are red fruit. All red fruit are nutritious. All apples are nutritious) activates a left frontal (BA 47) temporal (BA 21/22) system. B: Reasoning about unfamiliar material (All A are B. All B are C. All A are C) activates bilateral parietal lobes (BA 7, 40) and the dorsal prefrontal cortex (BA 6). From “Dissociation of Mechanisms Underlying Syllogistic Reasoning,” by V. Goel, C. Buchel, C. Frith, and R. J. Dolan, 2000, NeuroImage, 12, p. 510. Reprinted with permission.

Figure 21.2 Wason card selection task. Arbitrary condition: cards M, A, 8, 3; rule: “if a card has a vowel on one side, then it has an even number on the other side.” Familiar content condition: cards Beer, 7-Up, 12, 25; rule: “if someone is drinking beer, then they’re at least 21 years old.” Note: The Wason card selection task is perhaps the most widely used task in the cognitive literature to assess content effects in reasoning. In the original (or arbitrary) version of the task, four cards are placed on a table, each having a letter on one side and a number on the other side. Two letters and two numbers are visible. The task is to determine which cards need to be turned over in order to verify the rule “if a card has a vowel on one side, then it has an even number on the other side.” (The correct answer is to turn over the vowel card, A, and the odd-number card, 3. The A card provides confirmation for the statement while the 3 card provides evidence of disconfirmation. Information on the other two cards is irrelevant.) The basic finding is that if the task involves arbitrary material, as in the first example, performance is relatively poor (25% to 30%). However, the presence of meaningful content (e.g., “if someone is drinking beer, then they are at least 21 years old”), as in the second example, dramatically improves performance (90% to 95%).

There is even some evidence to suggest that the response of the frontal-temporal system to familiar situations may be content-specific to some degree (in keeping with some content specificity in the organization of temporal lobes; McCarthy & Warrington, 1990). One source of evidence comes from a study involving landmarks in familiar and unfamiliar environments (Goel, Makale, & Grafman, 2004). Arguments such as:
Paris is south of London.
London is south of Edinburgh.
∴ Paris is south of Edinburgh.

involving propositions that subjects would have beliefs about (as confirmed by a postscan questionnaire), were compared with arguments such as:

The AI lab is south of the Roth Center.
Roth Center is south of Cedar Hall.
∴ The AI lab is south of Cedar Hall.

containing propositions that subjects could not have beliefs about because they describe a fictional, unknown environment. These stimuli not only resulted in a dissociation between a frontal-temporal system for the familiar environments and a parietal system for the unfamiliar environments, but the temporal lobe activation in the former case also included posterior hippocampus and parahippocampal gyrus, regions implicated in spatial memory and navigation tasks. These data provide support for the generalization of the previous results to transitive reasoning and indicate variability in temporal lobe activation as a function of content. Perhaps the most studied example of content specificity in the organization of
the heuristic system is the Theory of Mind reasoning system identified by a number of studies (Fletcher et al., 1995; Goel, Grafman, Sadato, & Hallet, 1995). There is at least one study (De Neys & Goel, in press) that suggests a similar breakdown for problems from the decision-making literature, in particular the lawyer-engineer type base rate problems described previously. Participants were scanned while they solved lawyer-engineer type problems. Results showed that, just as during deductive reasoning, belief-mediated decisions based on stereotypical descriptions activate a left temporal lobe system whereas a bilateral parietal system is activated when the response is in line with the base rates. While the hypothesis for dissociation of heuristic and universal/formal processes at the neural level has not been tested directly with tasks from the problem-solving literature, there is at least some suggestive data, and at least one theory of frontal lobe functioning predicts that this should be the case. Grafman (2002; Wood & Grafman, 2003) has long maintained that the prefrontal cortex is critical for the storage and retrieval of large-scale knowledge structures generally called “scripts.” Scripts guide our behavior in daily, routine situations like going to work, ordering a meal at the restaurant, going shopping, and so on. There are many studies showing patients with lesions to the prefrontal cortex often show greater impairment in problem-solving tasks involving real world knowledge than in the more abstract neuropsychological test batteries. One such task tapping simple real world knowledge is the scripts task (Sirigu, Zalla, Pillon, Grafman, Agid, et al., 1995; Sirigu, Zalla, Pillon, Grafman, Dubois, et al., 1995). Subjects are given familiar situations and asked to list what they would do. For example, “you’re going on a date tonight, list all the things that you would do.” The responses of patients are compared with the normed response of controls. 
Patients with lesions to the prefrontal cortex have greater difficulty with this type of task than would be anticipated by their neuropsychological testing profile. However, the interpretation of these data is complicated by the fact that real world problems and neuropsychological test batteries differ on many dimensions (including task structure, which is discussed later), not just the presence of real world knowledge. Ideally, what is required is a task that can be undertaken in a condition where the subject’s real world knowledge is very relevant and another condition where it is less relevant. I’m unaware of such a manipulation in the neuropsychology literature.
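As a supplement to the lawyer-engineer problem introduced earlier in this section, the following minimal sketch shows how the base rates enter a standard Bayesian calculation. The function name and the likelihood ratio parameter are assumptions added for illustration; the problem as stated gives only the base rates (15 engineers, 85 lawyers) and does not say how diagnostic the description is.

```python
def posterior_engineer(n_engineers, n_lawyers, likelihood_ratio=1.0):
    """Probability that Jack is an engineer, given the description.

    likelihood_ratio is P(description | engineer) / P(description | lawyer).
    A value of 1.0 treats the description as uninformative, so the answer
    is just the base rate. Any other value is a hypothetical assumption.
    """
    prior_odds = n_engineers / n_lawyers
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Base rates alone: the description carries no weight.
print(round(posterior_engineer(15, 85), 3))

# Hypothetical: the description is four times more typical of engineers.
print(round(posterior_engineer(15, 85, likelihood_ratio=4.0), 3))

# Ratio at which engineer and lawyer become equally likely.
print(round(85 / 15, 2))
```

Under these assumptions the description would have to be more than about 5.7 times as likely for an engineer as for a lawyer before "engineer" became the more probable answer. A response driven by representativeness alone in effect ignores the prior odds term.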
CONFLICT DETECTION/INHIBITION

A natural consequence of dual systems/processes is the potential for conflict and facilitation between the two
systems. For example, in the case of logical reasoning, given arguments such as:

All apples are fruit.
All fruit are poisonous.
∴ All apples are poisonous.

there is a conflict between the validity of the argument and the truth of the conclusion (valid argument but false conclusion). A robust consequence of the content effect is that subjects perform better on reasoning tasks when the logical conclusion is consistent with their beliefs about the world than when it is inconsistent with their beliefs (Evans et al., 1983; Wilkins, 1928). A very similar situation arises in many decision-making tasks. Consider the base rate fallacy task. The base rates point to one response (85% chance that Jack is a lawyer) even though the description of Jack is more prototypical of an engineer. This generates a conflict that the subject must recognize and resolve. Is the description sufficiently poignant/salient to overcome the odds in this particular instance?

Figure 21.3 The rectangle and polygon task. Panels: Original rectangle; Lesser area = Same perimeter (Incongruent trial); Greater area = Greater perimeter (Congruent trial). Note: In this task, subjects are shown a rectangle followed by a polygon, generated by a modification to the rectangle. They are asked to compare the perimeters of the two figures and determine whether the second is larger than the first. The area of the second figure also changes with the perimeter. This change in area seems to be a salient cue for most subjects. When the perimeter changes in the same direction as the area (congruent trials), subjects perform very well on the task. However, when the perimeter does not change in the same direction as the area (incongruent trials), task performance suffers.

The rectangle and polygon task (Stavy, Goel, Critchley, & Dolan, 2006) provides an example from the problem-solving literature. Subjects are shown a rectangle followed by a polygon derived from the rectangle by a minor modification (see Figure 21.3). They are asked to compare the
perimeters of the two figures and determine whether the second is larger than the first. In some trials (congruent condition), the perimeter and area change in the same direction (i.e., both increase or decrease as a result of the modification). In other trials (incongruent condition), the area changes but the perimeter stays the same (e.g., when a small square is removed from the upper right hand corner of the rectangle). Young adults accurately respond to the congruent trials but many (46%) claim that the perimeter of the derived polygon in the incongruent condition is smaller than that of the original rectangle (Stavy et al., 2006). They explain this response in terms such as “a corner has been taken away,” suggesting they’re using a strategy that might be referred to as “more A (area) = more B (perimeter).” The data suggest that they do the task by attending to both the area and perimeter of the rectangle. But for most subjects, the area seems to be the more salient feature. In the congruent condition, both processing streams result in the same response. In the incongruent condition, a conflict arises between the responses generated by processing the area and the perimeter. To generate a correct response in this condition, the conflict must be detected and the salient response based on the area must be inhibited. The neural basis of conflict detection has been extensively explored within the reasoning domain. Within inhibitory belief trials, the prepotent response is the incorrect response associated with belief-bias (e.g., all children are nasty; all nasty people should be punished; therefore all children should be punished). Incorrect responses in such trials indicate that subjects failed to detect the conflict between their beliefs and the logical inference and/or inhibit the prepotent response associated with the belief-bias.
These belief-biased responses activate the ventral medial prefrontal cortex (BA 11, 32), highlighting its role in nonlogical, belief-based responses (Goel & Dolan, 2003). The correct response indicates that subjects detected the conflict between their beliefs and the logical inference, inhibited the prepotent response associated with the belief-bias, and engaged the formal reasoning mechanism. The detection of this conflict requires engagement of the right lateral/dorsal lateral prefrontal cortex (BA 45, 46; see Figure 21.4; Goel, Buchel, et al., 2000; Goel & Dolan, 2003; Prado & Noveck, 2007). This conflict detection role of the right lateral/dorsal prefrontal cortex is a generalized phenomenon that has been documented in a wide range of paradigms in the cognitive neuroscience literature (Fink et al., 1999; Picton, Stuss, Shallice, Alexander, & Gillingham, 2006; Vallesi, Mussoni, et al., 2007; Vallesi, Shallice, & Walsh, 2007).

Figure 21.4 Conflict detection system. Note: The right lateral/dorsal lateral prefrontal cortex (BA 45, 46) is activated during conflict detection. For example, in the following argument “All apples are red fruit; all red fruit are poisonous; all apples are poisonous” the correct logical answer is “valid”/“true” but the conclusion is inconsistent with our world knowledge, resulting in a belief-logic conflict. From “Explaining Modulation of Reasoning by Belief,” by V. Goel and R. J. Dolan, 2003, Cognition, 87, p. B18. Reprinted with permission.

One demonstration of this system with lesion data was carried out by Caramazza, Gordon, Zurif, and DeLuca (1976) using simple two-term reasoning problems such as the following: “Mike is taller than George”; who is taller? They reported that left hemisphere patients were impaired in
all forms of the problem but, consistent with imaging data (Goel, Buchel, et al., 2000; Goel & Dolan, 2003), right hemisphere patients were only impaired when the form of the question was incongruent with the premise (e.g., who is shorter?). The involvement of the right lateral/dorsolateral prefrontal cortex in conflict detection in decision-making tasks is illustrated by De Neys, Vartanian, et al. (2008). They scanned normal healthy volunteers with fMRI while participants engaged in the lawyer-engineer type base rate problems (introduced above). As in the reasoning paradigm, activation of right lateral prefrontal cortex was evident when participants inhibited the stereotypical heuristic responses and correctly completed the decision-making task. In terms of problem-solving tasks, there are several relevant examples one can choose from, though the results are more mixed. Above we introduced the rectangle and polygon task. Neuroimaging studies of this task (Stavy et al., 2006) found activation in bilateral prefrontal cortex in the incongruent condition compared to the congruent condition, where the conflict between two strategies needs to be detected and overcome. Reverberi, Lavaroni, Gigli, Skrap, and Shallice (2005) provided a second example of the role of the right prefrontal cortex in conflict detection. They carried out a revised version of the Brixton Task with neurological patients with focal lesions. In the first half of this task subjects are presented with a series of cards, one at a time. Each card contains a 2 × 5 matrix of numbered circles. One circle on each card is colored blue, the others are white. The position of the blue circle moves from card to card following one of seven rules. The rule is switched every five to seven cards without warning. Upon being presented with a card the subject’s task is to indicate the position of the blue circle on the next card,
thus indicating their ability to induce the current rule. The second half of the task is similar to the first, except for the following important differences: (a) rules stay active for 6 to 10 trials and (b) before the end of the particular series of rules an interfering rule is introduced. This consists of the sequence of four cards from the first part (only they contain red-filled circles rather than blue ones). These four cards follow a previously presented rule, but differ from the current rule, thus introducing a conflict between the interfering rule and the previously active rule. This conflict must be detected, the interfering rule inhibited, and the response generated based on the active rule. They report that while patients with lesions to the left prefrontal cortex show an impairment in rule induction, patients with lesions to the right prefrontal cortex are impaired specifically in the rule-conflict condition.
ILL-STRUCTURED AND WELL-STRUCTURED SITUATIONS

The issue of ill-structured and well-structured task environments or situations has been a crucial point of debate and contention in the problem-solving literature for 40 years. The distinction originates with Reitman (1964) who classified problems based on the distribution of information within the three components (start state, goal state, and the transformation function) of a problem vector. Problems where the information content of each of the vector components is absent or incomplete are said to be ill-structured. To the extent the information is completely specified, the problem is well-structured. A mundane example of an ill-structured problem is provided by the task of planning a meal for a guest. The start state is the current state of affairs. While some of the salient facts are apparent, it is not clear that all the relevant aspects can be immediately specified or determined (e.g., How hungry will they be? How much time and effort do I want to expend?). The goal state, while clear in the broadest sense (i.e., have a successful meal), cannot be fully articulated (e.g., How much do I care about impressing the guest? Should there be 3 or 4 courses? Would salmon be appropriate? Would they prefer a barbecue or an indoor meal?). And finally, the transformation function is also incompletely specified (e.g., Should I have the meal catered, prepare it myself, or ask everyone to bring a dish? If I prepare it, should I use fresh or frozen salmon?).

Figure 21.5 Tower of Hanoi Task. Panels: Start; Goal (disks 1, 2, and 3 on pegs A, B, and C). Note: The Tower of Hanoi puzzle consists of three pegs and several disks of varying size. Given a start state, in which the disks are stacked on one or more pegs, the task is to reach a goal state in which the disks are stacked in descending order on a specified peg. There are three constraints on the transformation of the start state into the goal state. (1) Only one disk may be moved at a time. (2) Any disk not being currently moved must remain on the pegs. (3) A larger disk may not be placed on a smaller disk.

Well-structured problems are characterized by the presence of information in each of the components of the problem vector. The Tower of Hanoi (see Figure 21.5) provides a relevant example (Goel & Grafman, 1995). The start state is completely specified (e.g., the disks are stacked in
descending order on peg 1). There is a clearly defined test for the goal state (e.g., stack the disks in descending order on peg 3). The transformation function is restricted to moving disks within the following constraints: (1) Only one disk may be moved at a time. (2) Any disk not being currently moved must remain on a peg. (3) A larger disk may not be placed on a smaller disk. Goel (1995) has extended Reitman’s original characterization along a number of dimensions and articulated the cognitive consequences of these differences. In particular, it has been argued that qualitatively different cognitive and computational machinery is required to deal with ill-structured and well-structured situations/problems (Goel, 1995). Contrary to this position, others have argued that there are no qualitative differences between ill- and well-structured problem situations and that the information processing theory machinery developed to deal with well-structured problems can also account for ill-structured problems (Simon, 1973). The neuropsychological data, however, support a distinction. These issues can also arise in the reasoning and decision-making literature, though they have not garnered equivalent attention in these domains. The most natural place for ill-structured situations in the reasoning literature is in induction tasks, where, by definition, the information provided in the premises always underdetermines the conclusion. However, subjects often make assumptions that eliminate uncertainty from the conclusion. For example, consider the following argument: “Sand can be red; the planet Mars is red; the sand on Mars is red.” In pilot studies, some subjects confidently responded “no” to this argument. When asked to explain, they made responses such as “there is no sand on Mars.” Deductive reasoning (when undertaken by nonexperts) is a prototypical example of a well-structured task. However, there are certain logical forms that result in indeterminate conclusions.
For example, given A > B and A > C, what is the relationship between B and C? Technically, this is not an ill-structured inference. Any proposed relationship between B and C is undetermined and therefore
any proposed conclusion is invalid. However, it may not be construed as such by subjects. Cognitive theories of reasoning do not make much of this and treat these indeterminate forms in the same way as determinate forms (A > B; B > C; A > C). The neuropsychological data discussed below suggest otherwise. In terms of tasks from the decision-making literature, elements of ill-structured situations will arise where the information is inconclusive. For example, a base rate problem with the base rate of 50:50 and ambiguous or neutral descriptions would result in an ill-structured problem. To my knowledge, these types of conditions have not been explored in the decision-making literature. There is an interesting puzzle in the neuropsychology literature that can be explained in terms of the different cognitive resources required to deal with ill-structured and well-structured problems (Goel, 1995). A subset of patients with frontal lobe lesions perform very well on neuropsychological test batteries (including IQ and memory measures) but encounter serious problems in coping with real life situations (see Goel & Grafman, 2000; Eslinger & Damasio, 1985; Shallice & Burgess, 1991; among others). Different explanations have been offered for the phenomenon. Damasio (1994) argues that the cause of this difficulty is the patient’s inability to inform cognitive processes by visceral, noncognitive factors. Grafman’s (1989) underlying intuition, already mentioned previously, is that the crucial issue is patients’ inability to perform in routine, over-learned situations. His structured-event complex (SEC) theory proposes that much of our world knowledge is stored in scriptlike data structures and frontal lobe patients have difficulty in accessing/retrieving these structures. Shallice (1988) suggests that the key deficit in frontal lobe lesion patients is dealing with task novelty.
The idea is that there is a built-in contention scheduler that determines responses in over-learned, routine situations. However, when the organism is confronted with a novel situation, the contention scheduler is unable to cope. At this point, control passes to the more sophisticated supervisory attentional system (SAS), which is damaged in frontal lobe patients, thus rendering them incapable of coping with novel situations. Goel (2002; Goel, Grafman, et al., 1997) has argued that neuropsychological test batteries contain largely well-structured problems while problems encountered in real life situations contain both ill-structured and well-structured components. Given that different cognitive mechanisms are required to deal with the two situations, there may be an anatomical dissociation corresponding to the cognitive and computational dissociations. In particular, I am suggesting that when the task environment contains either facilitative patterns (real or imaginary) that can be locked onto and extrapolated for successful
solution, or at least does not contain built-in hindrances to pattern extraction, the left prefrontal cortex may be necessary and sufficient for task solution. However, in cases where the start state pattern obstructs/hinders or totally underspecifies a solution path through the problem space, the left hemisphere interpreter may prematurely lock on to erroneous solutions. In such situations, the right prefrontal cortex plays a necessary role in generating possibilities that can aid in navigating through the problem space. It does so by supporting the encoding and processing of ill-structured representations that facilitate lateral transformations (Goel, 1995). An apt example of the patient profile under discussion is provided by Goel and Grafman’s (2000) patient PF. PF was an accomplished professional architect with a right prefrontal cortex lesion. This patient scored 128 on the WAIS-R, but was simply unable to cope in the world. At age 56, he found himself unemployable and living at home with his mother. Because the patient was an architect, a task that required him to develop a new design for a lab space was administered. His performance was compared to two age- and education-matched controls (an architect and a lawyer). The patient had superior memory and IQ and understood the task, and even observed that “this is a very simple problem.” His sophisticated architectural knowledge base was still intact and he used it quite skillfully during the problem-structuring phase.
However, the patient’s problem-solving behavior differed from the controls’ behavior in the following ways: (a) he had difficulty in making the transition from problem structuring to problem solving; (b) as a result the preliminary planning phase did not start until two-thirds of the way into the session; (c) when it did occur it was minimal and erratic, consisting of three independently generated fragments; (d) there was no progression or lateral development of these fragments; (e) there was no carryover of abstract information into the preliminary planning or later phases; and (f) the patient did not make it to the detailing phase. This suggests that the key to understanding this patient’s deficit is to understand the cognitive processes and mechanisms involved in the preliminary (ill-structured) planning phase of the task.
Another relevant example comes from the predicaments task (Channon & Crawford, 1999). Channon and Crawford presented subjects (patients with anterior lesions, patients with posterior lesions, and normal controls) with stories of everyday awkward situations or predicaments such as the following:
Anne is in her office when Tony comes in. She asks how he is, and he says he is all right, but tired. She agrees that he looks tired, and asks what is the matter. He has new neighbors who moved into the flat above his a couple weeks ago. They are nice people, but they own dogs and keep them in their kitchen at night, which is directly above Tony’s bedroom. All night, and every night since they moved in, the dogs jump around and bark. He finds it impossible to get to sleep. He says he has had a word with the neighbors, and although they were very reasonable, they said they had nowhere else to put the dogs as it is a block of flats.
Subjects were required to generate solutions to these scenarios. Even though this may be an “everyday” situation, it is very clearly an ill-structured situation. Subjects also carried out more abstract neuropsychological tests that would satisfy the definition of well-structured problems. Patients as a group were impaired relative to the normal controls in both the everyday predicaments task and the more abstract neuropsychological tests. Patients with anterior lesions were impaired in more aspects of the predicaments task than the posterior patients.
Ill-structured problems do not easily lend themselves to the technical constraints of brain imaging studies. To get around this difficulty, Goel and colleagues tried to simulate specific aspects of ill-structured problems within well-structured problems. In one such attempt, Vartanian and Goel (2005) manipulated the constraints on the search space of an anagram task. On unconstrained trials, subjects were required to rearrange letters to generate solutions (e.g., generate a word from IKFEN). On semantically constrained trials, they were required to rearrange letters to generate solutions within particular semantic categories (e.g., generate a word for a kitchen utensil from IKFEN). On baseline trials, they rearranged letters to make specific words (e.g., generate the word KNIFE from IKFEN). The critical comparison of unconstrained versus semantically constrained trials revealed significant activation in areas including the right ventral lateral prefrontal cortex (see Figure 21.6), left superior frontal gyrus, frontal pole, right superior parietal lobe, right postcentral gyrus, and the occipital-parietal sulcus. They argued that the activation in the right ventral lateral prefrontal cortex is related to hypothesis generation in unconstrained settings, whereas activation in other structures is related to additional semantic retrieval, semantic categorization, and cognitive monitoring processes. These results extend the lesion data by demonstrating that an absence of constraints on the solution space is sufficient to engage the right ventral lateral prefrontal cortex in hypothesis generation, even in a linguistic task.
Figure 21.6 Right ventral lateral prefrontal cortex is activated by underconstrained situations. Note: A: Hypothesis generation in unconstrained anagram trials is associated with significant activation in the right ventral lateral prefrontal cortex (BA 47). B: Furthermore, activation in the right ventral lateral prefrontal cortex increases as a function of decreasing constraints on the problem space in the anagram task (contrast estimate at (30, 34, −20) for unconstrained, semantically constrained, and baseline trials). From “Task Constraints Modulate Activation in Right Ventral Lateral Prefrontal Cortex,” by O. Vartanian and V. Goel, 2005, Neuroimage, 27, p. 931. Reprinted with permission.
As noted, while deductive reasoning is probably the prototypical example of a well-structured task for most people, indeterminate trials do allow for situations of incomplete information. Goel, Tierney, et al. (2007) tested neurological patients with focal unilateral frontal lobe lesions on a transitive inference task while systematically manipulating completeness of information regarding the status of the conclusion (i.e., determinate and indeterminate trials). The results demonstrated a double dissociation such that patients with left prefrontal cortex lesions were selectively impaired in trials with complete information (i.e., determinate trials such as A > B, B > C, A > C; and A > B, B > C, C > A), while patients with right prefrontal cortex lesions were selectively impaired in trials with incomplete information (i.e., indeterminate trials; A > B, A > C, B > C) (see Figure 21.7). These findings are very similar to those of the problem-solving tasks. While it is possible to have ill-structured situations within decision-making task paradigms (see above), I am unaware of any studies that have addressed this issue.
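The distinction between determinate and indeterminate trials in the transitive inference task can be made concrete with a small sketch. The Python below is illustrative only and is not the procedure used by Goel, Tierney, et al. (2007). The function names `transitive_closure` and `classify_trial`, and the encoding of each premise as a pair `(x, y)` read as "x > y", are assumptions introduced here. A trial counts as determinate when the premises settle the conclusion one way or the other, and as indeterminate when they leave it open.

```python
from itertools import product


def transitive_closure(premises):
    """Return every (x, y) pair, read as x > y, that the premises imply."""
    closure = set(premises)
    changed = True
    while changed:
        changed = False
        # Chain x > y and y > z into x > z until nothing new appears.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def classify_trial(premises, conclusion):
    """Label a trial by whether the premises settle the conclusion."""
    implied = transitive_closure(premises)
    x, y = conclusion
    if (x, y) in implied:
        return "determinate (valid)"
    if (y, x) in implied:
        return "determinate (invalid)"
    return "indeterminate"


if __name__ == "__main__":
    # The three trial types quoted in the text.
    print(classify_trial([("A", "B"), ("B", "C")], ("A", "C")))
    print(classify_trial([("A", "B"), ("B", "C")], ("C", "A")))
    print(classify_trial([("A", "B"), ("A", "C")], ("B", "C")))
```

On this reading, the first two trials carry complete information (the conclusion is either entailed or contradicted), whereas in the third the premises order A above both B and C but say nothing about B relative to C.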
Figure 21.7 Double dissociation between systems for processing certain and uncertain information. Note: A: Lesion overlay maps (transverse slices, R = L), displaying left and right prefrontal cortex lesions. B: Performance of left and right PFC patients on determinate and indeterminate trials: accuracy scores on three-term transitive reasoning for normal controls, patients with left PFC lesions, and patients with right PFC lesions. A Lesion (right prefrontal cortex, left prefrontal cortex, normal controls) by Determinacy (determinate, indeterminate) interaction shows a crossover double dissociation in the performance of left and right prefrontal cortex patients in determinate and indeterminate trials. From “Hemispheric Specialization in Human Prefrontal Cortex for Resolving Certain and Uncertain Inferences,” by V. Goel, Tierney, et al., 2007, Cerebral Cortex, 17, p. 2246. Reprinted with permission.
SUMMARY
In this chapter, I provide a selective, conceptually motivated review of cognitive neuroscience’s contributions to the understanding of thought processes (i.e., reasoning, problem solving, and decision making). The strategy has been to select three issues (heuristic versus formal processes, conflict detection/inhibition, and ill-structured and well-structured task situations) that have played important roles in the development of cognitive theories of thinking processes and suggest that these behavioral/functional distinctions correspond to distinctions in the underlying neural machinery. The exercise is valuable for at least three reasons:
1. The identification of dissociations corresponding to functional/behavioral distinctions reinforces those distinctions and provides support for cognitive theories that respect them.
2. The fact that the dissociations involved similar anatomical structures in reasoning, problem-solving, and decision-making tasks (see Table 21.1) suggests a degree of underlying similarity or unity in these task domains at the anatomical level that is often ignored at the cognitive level.
3. The strategy of identifying dissociations may help to provide much needed mid-level constructs for our theories of thinking.
Current theories of thinking operate at the level of either phenomenological descriptions or computational descriptions. A well-known example of the latter was introduced above in terms of the problem space construct (Newell & Simon, 1972). While this provides some critical constraints in terms of theoretical vocabulary, short-term memory limitations, and sequential processing, it is essentially a Turing machine-level description. What is missing from our theories are mid-level constructs that connect the phenomenological description to the Turing machine-level description. The dissociations that have been identified by lesion and neuroimaging studies (namely a formal pattern matcher, a content-sensitive pattern matcher, the conflict detection system, and a system for maintaining uncertain information) are good candidates for these mid-level systems or constructs. The first two systems may be part of Gazzaniga’s “left hemisphere interpreter” (Gazzaniga, 1985, 2000). The function of the “interpreter” is to make sense of the world by completing patterns (i.e., filling in the gaps in the available information). I suspect that this system does not care whether the pattern is logical, causal, social, statistical, and so on. It simply abhors uncertainty and will complete any pattern, often prematurely, to the detriment of the organism. The roles of the conflict detection and uncertainty maintenance systems are, respectively, to detect conflicts in patterns and actively maintain representations of indeterminate/ambiguous situations and bring them to the attention of the “interpreter.” While there is considerable evidence for the existence of such systems, their time course of processing and interactions are largely unknown. One account of how these systems may interact in the case of logical reasoning appears in Goel (2008).
TABLE 21.1 Summary of studies and areas of brain activation discussed in the chapter, organized by task domain (reasoning, decision making, and problem solving) and issues of interest (familiarity, conflict detection, and task structure).
Familiarity
Domain & Studies
Method
Heuristic
Formal
PL
Task Structure
Conflict Detection
Complete Information
Incomplete Information
Left PFC
Right PFC
Reasoning
Goel et al. (2000)
fMRI
Left F-TL
Goel, Shuren, et al. (2004)
lesion
Left PFC
Goel, Makale & Grafman (2004)
fMRI
Left F-TL
Goel et al. (2000)
fMRI
PL Right PFC
Goel and Dolan (2003)
fMRI
Right PFC
Caramazza et al. (1976)
lesion
Right TL
Goel et al. (2006)
lesion
Decision-Making
De Neys & Goel (in press)
fMRI
De Neys, Vartanian & Goel (2008)
fMRI
Left F-TL Right PFC
N/A
Problem Solving
Sirigu et al. (1995)
lesion
PFC
Stavy et al. (2006)
fMRI
Right PFC
Reverberi et al. (2005)
lesion
Right PFC
Goel & Grafman (2000)
lesion
Channon & Crawford (1999)
lesion
Vartanian & Goel (2005)
fMRI
Right PFC PFC Left PFC
Right VLPFC
Abbreviations: F-TL = frontal-temporal lobes; PFC = prefrontal cortex; TL = temporal lobes; VLPFC = ventral lateral prefrontal cortex; PL = parietal lobes
APPENDIX A: ROLE OF NEUROPSYCHOLOGICAL DATA IN INFORMING COGNITIVE THEORIES
Although few cognitive psychologists today question the value of neuroimaging and lesion data, there is still a lack of consensus as to their role in informing cognitive theory. They have at least two immediate roles: localization of functions and dissociation of functions. Arguably the latter is much more important than the former.
Localization of Brain Functions
It is now generally accepted that there is a degree of modularity in aspects of brain organization. Over the years, neuropsychologists and neuroscientists have accumulated some knowledge of this organization. For example, we know some brain regions are involved in processing language while other regions process visual spatial information. Finding selective involvement of these regions in complex cognitive tasks, like reasoning, can help us differentiate between competing cognitive theories that make different claims about linguistic and visuo-spatial processes (as do mental logic and mental model theories in the case of reasoning). However, we also know that for much of the brain there is at least a one-to-many mapping from brain structures to cognitive processes (and probably a many-to-many mapping), which undercuts much of the utility of localization. Despite this caveat, localization seems to loom large in the literature.
Dissociation of Brain Functions
Brain lesions result in selective impairment of behavior. Such selective impairments are called dissociations. A single dissociation occurs when we find a case of a lesion in region x resulting in a deficit of function a but not function b. If we find another case, in which a lesion in region y results in a deficit in function b but not in function a, then we have a double dissociation. The most famous example of a double dissociation comes from the domain of language. In the 1860s, Paul Broca described patients with lesions to the left posterior inferior frontal lobe who had difficulties in the production of speech but were quite capable of speech comprehension. This is a case of a single dissociation. In the 1870s, Carl Wernicke described two patients (with lesions to the posterior regions of the superior temporal gyrus) who had difficulty in speech comprehension, but were quite fluent in speech production. Jointly the two observations indicate a double dissociation and tell us something important about the causal independence of language production and comprehension systems. If this characterization is accurate (and there are now some questions about its accuracy), it tells us that any cognitive theory of speech production and comprehension needs to postulate two distinct functions/mechanisms. Recurrent patterns of double dissociation provide an indication of causal joints in the cognitive system that are invisible in uninterrupted normal behavioral measures (Shallice, 1988). Double dissociations manifest themselves as crossover interactions in neuroimaging studies. Thus, if in the case of reasoning, decision making, and problem solving, we find double dissociations along the lines of familiarity/unfamiliarity, conflict/agreement, and certainty/uncertainty, cognitive theories will need to take these dissociations into consideration. Indeed, some neuropsychologists have argued that it really does not matter where the lesions are in patients (or where the activations are in neuroimaging studies), but only that there are double dissociations. While this is an extreme position, it is not without some merit.
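The definitions of single and double dissociation given above are essentially logical conditions on a pattern of deficits, so they can be written out as a minimal sketch. The Python below is an illustration under stated assumptions, not an analysis method from the chapter: the function names, the dictionary layout (region, then function, then whether it is impaired), and the accuracy numbers in `HYPOTHETICAL_ACCURACY` are all invented here. The deficit pattern in `LANGUAGE_DEFICITS` simply restates the Broca and Wernicke observations as described in the text.

```python
def single_dissociation(deficits, region, impaired_fn, spared_fn):
    """True if a lesion in `region` impairs one function but spares the other."""
    return deficits[region][impaired_fn] and not deficits[region][spared_fn]


def double_dissociation(deficits, region_x, region_y, fn_a, fn_b):
    """True if region_x impairs only fn_a and region_y impairs only fn_b."""
    return (single_dissociation(deficits, region_x, fn_a, fn_b)
            and single_dissociation(deficits, region_y, fn_b, fn_a))


def is_crossover(scores, group_1, group_2, cond_a, cond_b):
    """True if the ordering of the two groups reverses between conditions."""
    diff_a = scores[group_1][cond_a] - scores[group_2][cond_a]
    diff_b = scores[group_1][cond_b] - scores[group_2][cond_b]
    return diff_a * diff_b < 0


# Deficit pattern restating the Broca/Wernicke description (True = impaired).
LANGUAGE_DEFICITS = {
    "left posterior inferior frontal": {"production": True, "comprehension": False},
    "posterior superior temporal": {"production": False, "comprehension": True},
}

# Hypothetical accuracy values, invented purely to illustrate a crossover.
HYPOTHETICAL_ACCURACY = {
    "left PFC": {"determinate": 60, "indeterminate": 80},
    "right PFC": {"determinate": 80, "indeterminate": 60},
}

if __name__ == "__main__":
    print(double_dissociation(
        LANGUAGE_DEFICITS,
        "left posterior inferior frontal", "posterior superior temporal",
        "production", "comprehension"))
    print(is_crossover(HYPOTHETICAL_ACCURACY, "left PFC", "right PFC",
                       "determinate", "indeterminate"))
```

Note that neither check refers to where the regions are, only to the pattern of spared and impaired functions, which is the sense in which the argument for dissociations can be made independently of localization.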
REFERENCES
Acuna, B. D., Eliassen, J. C., Donoghue, J. P., & Sanes, J. N. (2002). Frontal and parietal lobe activation during transitive inference in humans. Cerebral Cortex, 12, 1312–1321.
Alivisatos, B., & Milner, B. (1989). Effects of frontal or temporal lobectomy on the use of advance information in a choice reaction time task. Neuropsychologia, 27, 495–504.
Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50(1–3), 7–15.
Bechara, A., Damasio, H., Tranel, D., & Damasio, A. R. (2005). The Iowa gambling task and the somatic marker hypothesis: Some questions and answers. Trends in Cognitive Sciences, 9, 159–162.
Caramazza, A., Gordon, J., Zurif, E. B., & DeLuca, D. (1976). Right-hemispheric damage and verbal problem solving behavior. Brain and Language, 3, 41–46.
Cardoso, J., & Parks, R. W. (1998). Neural network modeling of executive functioning with the tower of Hanoi in frontal lobe-lesioned patients. In R. W. Parks, D. S. Levine, & D. L. Long (Eds.), Fundamentals of neural network modeling: Neuropsychology and cognitive neuroscience. Cambridge, MA: MIT Press.
Carlin, D., Bonerba, J., Phipps, M., Alexander, G., Shapiro, M., & Grafman, J. (2000). Planning impairments in frontal lobe dementia and frontal lobe lesion patients. Neuropsychologia, 38, 655–665.
Channon, S., & Crawford, S. (1999). Problem-solving in real-life-type situations: The effects of anterior and posterior lesions on performance. Neuropsychologia, 37, 757–770.
Christoff, K., Prabhakaran, V., Dorfman, J., Zhao, Z., Kroger, J. K., Holyoak, K. J., et al. (2001). Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage, 14, 1136–1149.
Colvin, M. K., Dunbar, K., & Grafman, J. (2001). The effects of frontal lobe lesions on goal achievement in the water jug task. Journal of Cognitive Neuroscience, 13, 1129–1147.
Corkin, S. (1965). Tactually-guided maze learning in man: Effects of unilateral cortical excisions and bilateral hippocampal lesions. Neuropsychologia, 3, 339–351.
Damasio, A. R. (1994). Descartes’ error. New York: Avon Books.
De Neys, W., & Goel, V. (in press). Heuristics and biases in the brain: Dual neural pathways for decision making. In O. Vartanian & D. R. Mandel (Eds.), Neuroscience of decision making. New York: Psychology Press.
De Neys, W., Vartanian, O., & Goel, V. (2008). Smarter than we think: When our brain detects we’re wrong. Psychological Science, 19, 483–489.
Diamond, A. (1990). Inhibition and executive control: Developmental time course in human infants and infant monkeys, and the neural bases of, inhibitory control in reaching. In A. Diamond (Ed.), Annals of the New York Academy of Sciences: Vol. 608. The Development and Neural Bases of Higher Cognitive Functions (Part 7, pp. 637–676). New York: New York Academy of Sciences.
Eslinger, P. J., & Damasio, A. R. (1985). Severe disturbance of higher cognition after frontal lobe ablation: Patient EVR. Neurology, 35, 1731–1741.
Evans, J. S. B. T. (2003). In two minds: Dual-process accounts of reasoning. Trends in Cognitive Sciences, 7, 454–459.
Evans, J. S. B. T., Barston, J., & Pollard, P. (1983). On the conflict between logic and belief in syllogistic reasoning. Memory and Cognition, 11, 295–306.
Evans, J. S. B. T., Newstead, S. E., & Byrne, R. M. J. (1993). Human reasoning: The psychology of deduction. Hillsdale, NJ: Erlbaum.
Evans, J. S. B. T., & Over, D. E. (1996). Rationality and reasoning. New York: Psychology Press.
Fellows, L. K., & Farah, M. J. (2003). Ventromedial frontal cortex mediates affective shifting in humans: Evidence from a reversal learning paradigm. Brain, 126, 1830–1837.
Fincham, J. M., Carter, C. S., van Veen, V., Stenger, V. A., & Anderson, J. R. (2002). Neural mechanisms of planning: A computational analysis using event-related fMRI. Proceedings of the National Academy of Sciences, USA, 99, 3346–3351.
Fink, G. R., Marshall, J. C., Halligan, P. W., Frith, C. D., Driver, J., Frackowiak, R. S., et al. (1999). The neural consequences of conflict between intention and the senses. Brain, 122(Pt. 3), 497–512.
Fletcher, P. C., Happe, F., Frith, U., Baker, S. C., Dolan, R. J., Frackowiak, R. S. J., et al. (1995). Other minds in the brain: A functional imaging study of “theory of mind” in story comprehension. Cognition, 57, 109–128.
Gazzaniga, M. S. (1985). The social brain. New York: Basic Books.
Gazzaniga, M. S. (2000). Cerebral specialization and interhemispheric communication: Does the corpus callosum enable the human condition? Brain, 123(Pt. 7), 1293–1326.
Gilovich, T., & Kahneman, D. (Eds.). (2002). Heuristics and biases: The psychology of intuitive judgment. New York: Cambridge University Press.
Glimcher, P. (2002). Decisions, decisions, decisions: Choosing a biological science of choice. Neuron, 36, 323–332.
Kahneman, D., & Tversky, A. (1996). On the reality of cognitive illusions. Psychological Review, 103, 582–591.
Goel, V. (1995). Sketches of thought. Cambridge, MA: MIT Press.
Knauff, M., Fangmeier, T., Ruff, C. C., & Johnson-Laird, P. N. (2003). Reasoning, models, and images: Behavioral measures and cortical activity. Journal of Cognitive Neuroscience, 15, 559–573.
Goel, V. (2002). Planning: Neural and psychological. In L. Nadel (Ed.), Encyclopedia of cognitive science (pp. 697–703). New York: Macmillan. Goel, V. (2005). Cognitive neuroscience of deductive reasoning. In K. Holyoak & R. Morrison (Eds.), Cambridge handbook of thinking and reasoning (pp. 475–492). Cambridge: Cambridge University Press. Goel, V. (2007). Anatomy of deductive reasoning. Trends in Cognitive Sciences, 11, 435–441. Goel, V. (2008). Fractionating the system of deductive reasoning. In E. Pöppel, B. Gulyas, & E. Kraft (Eds.), The neural correlates of thinking. New York: Springer Science. Goel, V., Buchel, C., Frith, C., & Dolan, R. J. (2000). Dissociation of mechanisms underlying syllogistic reasoning. NeuroImage, 12, 504–514. Goel, V., & Dolan, R. J. (2003). Explaining modulation of reasoning by belief. Cognition, 87, B11–B22. Goel, V., Gold, B., Kapur, S., & Houle, S. (1997). The seats of reason: A localization study of deductive and inductive reasoning using PET (O15) blood flow technique. NeuroReport, 8, 1305–1310. Goel, V., & Grafman, J. (1995). Are frontal lobes implicated in “planning” functions: Interpreting data from the tower of Hanoi. Neuropsychologia, 33, 623–642. Goel, V., & Grafman, J. (2000). The role of the right prefrontal cortex in ill-structured problem solving. Cognitive Neuropsychology, 17, 415–436. Goel, V., Grafman, J., Sadato, N., & Hallet, M. (1995). Modelling other minds. NeuroReport, 6, 1741–1746. Goel, V., Grafman, J., Tajik, J., Gana, S., & Danto, D. (1997). A study of the performance of patients with frontal lobe lesions in a financial planning task. Brain, 120, 1805–1822.
Manes, F., Sahakian, B., Clark, L., Rogers, R., Antoun, N., Aitken, M., et al. (2002). Decision-making processes following damage to the prefrontal cortex. Brain, 125(Pt. 3), 624–639. McCarthy, R. A., & Warrington, E. K. (1990). Cognitive neuropsychology: A clinical introduction. New York: Academic Press. Milner, B. (1963). Effects of different brain lesions on card sorting: The role of the frontal lobes. Archives of Neurology, 9, 100–110. Monti, M. M., Osherson, D. N., Martinez, M. J., & Parsons, L. M. (2007). Functional neuroanatomy of deductive inference: A language-independent distributed network. NeuroImage, 37, 1005–1016. Morris, R. G., Miotto, E. C., Feigenbaum, J. D., Bullock, P., & Polkey, C. E. (1997). The effect of goal-subgoal conflict on planning ability after frontal- and temporal-lobe lesions in humans. Neuropsychologia, 35, 1147–1157. Newell, A. (1990). Unified theories of cognition. Cambridge, MA: Harvard University Press. Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall. Newman, S. D., Carpenter, P. A., Varma, S., & Just, M. A. (2003). Frontal and parietal participation in problem solving in the Tower of London: fMRI and computational modeling of planning and high-level perception. Neuropsychologia, 41, 1668–1682. Noveck, I. A., Goel, V., & Smith, K. W. (2004). The neural basis of conditional reasoning with arbitrary content. Cortex, 40(4–5), 613–622.
Goel, V., Makale, M., & Grafman, J. (2004). The hippocampal system mediates logical reasoning about familiar spatial environments. Journal of Cognitive Neuroscience, 16, 654–664.
Osherson, D., Perani, D., Cappa, S., Schnur, T., Grassi, F., & Fazio, F. (1998). Distinct brain loci in deductive versus probabilistic reasoning. Neuropsychologia, 36, 369–376.
Goel, V., Pullara, S. D., & Grafman, J. (2001). A computational model of frontal lobe dysfunction: Working memory and the tower of Hanoi. Cognitive Science, 25, 287–313.
Owen, A. M., Doyon, J., Petrides, M., & Evans, A. C. (1996). Planning and spatial working memory: A positron emission tomography study in humans. European Journal of Neuroscience, 8, 353–364.
Goel, V., Shuren, J., Sheesley, L., & Grafman, J. (2004). Asymmetrical involvement of frontal lobes in social reasoning. Brain, 127(Pt. 4), 783–790.
Parsons, L. M., & Osherson, D. (2001). New evidence for distinct right and left brain systems for deductive versus probabilistic reasoning. Cerebral Cortex, 11, 954–965.
Goel, V., Tierney, M., Sheesley, L., Bartolo, A., Vartanian, O., & Grafman, J. (2007). Hemispheric specialization in human prefrontal cortex for resolving certain and uncertain inferences. Cerebral Cortex, 17, 2245–2250.
Paulus, M. P., Hozack, N., Zauscher, B., McDowell, J. E., Frank, L., Brown, G. G., et al. (2001). Prefrontal, parietal, and temporal cortex networks underlie decision-making in the presence of uncertainty. NeuroImage, 13, 91–100.
Goldberg, E., & Bilder, R. M. (1987). The frontal lobes and hierarchical organization of cognitive control. In E. Perecman (Ed.), The frontal lobes revisited. New York: The Psychology Press.
Perret, E. (1974). The left frontal lobe of man and the suppression of habitual responses in verbal categorical behaviour. Neuropsychologia, 12, 323–374.
Grafman, J. (1989). Plans, actions, and mental sets: Managerial knowledge units in the frontal lobes. In E. Perecman (Ed.), Integrating theory and practice in clinical neuropsychology. Hillsdale, NJ: Erlbaum.
Picton, T. W., Stuss, D. T., Shallice, T., Alexander, M. P., & Gillingham, S. (2006). Keeping time: Effects of focal frontal lesions. Neuropsychologia, 44, 1195–1209.
Grafman, J. (2002). The structured event complex and the human prefrontal cortex. In D. T. Stuss & R. T. Knight (Eds.), The frontal lobes (pp. 292–310). New York: Oxford University Press.
Prado, J., & Noveck, I. A. (2007). Overcoming perceptual features in logical reasoning: A parametric functional magnetic resonance imaging study. Journal of Cognitive Neuroscience, 19, 642–657.
Houde, O., Zago, L., Mellet, E., Moutier, S., Pineau, A., Mazoyer, B., et al. (2000). Shifting from the perceptual brain to the logical brain: The neural impact of cognitive inhibition training. Journal of Cognitive Neuroscience, 12, 721–728.
Reitman, W. R. (1964). Heuristic decision procedures, open constraints, and the structure of ill-defined problems. In M. W. Shelly & G. L. Bryan (Eds.), Human judgments and optimality (pp. 282–315). New York: Wiley.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1988). Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press. Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251.
Lezak, M. D. (1995). Neuropsychological assessment (3rd ed.). New York: Oxford University Press.
Reverberi, C., Lavaroni, A., Gigli, G. L., Skrap, M., & Shallice, T. (2005). Specific impairments of rule induction in different frontal lobe subgroups. Neuropsychologia, 43, 460–472. Roberts, R. J., Hager, L. D., & Heron, C. (1994). Prefrontal cognitive processes: Working memory and inhibition in the antisaccade task. Journal of Experimental Psychology: General, 123, 374–393.
Rowe, J. B., Owen, A. M., Johnsrude, I. S., & Passingham, R. E. (2001). Imaging the mental components of a planning task. Neuropsychologia, 39, 315–327. Rylander, G. (1939). Personality changes after operations on the frontal lobes. Acta Psychiatrica et Neurologica Scandinavica, Supplementum(20). Sanfey, A. G., Rilling, J. K., Aronson, J. A., Nystrom, L. E., & Cohen, J. D. (2003, June 13). The neural basis of economic decision-making in the ultimatum game. Science, 300, 1755–1758. Shallice, T. (1982). Specific impairments of planning. Philosophical Transactions of the Royal Society of London. Series B, 298, 199–209. Shallice, T. (1988). From neuropsychology to mental structure. Cambridge, MA: Cambridge University Press. Shallice, T., & Burgess, P. W. (1991). Deficits in strategy application following frontal lobe damage in man. Brain, 114, 727–741. Shallice, T., & Evans, M. E. (1978). The involvement of the frontal lobes in cognitive estimation. Cortex, 14, 294–303. Simon, H. A. (1973). The structure of ill-structured problems. Artificial Intelligence, 4, 181–201. Simon, H. A. (1981). The sciences of the artificial (2nd ed.). Cambridge, MA: MIT Press. Simon, H. A. (1983). Reason in human affairs. Stanford, CA: Stanford University Press.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3–22. Smith, M. L., & Milner, B. (1988). Estimation of frequency of occurrence of abstract designs after frontal or temporal lobectomy. Neuropsychologia, 26, 297–306. Stavy, R., Goel, V., Critchley, H., & Dolan, R. (2006). Intuitive interference in quantitative reasoning. Brain Research, 1073, 383–388. Tranel, D., Bechara, A., & Denburg, N. L. (2002). Asymmetric functional roles of right and left ventromedial prefrontal cortices in social conduct, decision-making, and emotional processing. Cortex, 38, 589–612. Tversky, A., & Kahneman, D. (1974, September 27). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131. Vallesi, A., Mussoni, A., Mondani, M., Budai, R., Skrap, M., & Shallice, T. (2007). The neural basis of temporal preparation: Insights from brain tumor patients. Neuropsychologia, 45, 2755–2763. Vallesi, A., Shallice, T., & Walsh, V. (2007). Role of the prefrontal cortex in the foreperiod effect: TMS evidence for dual mechanisms in temporal preparation. Cerebral Cortex, 17, 466–474. Vartanian, O., & Goel, V. (2005). Task constraints modulate activation in right ventral lateral prefrontal cortex. NeuroImage, 27, 927–933. Wilkins, A. J., Shallice, T., & McCarthy, R. A. (1987). Frontal lesions and sustained attention. Neuropsychologia, 25, 359–365. Wilkins, M. C. (1928). The effect of changed material on the ability to do formal syllogistic reasoning. Archives of Psychology, 16, 5–83.
Sirigu, A., Zalla, T., Pillon, B., Grafman, J., Agid, Y., & Dubois, B. (1995). Selective impairments in managerial knowledge following pre-frontal cortex damage. Cortex, 31, 301–316.
Wolford, G., Miller, M. B., & Gazzaniga, M. (2000). The left hemisphere’s role in hypothesis formation. Journal of Neuroscience, 20, 1–4.
Sirigu, A., Zalla, T., Pillon, B., Grafman, J., Dubois, B., & Agid, Y. (1995). Planning and script analysis following prefrontal lobe lesions. Annals of the New York Academy of Sciences, 769, 277–288.
Wood, J. N., & Grafman, J. (2003). Human prefrontal cortex: Processing and representational perspectives. Nature Reviews: Neuroscience, 4, 139–147.
Chapter 22
Motor Control: Pyramidal, Extrapyramidal, and Limbic Motor Control KRISTA MCFARLAND
One of the cardinal features of behavior is that it is goal directed. Animals seek food or water; they avoid predators and explore novel environments. Humans work overtime to buy a new home, get up early to go to the gym, or put their health and happiness in jeopardy to get drugs. One of the primary goals of behavioral scientists is to understand how goal-directed behaviors are produced. The present chapter reviews the anatomical, neurobiological, and behavioral evidence regarding the neural substrates of goal-directed behavior. A necessary first step involves understanding the organization and function of motor pathways. Perhaps more importantly, it involves understanding sensory and limbic control over motor pathways. Thus, an individual’s memories, expectations, past learning, and history of reward or punishment influence how information is processed and which behaviors are generated as a result of incoming information. For this reason, this chapter describes motor pathways, but primarily focuses on the integrative aspects of motor control, that is, how cognition and motivation influence behavior. This involves examination of a motive circuit and basal ganglia within the basal forebrain. The motive circuit and basal ganglia are comprised of multiple parallel circuits, some more directly connected with limbic functions and some more directly connected with motor functions. It is hypothesized that limbic circuits are important for processing environmental stimuli, relative to an individual’s past experience and current motivational state, and transmitting this information to portions of the motor circuit, thus instigating novel and practiced adaptive motor responses. Within this framework, the neuroanatomical and neurochemical organization of the motive circuit provides the neural substrates of motivation and reinforcement and functions to elicit adaptive motor responses in the presence of motivationally significant stimuli.
With these goals in mind, we begin with a description of the pyramidal motor system and then progress to a description of motivational circuitry with an emphasis on organizational features that influence the production of goal-directed behavior. An extensive circuitry including subcortical and cortical structures (alternately termed the basal ganglia or motive circuit) has been implicated in motivational control of goal-directed behavior, and central to this circuitry are the dopaminergic neurons of the ventral midbrain. Thus, following a discussion of the anatomical and functional organization of the basal ganglia, we address the importance of dopaminergic signaling in the production of adaptive motor responses. Finally, we conclude with a brief examination of neuropsychiatric disorders that arise from disruptions of basal ganglia circuitry, including schizophrenia, Parkinson’s disease, and Huntington’s disease. The discussion emphasizes how understanding the organization of motivational circuitry can provide a framework for understanding these neuropsychiatric conditions.
PYRAMIDAL MOTOR SYSTEM
Although brain-stem activity is sufficient to regulate a number of very basic aspects of motor function, including respiration, maintenance of equilibrium, eye movements, and cardiovascular and gastrointestinal function, production of precise behaviors requires control from higher brain centers (Grillner, 1990). The pyramidal motor system provides the primary mechanism for descending cortical control of movement and is comprised of two pathways: the corticospinal and corticobulbar pathways. Corticospinal projections originate from multiple cortical areas, including primary motor cortex, premotor cortex, and sensorimotor cortex (Dum & Strick, 1991). Most axons in the corticospinal tract decussate (cross the midline) in the brain stem and descend in the lateral corticospinal tract to the contralateral spinal cord where they synapse (directly or indirectly) on neurons that control movement of the arms,
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Motor Control: Pyramidal, Extrapyramidal, and Limbic Motor Control
hands, fingers, feet, and toes. Neurons that do not decussate descend ipsilaterally in the ventral corticospinal tract where they synapse on neurons that control movement of the upper legs and trunk. Corticobulbar neurons project to the medulla and synapse on motor nuclei of cranial nerves that are critical for control of the face, neck, lips, and tongue. Thus, the cortex modulates activity of virtually all muscle groups via descending corticospinal inputs. Identification and mapping of motor cortex began in the second half of the nineteenth century. Experiments demonstrated that stimulation of the precentral gyrus in dogs and primates resulted in limb movements (Fritsch & Hitzig, 1870; Leyton & Sherrington, 1917). Later, Penfield and colleagues stimulated the precentral gyrus in human surgical patients and showed a disproportionate somatotopic map of the body (i.e., homunculus) in the cortex (Penfield & Boldrey, 1937; Penfield & Rasmussen, 1952), where specific areas of the cortex were responsible for muscle movements within corresponding body parts. This area was identified as primary motor cortex and subsequent work has demonstrated that there are many cortical areas that interact with primary motor cortex and the spinal cord, including premotor cortex and supplementary motor cortex (Dum & Strick, 1991; Picard & Strick, 1996, 2001). These cortical areas are topographically organized (He, Dum, & Strick, 1993, 1995), and each contains a separate somatotopic representation of the body (Dum & Strick, 2002; Gentilucci et al., 1989; Godschalk, Mitz, van Duin, & van der Burg, 1995; Mitz & Wise, 1987). Despite similarities in organization and spinal connectivity, different cortical motor areas have dissociable effects on behavior. 
In general, electrical stimulation of primary motor cortex causes individual muscles to contract, whereas stimulation of premotor or supplementary motor areas causes coordinated groups of muscles to contract, suggesting that these areas might be critical for coordination of muscle groups in the performance of behavior. The results from lesion studies suggest that a significant function of primary motor cortex is independent movement of the fingers that are critical for grasping objects (Baker, Spinks, Jackson, & Lemon, 2001; Brinkman & Kuypers, 1973; Lemon, Mantel, & Muir, 1986; Passingham, Perry, & Wilkinson, 1978, 1983). This idea is consistent with the observations from comparative anatomy studies that show a relationship between the development of precision grip and the size of the corticospinal tract (Heffner & Masterton, 1983; Nudo & Masterton, 1990). As previously mentioned, nonprimary motor areas seem to be important for coordinating movements of groups of muscles. As such, they participate in numerous functions, including preparation for specific tasks (e.g., positioning arms so the hands are oriented properly for a specific
task), postural adjustments, positional movements of the head and eyes, and coordinating bilateral movements. Notably, part of the premotor cortex has been implicated in learning associations between stimuli and movements. Functional disturbance of this area (e.g., lesions, pharmacological manipulations, or transcranial magnetic stimulation) disrupts the ability to learn or use previously learned relationships between environmental cues and motor responses (Chouinard, Leonard, & Paus, 2005; Halsband & Passingham, 1982, 1985; Kurata & Hoffman, 1994). The different roles of various areas of motor cortex in modulating behavioral output are presumably related to their differential connectivity. Each cortical motor area receives a unique set of inputs from parietal (sensory and association) and prefrontal cortices, and participates in different loops with the basal ganglia (Alexander & Crutcher, 1990; Dum & Strick, 1991, 1993; Graybiel, 1991; Hoover & Strick, 1993), a topic that is explored in greater detail later in this chapter.
LIMBIC MOTOR CONTROL: THE MOTIVE CIRCUIT AND BASAL GANGLIA
The previous section described the motor pathways important for controlling behavior. Activation of motor cortex and supplementary motor areas causes muscles to contract and consequently produce movement. However, of critical importance for understanding behavior is appreciating that behavior occurs within a particular context. Why does a particular behavior occur at a particular time? Out of all possible behaviors that could occur, how does a hungry individual successfully obtain food? How does an animal successfully avoid a predator, even in a novel environment? The answer to these questions lies in the ability of an animal to integrate information regarding its internal environment (e.g., hunger or thirst) with its expectations about the outcome of a particular behavior in the presence of available contextual cues. Thus, when entering a dark room, a person might flip the light switch, even if the light bulb burned out the day before, based on the expectation that flipping a light switch produces light. The integration of sensory and motivational information with previous learning and memory is the function of an extensive circuitry including subcortical and cortical structures. Examination of these structures has arisen from two different scientific perspectives: one focused on the role of the dorsal striatum (and its associated structures) in the control of movement, and another focused on the ventral striatum (and its associated structures) in reinforcement and reward. The dorsal striatum has been studied as part of basal ganglia circuitry,
which also includes the substantia nigra, globus pallidus, and subthalamic nucleus. The ventral striatum has been studied as part of the “motive circuit,” which is comprised of corresponding ventral components of many of the same nuclei, including the ventral striatum (nucleus accumbens), ventral tegmental area, ventral pallidum, and medial dorsal nucleus of the thalamus. Although these circuits have historically been studied independently, what follows is an attempt to integrate what is known about their structure and function, focusing on the parallels between the two, in order to ascribe function to the circuitry, as a whole.
Figure 22.1 labels: (A) inputs (cortex/amygdala/hippocampus); DS; VS; GP; VP; VTA; SN; output (thalamus/cortex). (B) NAs; NAc; VPm; VPl; VTA.
Anatomy of the Motive Circuit
Early biological studies of motivated behavior focused on the importance of a few individual nuclei in the production of adaptive motor responses. The amygdala and nucleus accumbens (ventral striatum) garnered particular attention. Early work demonstrated that lesions of the amygdala produced changes in emotionality (Kluver & Bucy, 1939) and deficits in the ability to identify and respond appropriately to biologically significant stimuli. For example, animals with amygdala lesions were shown to have impaired avoidance responding to stimuli that signaled shock (Weiskrantz, 1956), as well as impaired visual discrimination for food reward (Jones & Mishkin, 1972). These studies, when combined with anatomical studies demonstrating connectivity of the amygdala with sensory, autonomic, and motor structures (Aggleton, Burton, & Passingham, 1980; Herzog & Van Hoesen, 1976; Nauta, 1961), suggested a critical modulatory function in regulating motivated behavior. Olds and Milner (1954) discovered that animals would work for electrical stimulation of the medial forebrain bundle. Later demonstrations that this effect was largely due to stimulation of dopaminergic afferents to the nucleus accumbens (NA; Corbett & Wise, 1980; Fibiger & Phillips, 1986), coupled with emerging evidence that the neurochemical mode of action of many drugs of abuse depended on the accumbens (Kornetsky & Esposito, 1979; Wise, 1982), indicated a central role in goal-directed behavior. Behavioral evidence, considered in conjunction with the connectivity of the NA with limbic and cortical structures, suggested a probable role in behavior elicited by at least positive motivational stimuli. In his now classic formulation, Mogenson moved beyond the single nucleus approach to suggest that the amygdala and NA form part of a circuit that functions to integrate limbic information and elicit appropriate behavioral responses (Mogenson, Jones, & Yim, 1980).
Within this “motive circuit,” the NA was proposed to serve as a functional interface between the limbic system (including the amygdala) and motor output structures. Heimer
Figure 22.1 Topographical organization of subcortical projections in motivational circuitry.
formalized the anatomical interrelationships between the amygdala and NA in motivation as the “extended amygdala” (Heimer et al., 1997; Heimer, Alheid, & Zahm, 1993). This interconnected series of nuclei, including the central nucleus of the amygdala, bed nucleus of the stria terminalis, medial ventral pallidum (VP), and ventromedial NA, is hypothesized to be a prime contributor of emotional context.
Subcortical Circuitry
Central to the organization of the motive circuit is a trio of interconnected nuclei that form parallel loops through the midbrain, striatum, and pallidum (see Figure 22.1A). An important organizational feature of these nuclei is the precise topographical projections they display. Thus, a group of neurons in the striatum that receives a projection from neurons of the mesencephalon sends a reciprocal projection to these same neurons. Further, these striatal neurons share a reciprocal projection with neurons in the pallidum that share a reciprocal projection with the same region of the mesencephalon to which the striatal neurons project. In this manner, parallel circuits for processing of motivational information are formed. The motive circuit contains the ventral portions of these nuclei and can be further subdivided into two subcircuits that are parallel (see Figure 22.1B), but have different input and output structures. Figure 22.2 depicts connectivity of the motive circuit. The mesolimbic dopamine system projects from the ventral tegmental area (VTA) to the NA in the ventral striatum (Beckstead, 1979; Beckstead, Domesick, & Nauta, 1979;
Figure 22.2 The connectivity of the motive circuit. Labels: cortical and allocortical input (PFCd, PFCv, amygdala, hippocampus); MD; VPm; VPl; NAs; NAc; VTA; SN. Legend: motor subcircuit, limbic subcircuit, transition; glutamate, GABA, dopamine.
Fallon & Moore, 1978), although up to 20% of the pathway contains γ-aminobutyric acid (GABA; Carr & Sesack, 2000b). The VTA innervates and receives innervation from ventromedial portions of the NA, termed the shell (NAs; Heimer, Zahm, Churchill, Kalivas, & Wohltmann, 1991; Swanson, 1982). Accumbal projections to the VTA contain GABA, dynorphin, and substance P (Kalivas, Churchill, & Klitenick, 1993; Zahm & Heimer, 1990). Although the VTA also innervates the medial portions of the accumbens, termed the core (NAc), reciprocal projections with the NAc arise from the substantia nigra (SN; Heimer et al., 1991), which is a component of the basal ganglia and will be discussed in further detail later. The parallel topography of these subcortical loops is maintained in efferent projections from the NA to the VP. Thus, the NAs projects to ventromedial portions of the VP (VPm), while the NAc projects primarily to the dorsolateral, subcommissural VP (VPl; Zahm, 1989; Zahm & Heimer, 1990). The striatopallidal projections contain GABA, enkephalin, substance P, and neurotensin (Churchill, Dilts, & Kalivas, 1990; Napier et al., 1995; Olive, Bertolucci, Evans, & Maidment, 1995; Zahm & Heimer, 1988), while reciprocal projections are primarily GABAergic (Churchill & Kalivas, 1994). Like the NA, the VP exhibits mediolateral topography in its innervation of the mesencephalon, with the VPm providing GABAergic innervation of the VTA and the VPl innervating the SN (Kalivas et al., 1993; Zahm & Heimer, 1990). However, reciprocal innervation of the VP is not as discrete. The VTA projects to both the VPm and the VPl, while the SN shows little, if any, innervation of the VP (Klitenick, Deutch, Churchill, & Kalivas, 1992).
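The pattern of reciprocal and one-way projections just described can be written down as a small directed graph, which makes the asymmetries easier to see. The following is only an illustrative sketch: the dictionary layout and the helper names (`PROJECTIONS`, `is_reciprocal`, `one_way_links`) are assumptions introduced here, and the pallidal projections back to the accumbens are assumed to follow the same shell/core topography as the forward projections.

```python
# Projections among the ventral nuclei, as summarized in the text.
# Keys are source nuclei; values are the nuclei they innervate.
# Assumption: pallido-accumbal return projections are topographic
# (VPm -> NAs, VPl -> NAc), mirroring the forward projections.
PROJECTIONS = {
    "VTA": {"NAs", "NAc", "VPm", "VPl"},
    "SN": {"NAc"},            # SN shows little, if any, innervation of VP
    "NAs": {"VTA", "VPm"},
    "NAc": {"SN", "VPl"},
    "VPm": {"VTA", "NAs"},
    "VPl": {"SN", "NAc"},
}


def is_reciprocal(a, b):
    """Return True if a projects to b and b projects back to a."""
    return b in PROJECTIONS.get(a, set()) and a in PROJECTIONS.get(b, set())


def one_way_links():
    """List (source, target) pairs that lack a return projection."""
    return sorted(
        (src, dst)
        for src, targets in PROJECTIONS.items()
        for dst in targets
        if src not in PROJECTIONS.get(dst, set())
    )


if __name__ == "__main__":
    print(is_reciprocal("VTA", "NAs"))  # shell loop is closed
    print(one_way_links())              # links without a return projection
```

Under these assumptions the one-way links are the VTA projections to the NAc and VPl and the VPl projection to the SN, which is the sense in which the VTA can influence both subcircuits while receiving input mainly from the limbic one.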
Afferent innervation of the subcortical circuit arises from a number of sources. Primary among these is the medial prefrontal cortex that maintains topographic connectivity with both the VTA and the NA, and thus forms an extension of the parallel subcortical circuitry. Both the dorsal (mPFCd) and ventral (mPFCv) prefrontal cortices receive mesocortical dopamine projections from the VTA (Fuxe, Hokfelt, Johansson, Lidbrink, & Ljungdahl, 1974), which like the mesolimbic system has a significant (up to 40%) GABAergic component (Carr & Sesack, 2000a). Additionally, both the mPFCd and the mPFCv send reciprocal projections back to the VTA, and the mPFCd also sends a projection to SN (Beckstead, 1979; Sesack, Deutch, Roth, & Bunney, 1989). Whereas the mPFCv and mPFCd show some degree of mediolateral topography in corticofugal innervation of the VTA, the mPFC displays very discrete target specificity in its innervation of the NA. The dorsal mPFCd projects selectively to the NAc and the mPFCv projects to the NAs (Berendse, Galis-de Graaf, & Groenewegen, 1992; Sesack et al., 1989). The main transmitter of the efferent mPFC projection is glutamate (Christie, Summers, Stephenson, Cook, & Beart, 1987; Fonnum, Storm-Mathisen, & Divac, 1981; Ray, Russchen, Fuller, & Price, 1992; Spencer, 1976). 
In addition to excitatory glutamatergic afferents arising from the mPFC, the ventral striatum also receives prominent afferents from allocortical regions, including the basolateral amygdala and hippocampus (Groenewegen, Vermeulen-Van der Zee, te Kortschot, & Witter, 1987; Kelley & Domesick, 1982; Kelley, Domesick, & Nauta, 1982; Wright, Beijer, & Groenewegen, 1996), as well as nonisocortical limbic regions, including entorhinal cortex, anterior cingulate cortex, orbitofrontal cortex, and insula (Chikama, McFarland, Amaral, & Haber, 1997; Ferry, Ongur, An, & Price, 2000; Haber, Kunishio, Mizobuchi, & Lynd-Balta, 1995), all of which have been termed the limbic lobe (Heimer & Van Hoesen, 2006). In addition to cortical innervation, the VTA and substantia nigra receive innervation from the central nucleus of the amygdala (Fudge & Haber, 2000). This is thought to be a primary source of enkephalin and neurotensin innervation to the VTA. Likewise, the VTA has a substantive dopaminergic and GABAergic projection to the central nucleus. As part of the extended amygdala, the central nucleus also provides innervation to the VPm and NAs (Heimer et al., 1993). Thus, the ventral striatum is closely related to areas of the brain that are critical for emotion, memory, and contextual relevance. This connectivity is hypothesized to confer the ability to choose and mobilize adaptive motor responses in the context of cues in the environment that are predictive of important outcomes.
Motor and Limbic Subcircuits
An inspection of Figure 22.2 and the preceding description reveals that there are two subcircuits that comprise the larger motive circuit, one comprised of predominantly limbic-related structures and one comprised primarily of motor-related structures. Thus, the VTA, NAs, and VPm are associated with limbic structures like the mPFCv, basolateral amygdala, hippocampus, and bed nucleus of the stria terminalis. In fact, recent conceptualizations of the ventral forebrain suggest that the centromedial amygdala, sublenticular VP, bed nucleus of the stria terminalis, and NAs form a continuous network termed the extended amygdala (Alheid, 2003; De Olmos & Heimer, 1999). Conversely, the mPFCd, NAc, and VPl form contacts with motor structures, like the SN, motor cortex, pedunculopontine nucleus, and subthalamic nucleus. Consistent with the idea that there are two distinct circuits that run through the ventral striatum, the NAc and NAs accumbal subregions can be distinguished on the basis of connectivity, histochemistry, and behavior (Heimer et al., 1997; Kelley, 2004; Meredith, Baldo, Andrezjewski, & Kelley, 2008; Voorn, Vanderschuren, Groenewegen, Robbins, & Pennartz, 2004; Zahm, 1999; Zahm & Brog, 1992). Thus, converging evidence suggests that two relatively closed-loop systems within the larger motive circuit separately integrate motor and limbic information (Kalivas, Churchill, & Romanides, 1999). For an individual to effectively integrate incoming motivational stimuli and emit appropriate behavioral responses, there must be interplay between the motor and limbic systems. Transfer of information among various loops within motivational circuitry is discussed next.
Anatomy of Basal Ganglia Circuitry
The basal ganglia consist of several nuclei, including the substantia nigra, dorsal striatum, globus pallidus, and subthalamic nucleus. Notably, these include the more dorsal aspects of the nuclei involved in the motive circuit (see Figure 22.1A). In fact, the motive circuit described previously can be viewed as the “limbic” loop of an extensive circuitry that progresses into “association” and “sensory motor” loops in these more dorsal regions of the basal ganglia. The current review does not cover in detail all aspects of basal ganglia organization and function (a topic to which many books have been devoted), but instead focuses on parallels between the basal ganglia and motive circuit, in an attempt to provide an integrated framework for thinking about the production of adaptive motor responses. Figure 22.3 depicts basal ganglia circuitry implicated in the control of motivated behavior, integrating aspects of the motive circuit described previously.
Figure 22.3 The major projections within the basal ganglia that contribute to the control of motivated behavior. Labels: cortex (limbic, association, motor); DS; VS; GPe; STN; SNr/GPi; VP; SNc; VTA; thalamus (VA/VL/MD); brain stem (PPN/SC); spinal cord; motor response; direct pathway; indirect pathway; pyramidal pathway.
Subcortical Circuitry
As with the motive circuit, topographical, reciprocal connections among three nuclei are central to basal ganglia circuitry: the substantia nigra, dorsal striatum, and globus pallidus. The nigrostriatal dopamine system arises in the substantia nigra pars compacta (SNc) and terminates in the dorsal striatum (including both the caudate and putamen) and the globus pallidus (Anaya-Martinez, Martinez-Marcos, Martinez-Fong, Aceves, & Erlij, 2006; Fallon & Moore, 1978; Hedreen & DeLong, 1991; Lavoie, Smith, & Parent, 1989; Lynd-Balta & Haber, 1994a; Szabo, 1980). The dorsal striatum, in turn, sends highly topographically organized projections back to both the SNc and the substantia nigra pars reticulata (SNr), as well as to both the internal and external divisions of the globus pallidus (GPi and GPe, respectively; Haber, Groenewegen, Grove, & Nauta, 1985; Hazrati & Parent, 1992; Hedreen & DeLong, 1991; Parent, Bouchard, & Smith, 1984). Finally, the SNr, SNc, and dorsal striatum all receive GABAergic projections from the globus pallidus (Beckstead, 1983; Bevan, Smith, & Bolam, 1996; Kita, Tokuno, & Nambu, 1999; Rajakumar, Elisevich, & Flumerfelt, 1994; Smith & Bolam, 1990).
Cortical Inputs
The striatum is the major input structure of the basal ganglia. It receives dense, topographically organized projections
from extensive parts of the cerebral cortex. The corticostriatal projection imposes a functional organization on the striatum that is largely maintained in striatal projections to the pallidum. Thus, the striatum has been subdivided based on the cortical area from which it receives a projection into at least three distinct areas: limbic, associative, and sensorimotor (Alexander & Crutcher, 1990; Alexander, DeLong, & Strick, 1986; Haber, 2003; Joel & Weiner, 1997; Parent & Hazrati, 1995; Tisch, Silberstein, Limousin-Dowsey, & Jahanshai, 2004). The limbic striatum is largely synonymous with the nucleus accumbens, although the most ventral parts of the caudate and putamen and the olfactory tubercle can also be deemed limbic striatum. As previously described, limbic striatum receives projections from limbic cortical and allocortical areas including the medial prefrontal cortex, amygdala, and hippocampus, as well as nonisocortical areas like the anterior cingulate and orbitofrontal cortex. These areas have been implicated in numerous functions including reward-based learning, stimulus-guided behavior, expression of emotion, and control of impulsive behavior (Apicella, 2002; Badre & Wagner, 2004; Botvinick, Nystrom, Fissell, Carter, & Cohen, 1999; Floresco, Magyar, Ghods-Sharifi, Vexelman, & Tse, 2006; Holland & Gallagher, 2004; Hollerman, Tremblay, & Schultz, 2000; Laurens, Kiehl, & Liddle, 2005; Rolls & Baylis, 1994; Schultz, Tremblay, & Hollerman, 2000; Vertes, 2006; Woodward, Chang, Janak, Azarov, & Anstrom, 1999). The associative striatum includes the putamen rostral to the anterior commissure and most of the caudate, except a small portion near the internal capsule.
Associative striatum receives projections primarily from the dorsolateral prefrontal cortex that has been implicated in executive functions like attentional control in working memory, set shifting, and strategic planning (Arikuni & Kubota, 1986; Fuster, 2000a, 2000b; Goldman & Nauta, 1977; Goldman-Rakic, 1996; Mitchell, Rhodes, Pine, & Blair, 2008; Novais-Santos et al., 2007). Sensorimotor striatum includes the dorsolateral putamen (caudal to the anterior commissure) and the dorsolateral rim of the head of the caudate. These areas receive projections from motor cortices, including primary motor cortex, supplementary motor cortex, and premotor cortices, which have been functionally implicated in the control of motor behavior, and from sensory cortices (Flaherty & Graybiel, 1994, 1995; Kunzle, 1975; Liles & Updyke, 1985; N. R. McFarland & Haber, 2000). Data suggest involvement of these projections in sensorimotor control and motor planning (Alexander & DeLong, 1985; Bailey & Mair, 2006; Boecker et al., 1998; DeLong, Alexander, Mitchell, & Richardson, 1986; Hikosaka, 2007; Lehericy et al., 2006). Thus, projections from the cortex to the striatum form a functional gradient from limbic through associative to sensory and motor
(i.e., from emotional to motor control) along the ventromedial to dorsolateral axis of the striatum. These projections are largely topographically segregated, thus parallel streams of information seem to enter the striatum. Cortical and allocortical projections to the striatum are excitatory, glutamatergic projections. Thus, it seems that the cortex is situated to stimulate behavior, while striatal circuitry is positioned to modulate cortical output. Notably, striatal circuitry also projects back to the cortex. This arrangement allows for environmentally important stimuli to guide ongoing behavior and provide a feedback mechanism for learning about the outcome of goal-directed behavior. Another important organizational feature of the circuitry is extensive cortico-cortical interactions, allowing integration to occur among various functional circuits. For example, the dorsolateral PFC is interconnected with orbital and medial prefrontal cortices (Barbas, 2000; Passingham, Stephan, & Kotter, 2002; Petrides & Pandya, 1999), as are somatosensory and premotor cortices (Carmichael & Price, 1995; Conde, Maire-Lepoivre, Audinat, & Crepel, 1995). Similarly, somatosensory and motor cortices are highly interconnected (Cipolloni & Pandya, 1999; Morecraft, Cipolloni, Stilwell-Morecraft, Gedney, & Pandya, 2004; Simonyan & Jurgens, 2002). These connections are probably crucial for smooth behavior because they would allow integration of information about behavioral outcomes and emotional and environmental context and memory.
FLOW OF INFORMATION THROUGH MOTIVATIONAL CIRCUITRY
Striatal Efferents: Direct versus Indirect Pathways
GABAergic medium spiny neurons make up approximately 95% of striatal neurons and are the main recipient of cortical and allocortical input to the striatum. Striatal projections are topographically organized, thus largely maintaining the functional organization of the striatum in the output nuclei (Hedreen & DeLong, 1991; Lynd-Balta & Haber, 1994b; Parent et al., 1984; Parent & Hazrati, 1994). These projections form two efferent pathways from the striatum to the substantia nigra and pallidum. These are sometimes called the striatonigral and striatopallidal projections, but more frequently are referred to as the “direct” and “indirect” pathways (see Figure 22.3; Albin, Young, & Penney, 1989; Bolam, Hanley, Booth, & Bevan, 2000; DeLong, 1990). The pathways differ in projection targets, expression of dopamine (DA) receptors, and peptide content. The direct pathway projects monosynaptically to the SNr/GPi, while the indirect pathway from the dorsal striatum projects through
the GPe to the subthalamic nucleus before projecting on to the SNr/GPi (Gerfen, 1992; Kawaguchi, Wilson, & Emson, 1990; Parent & Hazrati, 1995). The indirect pathway from the ventral striatum projects to the ventral pallidum (Zahm, 1989; Zahm & Heimer, 1990). Neurons of the direct pathway contain dynorphin and substance P, and mainly express DA D1 receptors that are positively coupled to adenylate cyclase; neurons of the indirect pathway contain enkephalin and neurotensin and express DA D2 receptors that are negatively coupled to adenylate cyclase activity (Beckstead & Kersey, 1985; Besson, Graybiel, & Quim, 1990; Fallon & Leslie, 1986; Gerfen et al., 1990; Hersch et al., 1995; Le Moine & Bloch, 1995; Stoof & Kebabian, 1981). The SNr/GPi and VP send prominent GABAergic efferent projections to the thalamus, with the VP (primarily the VPm with only minor involvement of the VPl) projecting to the mediodorsal nucleus, and the SNr/GPi projecting to the ventral anterior and ventral lateral nuclei. The thalamic nuclei, in turn, provide glutamatergic innervation of the cortex. Projections to the thalamus and cortex are functionally and topographically organized, forming the last link in cortico-striato-cortico re-entrant circuits (see Figure 22.4), where a particular region of the cortex receives projections from the same portions of the basal ganglia to which it sends projections (DeVito & Anderson, 1982; Groenewegen, 1988; Haber, Lynd-Balta, & Mitchell, 1993; Kim, Nakano, Jayaraman, & Carpenter, 1976; Kuo & Carpenter, 1973; Kuroda, Murakami, Kishi, & Price, 1995; McFarland & Haber, 2002; Mogenson, Ciriello, Garland, & Wu, 1987; Zahm, Williams, & Wohltmann, 1996). The direct and indirect pathways have opposing effects on the activity of efferent neurons in the thalamus and cortex. Striatal neurons have the electrophysiological property of being tonically active, and thus exhibit a tonic inhibitory influence on their pallidal and nigral targets.
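One way to see why the two pathways have opposing effects is to multiply the signs of the synaptic links along each route. The sketch below is a bookkeeping illustration only, not a physiological model: the names `SIGN`, `DIRECT`, `INDIRECT`, and `net_sign` are introduced here, and the link signs are assumptions drawn from the description in this section (GABAergic links are treated as inhibitory, glutamatergic links as excitatory, and the subthalamic drive onto the SNr/GPi as excitatory).

```python
from math import prod

# +1 marks an excitatory link, -1 an inhibitory (GABAergic) link.
# Assumption: signs follow the transmitter descriptions in the text;
# GPe -> STN is inhibitory and STN -> SNr/GPi is excitatory.
SIGN = {
    ("striatum", "SNr/GPi"): -1,
    ("striatum", "GPe"): -1,
    ("GPe", "STN"): -1,
    ("STN", "SNr/GPi"): +1,
    ("SNr/GPi", "thalamus"): -1,
    ("thalamus", "cortex"): +1,
}

DIRECT = ["striatum", "SNr/GPi", "thalamus", "cortex"]
INDIRECT = ["striatum", "GPe", "STN", "SNr/GPi", "thalamus", "cortex"]


def net_sign(path):
    """Multiply link signs along a path: +1 is net facilitation, -1 net inhibition."""
    return prod(SIGN[(src, dst)] for src, dst in zip(path, path[1:]))


if __name__ == "__main__":
    print("direct:", net_sign(DIRECT))      # two inhibitory links
    print("indirect:", net_sign(INDIRECT))  # three inhibitory links
```

The direct route contains an even number of inhibitory links and so nets out to facilitation of cortex (disinhibition), whereas the indirect route contains an odd number and nets out to inhibition.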
Neurons in the direct pathway inhibit SNr/GPi targets, thus disinhibiting
thalamic and cortical structures (Chevalier & Deniau, 1990). By contrast, neurons in the indirect pathway inhibit neurons of the GPe, which disinhibits the subthalamic nucleus and consequently activates the GABAergic neurons of the SNr/GPi, resulting in inhibition of thalamic and cortical targets. Thus, the balance of activity between direct and indirect pathways is a critical component for determining basal ganglia output and consequently cortical motor or cognitive activity. We return to this topic in our discussion of Parkinson’s disease, as an imbalance of activity in direct and indirect pathways is central to the disorder.
Basal Ganglia Loops: Parallel versus Integrative
The striatum, pallidum, and thalamus are connected with the frontal cortex in a series of parallel modules that maintain parallel anatomical and functional organization (see Figure 22.4), leading to the suggestion that information is processed by this circuitry via a series of parallel loops, and it is the function of these parallel circuits to regulate motivated behavior (Alexander et al., 1986; Groenewegen, Berendse, Wolters, & Lohman, 1990; Heimer, Switzer, & Van Hoesen, 1982; Middleton & Strick, 2001). However, for an individual to effectively integrate incoming motivational stimuli and emit appropriate behavioral responses there must be interplay between the motor and limbic systems. Thus, there has been an increasing appreciation of potential integrative aspects of motivational circuitry (Haber, 2003; Haber, Fudge, & McFarland, 2000; Joel & Weiner, 1994; Percheron & Filion, 1991; Zahm & Brog, 1992).
Midbrain-Striatal Interactions
Several pathways have been proposed that could subserve integrative functions within motivational circuitry, including striato-nigro-striatal projections. Midbrain dopamine neurons project to striatum and, in turn, receive striatal
Figure 22.4 The cortico-basal ganglia re-entrant circuits. Panels (A) to (D). Loops shown (cortex, striatum, pallidum/nigra, thalamus): PFC, NA, VP/SN, MD; dlPFC, DSvm, dm GPi/SN, VA; SMA, DSdl, vl GPi/SN, VL.
efferents. The midbrain-striatal projection is organized with inverse dorsal-ventral topography, such that the ventral midbrain projects to dorsal striatum and dorsal midbrain projects to ventral striatum. The NAs receives the most limited DA input, with this coming primarily from the VTA. The NAc receives dopaminergic input from the VTA and dorsomedial portions of the SNc. Associative striatum receives input from the densocellular group of the SNc and sensorimotor (dorsolateral) striatum receives input from the entire SNc, including both the densocellular group and the cell columns. Thus, along the gradient from limbic to motor striatum, there is also a gradient in the density of DA projections, with the densest projections going to motor striatum. Limbic striatum projects to the VTA and SNc; associative striatum projects to the SNc, primarily the ventral densocellular group; and sensorimotor striatum projects to the cell columns of the ventrolateral SNc (Beckstead et al., 1979; Carmona, Catalina-Herrera, & Jimenez-Castellanos, 1991; Haber, Lynd, Klein, & Groenewegen, 1990; Hedreen & DeLong, 1991; Parent & Hazrati, 1994; Selemon & Goldman-Rakic, 1990; Szabo, 1980). Loosely, there is a reciprocal, topographic organization to striato-nigro-striatal projections. However, the ventral striatum receives limited dopaminergic input, but projects to a large region. In contrast, sensorimotor striatum has limited influence on midbrain DA cells, but receives a dense dopaminergic projection. In addition, for each striatal region, there are nonreciprocal connections with midbrain. Dorsal to the reciprocal projection lies a group of cells that projects to the same striatal region, but does not receive a projection from it. Ventral to the reciprocal projection lie efferent terminals without a reciprocal projection. With this arrangement, information from limbic striatum can reach motor striatum (Haber, 2003; Haber et al., 2000).
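The one-directional character of this arrangement can be illustrated with a simple reachability check over a directed graph. The sketch below is a deliberate simplification and rests on assumptions introduced here: the midbrain is collapsed into three zones (VTA, densocellular SNc, SNc cell columns), only the projections named in this section are included, and the names `EDGES` and `reachable` are hypothetical.

```python
from collections import deque

# Directed links between striatal territories and midbrain zones.
# Assumption: a simplified three-zone midbrain; the nonreciprocal links
# are limbic striatum -> densocellular SNc and
# associative striatum -> SNc cell columns.
EDGES = {
    "limbic striatum": ["VTA", "densocellular SNc"],
    "VTA": ["limbic striatum"],
    "densocellular SNc": ["associative striatum", "sensorimotor striatum"],
    "associative striatum": ["densocellular SNc", "SNc cell columns"],
    "SNc cell columns": ["sensorimotor striatum"],
    "sensorimotor striatum": ["SNc cell columns"],
}


def reachable(start, goal):
    """Breadth-first search: can activity starting at `start` reach `goal`?"""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    print(reachable("limbic striatum", "sensorimotor striatum"))
    print(reachable("sensorimotor striatum", "limbic striatum"))
```

In this simplified graph, limbic striatum can reach sensorimotor striatum through the midbrain, but no path leads back from sensorimotor to limbic striatum, which is one way of picturing flow from limbic toward motor loops.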
More concretely, limbic striatum sends a projection to the VTA that extends beyond the portion that projects back to limbic striatum. This terminal region projects to associative striatum. Associative striatum participates in reciprocal projections with the densocellular region, but also sends a projection to more ventral regions that share reciprocal projections with motor striatum. Thus, the connections “spiral” between midbrain and striatum in a manner that allows information to move in a rectified manner from limbic to motor loops. The nonreciprocal interactions of the VTA are evident in its connections to the PFC and VP, as well as the striatum. Thus, the VTA forms reciprocal connections with both the mPFCd and the mPFCv. Additionally, although it receives a projection only from the VPm, it sends projections to both the VPm and VPl. The permissive topography of VTA efferent projections within the motive circuit positions it to
influence the activity of both the motor and limbic subcircuits of the motive circuit.

Thalamo-Cortical Interactions

Within the framework of basal ganglia loops, the thalamus is frequently described as a relay to the cortex. In reality, the thalamus participates in nonreciprocal interactions with the cortex that regulate the activity of cortical ensembles (Deschenes, Veinante, & Zhang, 1998; Jones, 1985; McFarland & Haber, 2002; Sherman & Guillery, 1996). These nonreciprocal projections are in position to integrate information across functional circuits. Although the thalamus displays reciprocal topography with cortex, closing cortico-basal ganglia-cortical segregated functional circuits, cortico-thalamic projections to the ventroanterior, ventrolateral, and mediodorsal thalamus are more extensive than thalamo-cortical projections. These extra projections are derived from areas of the cortex not innervated by that area of thalamus. For example, the ventroanterior thalamus has reciprocal projections with dorsolateral PFC (associative) and also nonreciprocal inputs from medial PFC (limbic); ventrolateral thalamus has reciprocal projections with caudal motor regions and also nonreciprocal inputs from more rostral motor regions (Haber, 2003). Thus, like striato-nigro-striatal projections, these projections mediate the flow of information from higher “association” areas to “motor” areas, allowing rectified integration of information across basal ganglia circuits. Another means of moving information from limbic to motor-related circuitry via the thalamus occurs within the motive circuit via the MD. The VP sends a prominent GABAergic efferent to the MD, with the primary contribution coming from the VPm and only minor involvement of the VPl (Mogenson et al., 1987; Vives & Mogenson, 1985; Zahm et al., 1996).
The MD does not send a reciprocal projection to the VP, but there is reciprocal glutamatergic innervation of the mPFCd (Groenewegen, 1988; Kuroda, Murakami, Shinkai, Ojima, & Kishi, 1995). Thus, the MD receives information from the limbic circuit via the VPm but sends a projection to the motor-associated mPFCd, consequently forming a bridge between limbic and motor subcircuits (see Figure 22.2). Communication between the limbic and motor basal ganglia loops is rectified to bias information flow from limbic to motor while flow in the reverse direction requires multisynaptic communication. The location of rectified information flow is strategic for the movement of information from limbic to motor, and it has been suggested that information may “spiral” outward from the more medial limbic nuclei to the more lateral and dorsal motor nuclei (Haber, 2003; Haber & Fudge, 1997; Zahm & Brog, 1992). Motivationally relevant limbic information can exit the
motive circuit to the motor system via a number of different pathways. Hence, the VPl projects to the pedunculopontine nucleus, subthalamic nucleus, and SN, and subsequently to all parts of the extrapyramidal motor system (Haber et al., 1985; Zahm, 1989). There is also a projection to motor cortex that arises from the mPFCd (Zahm & Brog, 1992). Finally, the SN receives a projection directly from the NAc. Thus, the limbic motive circuit has several conduits by which it can influence motor behavior following presentation of motivationally relevant stimuli.
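The rectified character of the striato-nigro-striatal “spiral” described above can be illustrated with a toy directed graph. The following is only a sketch: the node labels are hypothetical names chosen for illustration, each edge stands for one projection named in the text, and the graph deliberately ignores the other multisynaptic routes by which motor circuitry may influence limbic circuitry. A breadth-first search then shows that, within this simplified spiral, information can pass from limbic to motor striatum but not in the reverse direction.

```python
from collections import deque

# Toy directed graph of the midbrain-striatal "spiral".
# Node names are illustrative labels (assumptions), not anatomical terms from the text.
EDGES = {
    "limbic_striatum": ["vta_reciprocal", "midbrain_zone_to_associative"],
    "vta_reciprocal": ["limbic_striatum"],
    "midbrain_zone_to_associative": ["associative_striatum"],
    "associative_striatum": ["densocellular_reciprocal", "midbrain_zone_to_motor"],
    "densocellular_reciprocal": ["associative_striatum"],
    "midbrain_zone_to_motor": ["motor_striatum"],
    "motor_striatum": ["cell_columns_reciprocal"],
    "cell_columns_reciprocal": ["motor_striatum"],
}


def reachable(start, goal, edges=EDGES):
    """Breadth-first search: can activity starting at `start` arrive at `goal`?"""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


limbic_to_motor = reachable("limbic_striatum", "motor_striatum")
motor_to_limbic = reachable("motor_striatum", "limbic_striatum")
print(limbic_to_motor, motor_to_limbic)
```

In this sketch the asymmetry arises solely from the two nonreciprocal midbrain zones; removing either one disconnects limbic from motor striatum.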
DOPAMINE AND MOTIVATED BEHAVIOR

Although the function of dopamine in motivated behavior has been extensively reviewed by myself and others (Ettenberg, 1989; Fibiger & Phillips, 1988; McFarland & Kalivas, 2003; Robbins, 2003; Robinson & Berridge, 2000; Salamone & Correa, 2002; Schultz, 2002; Wise, 1982, 2004), its critical importance in motivational circuitry warrants discussion. There are three prominent pathways arising from the dopaminergic cell bodies of the ventral midbrain. The nigrostriatal pathway projects from the SN to the dorsal striatum; the mesolimbic pathway projects from the VTA to the NA; and the mesocortical pathway projects from the VTA to the mPFC. For several reasons, extensive efforts to understand motivated behavior have focused on the role of dopamine. Primary among these is the demonstration that animals will work to deliver stimulation to brain regions containing dopaminergic neurons (for reviews, see Fibiger & Phillips, 1988; Redgrave & Dean, 1981; Wise, 1978). This finding led to the suggestion that the dopamine system served as a brain substrate for reinforcement or reward that would serve to strengthen adaptive behaviors (i.e., those followed by a positive outcome). This notion was consistent with emerging evidence that disrupting dopamine function altered the responding of animals working for natural reinforcers, including food and water (Ettenberg & Horvitz, 1990; Gerber, Sing, & Wise, 1981; Mason, Beninger, Fibiger, & Phillips, 1980; Tombaugh, Tombaugh, & Anisman, 1979; Wise, Spindler, deWit, & Gerber, 1978; Wise, Spindler, & Legault, 1978), electrical brain stimulation (Fouriezos, Hansson, & Wise, 1978; Fouriezos & Wise, 1976; Gallistel, Boytim, Gomita, & Klebanoff, 1982; Stellar & Corbett, 1989; Stellar, Kelley, & Corbett, 1983), or drugs of abuse (Bozarth & Wise, 1981; De Wit & Wise, 1977; Lyness, Friedle, & Moore, 1979; Roberts, Corcoran, & Fibiger, 1977; Yokel & Wise, 1976).
Despite the well-documented role of DA in regulating goal-directed behavior, its specific function remains a matter for debate. Theories suggest a role for DA in everything from reward (Schultz, 1998, 2006; Wise & Rompre, 1989) to response
initiation or selection (Kelley, Baldo, Pratt, & Will, 2005; Salamone & Correa, 2002) to motivation/wanting (Robinson & Berridge, 2000). The following is an attempt to integrate what is known about dopaminergic function in order to frame its role in the production of motivated behavior. Many postulates suggest that midbrain DA neurons function in reward, indicating that they govern behavior directed toward appetitive, rather than aversive, stimuli. Such suggestions seem, at best, incomplete since DA neurons have been shown to respond to presentation of aversive, as well as appetitive, stimuli (Abercrombie, Keefe, DiFrischia, & Zigmond, 1989; Doherty & Gratton, 1992; Louilot, Le Moal, & Simon, 1986; Young, Joseph, & Gray, 1993), and DA has been shown to increase in the NA during associative learning of neutral stimuli (Young, Ahier, Upton, Joseph, & Gray, 1998). Additionally, DA receptor antagonism has been shown to disrupt learning about aversive stimuli (Salamone, 1994). Furthermore, DA neurons do not fire in a temporal pattern consistent with a role in pleasure or hedonics. Thus, once reward is expected, DA neurons have been shown to respond not to presentation of the reward itself, but instead to presentation of a stimulus that is most predictive of the reward, even before it is presented (Schultz, Apicella, & Ljungberg, 1993). For these reasons, theories of DA function that depend purely on notions of hedonics and reward have largely been dismissed.

Dopamine Mediates the Learning of Motivational Responding, but Not the Emission of Motivated Behavior

Purported roles for DA in wanting or craving, as well as in response initiation or response selection, are motivational theories of DAergic function, and are very influential in contemporary thinking about motivated behavior. They suggest that DA is involved in the energizing or directing of behavior toward the appropriate goal.
However, behavioral evidence suggests that DA receptor antagonism leaves motivational processes very much intact. For example, animals can be trained to run a straight alley when presented with an olfactory cue (S+) predictive of either food or drug reinforcement in the goal box. Following DA receptor antagonist treatment, such animals still traverse the alley normally when presented with the reinforcement-predictive cue (McFarland & Ettenberg, 1995, 1998). Furthermore, in subjects having undergone training to run an alley for heroin reinforcement and a subsequent period of extinction (with no cues or reinforcement available), haloperidol does not block the ability of the S+ to reinstate drug-seeking behavior (McFarland & Ettenberg, 1997). Additionally, the ability of an S+ conditioned
in this fashion to elicit conditioned locomotor activation, conditioned place preference, and conditioned autonomic activation remains intact during dopamine receptor antagonist treatment (Ettenberg & McFarland, 2003; McFarland & Ettenberg, 1999). Together these data strongly suggest that the motivational capacity of the S+ stimulus (i.e., its ability to activate and direct behavior) remains intact despite DA receptor blockade. Studies examining the role of conditioned stimuli in behavioral activation have produced comparable results. Horvitz and Ettenberg (1991) showed that administration of pimozide, a nonselective DA receptor antagonist, did not reduce locomotor activity in the presence of a stimulus previously paired with food delivery. This suggests that the motivational properties of food-paired stimuli are left intact. Such data are also consistent with demonstrations that environments or stimuli previously paired with amphetamine reinforcement retained their conditioned behavior-activating effects under dopamine receptor antagonist challenge (Beninger & Herz, 1986; Robbins, Cador, Taylor, & Everitt, 1989). Additionally, preferential responding on a lever associated with conditioned reinforcement is preserved following dopaminergic denervation of the ventral striatum (Everitt & Robbins, 1992; Robbins et al., 1989). Thus, it seems that the motivating capacity of reinforcement-associated cues remains intact following disruption of DA function. When subjects are actively engaged in operant responding, administration of a DA receptor antagonist produces one of two behavioral patterns. Low doses produce increases in responding similar to that seen when the reinforcer is diminished (Ettenberg, Pettit, Bloom, & Koob, 1982; Geary & Smith, 1985; Rolls et al., 1974; Schneider, Davis, Watson, & Smith, 1990).
High doses produce within-session declines in operant behavior, similar to “extinction curves” that result from removal of the reinforcer (Franklin, 1978; Franklin & McCoy, 1979; Gallistel et al., 1982; Gerber et al., 1981; Lovibond, 1980; Wise et al., 1978). The fact that, in both situations, animals will initiate responding, and do so with normal (or near normal) response latencies, suggests that the motivation of these subjects to engage in goal-oriented behavior remains intact. Franklin and McCoy (1979) trained animals to press a lever in order to receive electrical brain stimulation. They demonstrated that when pretreated with pimozide, animals showed an extinction-like pattern of responding. However, presentation of a conditioned stimulus (CS) that was previously paired with brain stimulation reward successfully reinstated operant responding. Thus, subjects maintained motivational responding to a reward-paired stimulus despite the reinforcement decrement that presumably led to the progressive decline in responding through the initial course
of the session. Similarly, Gallistel and colleagues (1982) showed that while dopamine antagonists elevated brain reward thresholds for intracranial stimulation in a runway paradigm, they did not prevent the motivational effects of “priming” stimulation that incited animals to run the alley in the first place. Taken together, these data suggest that DA receptor antagonism, while capable of blocking the ability of reinforcing stimuli to maintain responding, does not alter the motivation to seek reinforcement. Further evidence that motivational processes remain intact during DA receptor antagonism comes from choice experiments. In such experiments, subjects are allowed to choose between two alternative responses: one that leads to reinforcer delivery and one that does not. Doses of dopamine receptor antagonist drugs that are sufficient to disrupt operant response rates have little effect on response choices in lever-press (Bowers, Hamilton, Zacharko, & Anisman, 1985; Evenden & Robbins, 1983) or T-maze (Tombaugh, Szostak, & Mills, 1983) tasks. Rats still prefer to make a response that has previously led to reinforcement over one that has not, even following challenge with DA receptor antagonists. Taken together, the data described suggest that the fundamental aspects of motivation remain intact despite disruption of DA transmission. Although the midbrain DA system does not appear to signal either reward or motivation, it is clear that intact dopaminergic function is important for both the acquisition and maintenance of operant responding (for reviews see Beninger & Miller, 1998; Di Chiara, Acquas, Tanda, & Cadoni, 1993; Kiyatkin, 1995). Thus, DA must serve a function related to the learning and maintenance of motivated responding, while the emission of previously learned behavior progresses independent of dopamine receptor activation. 
Dopamine Stimulates Plasticity within Motivational Circuitry

An examination of the firing pattern of DA neurons reveals that most DA neurons display phasic activation after novel stimuli and after delivery of primary reinforcers (e.g., food). Additionally, when a biologically significant stimulus is predicted by an environmental cue, with experience DA neurons come to respond to the predictive cue, rather than to the reinforcer itself. Such changes in firing rate produce a pattern of responding whereby DA neurons increase firing to better than expected outcomes, remain unaffected by predictable outcomes and decrease firing in response to worse than expected outcomes (for a review, see Schultz, 1998). Thus, DA neurons respond to the difference between actual reward and expected reward, not the presence of reward itself. This suggests that the function of DA within
the production of goal-directed behavior is to signal the need to create an adaptive behavioral response, that is, promote neuronal plasticity. Such a suggestion is consistent with evidence regarding the anatomical location of DA synapses. Anatomical studies indicate that DA afferents are well situated to modulate or gate the probability of cells being activated (O’Donnell & Grace, 1995). Thus, DA synapses in both the mPFC and striatum tend to be located proximal to excitatory contact, with excitatory inputs forming on the head of the spine and dopamine terminals synapsing on the neck (Arbuthnott, Ingham, & Wickens, 1998; Carr & Sesack, 1999; Sesack & Pickel, 1990; Smiley & Goldman-Rakic, 1993; Yang, Seamans, & Gorelova, 1999). From a purely anatomical perspective, DA synapses seem to be poised to modulate incoming excitatory information. Ample electrophysiological data also suggest that DA is capable of modulating the efficiency of neuronal responses to other inputs, particularly to glutamate, either supporting or diminishing neuronal activity, depending on the quality of excitatory inputs received by target cells (O’Donnell & Grace, 1995). Both pyramidal cells in the mPFC and spiny cells of the accumbens have been shown to exist in a bistable state (Bazhenov, Timofeev, Steriade, & Sejnowski, 1998; O’Donnell & Grace, 1995; Timofeev, Grenier, & Steriade, 1998; Yim & Mogenson, 1988). Thus, cells fluctuate between a “down state” where membrane potential is relatively hyperpolarized and an “up state” where membrane potential is relatively depolarized. Dopamine tends to inhibit cells in the down state but excite cells in the up state (Hernandez-Lopez, Bargas, Surmeier, Reyes, & Galarraga, 1997; Kiyatkin & Rebec, 1999; O’Donnell, Greene, Pabello, Lewis, & Grace, 1999; Yang & Seamans, 1996). 
If there is more depolarizing (i.e., glutamatergic) input to a cell, DA D1 receptor activation increases the duration of depolarization by increasing a calcium conductance (Hernandez-Lopez et al., 1997). In the absence of depolarizing input, DA will support the inactive state via D2 receptor activation of potassium conductances (O’Donnell & Grace, 1996). Dopamine appears to serve a similar role within the basolateral amygdala, where there are two types of neurons, inhibitory interneurons and pyramidal-like projection neurons. Stimulation of DA receptors in the basolateral amygdala increases the firing rate of interneurons, thereby decreasing the firing rate of projection neurons. Further, DA attenuates activation of pyramidal cells in the basolateral amygdala that is elicited by electrical stimulation of the mPFC and mediodorsal nucleus of the thalamus, while potentiating the responses evoked by electrical stimulation of sensory association cortex (Rosenkranz & Grace, 1999, 2001). This organization is suggested to produce a
global filtration of inputs such that, upon presentation of an affective stimulus, there is potentiation of the strongest sensory input and concomitant dampening of cortical inhibition, thereby augmenting the response to affective stimuli. When considered as a whole, DA neurotransmission seems to increase the signal-to-noise ratio and consequently gate the flow of information within the motive circuit (Le Moal & Simon, 1991; Rosenkranz & Grace, 1999, 2001). The pattern of DA innervation of its target structures is also consistent with a general filtration and modulatory function. Dopaminergic projections to target structures are very divergent, with each axon being highly ramified (Anden, Fuxe, Hamberger, & Hokfelt, 1966; Percheron, Francois, Yelnik, & Fenelon, 1989). Nearly every striatal neuron and many cortical neurons receive dopaminergic innervation. Additionally, these neurons display homogeneous and synchronous responsivity following presentation of motivationally significant stimuli that activate DA cells. Thus, DA neurons broadcast a global wave of activity to the NA and mPFC, rather than a stimulus- or response-specific signal (Schultz, 1998). Such a pattern of responding is suited to simultaneous modulation of ongoing activity in these forebrain and allocortical structures.
Role for Dopamine-Induced Plasticity in the Acquisition of Adaptive Behavior

The data outlined previously suggest that behavioral responding to motivationally relevant stimuli proceeds in at least two phases: the acquisition of a response and the maintenance of a response. During the acquisition phase, synaptic DA is increased by presentation of primary reinforcers or novel stimuli. This DA signal can specifically strengthen those synapses receiving simultaneous excitatory glutamatergic input (e.g., corticostriatal or amygdalostriatal). In this fashion, DA would serve to facilitate the learning of adaptive behavioral responses, as well as increase access of limbic and cortical structures to the motor system. With repeated presentations of motivationally relevant stimuli (either primary or conditioned), these same excitatory inputs would be recruited and strengthened such that they no longer require dopaminergic influence to elicit motor output. Thus, the primary function of DA is to facilitate synaptic (and behavioral) plasticity, rather than to direct elicitation of motor responses. This helps explain why behavioral data show that animals do not acquire behavioral responses when DA transmission is disrupted; however, they will exhibit previously learned behaviors (see earlier discussion). This explanation is also consistent with the observation that the activity of DA neurons fails to discriminate among different salient stimuli, regardless of valence, or among different sensory modalities. Thus, DA
facilitates the learning of goal-directed responses, in general, rather than specific motor responses to specific stimuli. The involvement of DA in both the acquisition and maintenance of operant responding has been difficult to explain with a single theory of DA function. Theories emphasizing the modulatory effects of DA in learning can explain acquisition effects, but typically fail to explain effects on maintenance. Thus, if the inhibition of DA neurotransmission blocks plasticity, it should cause a kind of behavioral and neuronal inflexibility that leads to a decrease in responding for reinforcers and perseverance in previously learned behavioral patterns. However, if one remembers that both increases and decreases in firing rates of dopaminergic neurons have functional implications, then a possible explanation presents itself. As discussed earlier, increases in DA firing rates seem able to support behavioral and neuronal plasticity leading to the learning of new adaptive responses. Similarly, depressed DA transmission (like that resulting from DA receptor antagonism) provides a signal indicating a less than expected outcome. From a functional perspective, such an error signal could lead to compensatory adaptations that would weaken the strength and persistence of the preceding behavior. It seems possible that both an augmentation and a diminution in DA cell firing rates would elicit behavioral plasticity resulting in a change in behavioral output.
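The idea that both positive and negative deviations from expectation drive plasticity can be made concrete with a simple delta-rule simulation. The sketch below is not a model proposed in this chapter: it assumes a Rescorla-Wagner-style update, a hypothetical learning rate `alpha`, and a single learned value `v` standing in for response strength. The error term plays the role ascribed to phasic DA activity (positive for better than expected outcomes, near zero for predicted outcomes, negative for worse than expected outcomes). Treating DA receptor antagonism as equivalent to omission of the expected reinforcer is also an assumption, motivated by the extinction-like curves described earlier.

```python
def simulate(rewards, alpha=0.2, v0=0.0):
    """Delta-rule learning over a sequence of trial outcomes.

    rewards: outcome received on each trial (1.0 = reinforcer, 0.0 = none)
    alpha:   learning rate (hypothetical value, chosen for illustration)
    v0:      expected outcome before the first trial
    Returns (values, errors): the expectation after each trial and the
    prediction error computed on each trial.
    """
    v = v0
    values, errors = [], []
    for r in rewards:
        delta = r - v          # prediction error: actual minus expected outcome
        v += alpha * delta     # positive error strengthens, negative error weakens
        values.append(v)
        errors.append(delta)
    return values, errors


# Acquisition: the reinforcer is delivered on every trial.
acq_values, acq_errors = simulate([1.0] * 30)

# "Extinction-like" phase: the expected reinforcer is no longer effective,
# so every trial yields a worse than expected outcome.
ext_values, ext_errors = simulate([0.0] * 30, v0=acq_values[-1])

print(round(acq_errors[0], 3), round(acq_errors[-1], 3))
print(round(ext_errors[0], 3), round(ext_values[-1], 3))
```

Early in acquisition the error is large and positive, it shrinks toward zero once the outcome is predicted, and it turns negative when the expected outcome fails to occur, progressively weakening the learned value.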
NEUROPSYCHIATRIC INDICATIONS

In human patients, disruption of motivational circuitry results in a number of behavioral disorders. Primary lesions of the “motor” aspects of the circuitry have been shown to lead to disorders associated with kinetic disturbances (e.g., Parkinson’s disease), while disturbances in more “limbic” aspects of the circuitry are associated with schizophrenia, which is characterized by emotional and sensory gating difficulties. Notably, however, neuropsychiatric disorders relating to disruption of motivational circuitry are characterized by a broad range of symptoms from motor to cognitive and affective. This suggests that integrative aspects of motivational circuitry are a fundamental feature of motivated behavior and that understanding behavior (in both healthy and disease states) depends on an appreciation of both parallel and integrative processing.

Schizophrenia

Schizophrenia is a complex disorder characterized by hallucinations and delusions (positive symptoms), as well as flattened affect, apathy, and anhedonia (negative symptoms). It is now appreciated that a major determinant
of functional impairment in schizophrenic patients is their cognitive deficits, which include impairments in executive function, working memory, and attention (Breier, Schreiber, Dyer, & Pickar, 1991; Keefe et al., 1987; Revheim et al., 2006). Based on the fact that amphetamines exacerbate psychotic symptoms, while blockade of dopamine receptors alleviates schizophrenic symptoms, early theories of schizophrenia were centered on the “dopamine hypothesis” (Carlsson, 1988; Joseph, Frith, & Waddington, 1979; Meltzer & Stahl, 1976; Snyder, 1976), which suggested that schizophrenia was the result of excessive DA, triggering sensory, cognitive, and affective abnormalities. Too much dopamine was hypothesized to inhibit striatal efferents, thereby disinhibiting the thalamus (Swerdlow, Braff, Geyer, & Koob, 1986), producing an impaired ability to gate incoming sensory information and respond appropriately. There is evidence that dopamine turnover is increased in drug-naïve schizophrenic patients (Dao-Castellana et al., 1997; Hietala et al., 1994; Kumakura et al., 2007), and dopamine D2 receptor blockade remains the most clinically useful mechanism to treat psychosis (Hietala & Syvalahti, 1996). However, recent evidence converges to suggest that a purely dopaminergic explanation of schizophrenia is overly simplistic. Current theories of schizophrenia continue to identify dysfunction of frontostriatal circuitry as a core component. For example, studies suggest that schizophrenia is associated with functional and structural changes within the thalamus, striatum, and prefrontal cortex (Andrews, Wang, Csernansky, Gado, & Barch, 2006; Cullen et al., 2006; Juckel et al., 2006; Kemether et al., 2003; Lawrie, McIntosh, Hall, Owens, & Johnstone, 2008; Mitelman, Byne, Kemether, Hazlett, & Buchsbaum, 2005; Rubin et al., 1994; Schlosser et al., 2007).
However, transmitters other than DA, such as serotonin, glutamate, and GABA, have also been implicated, particularly in the PFC (Aghajanian & Marek, 2000; Carlsson, Hansson, Waters, & Carlsson, 1999; Carlsson, Waters, & Carlsson, 1999; Gray & Roth, 2007; Jentsch & Roth, 1999; Lewis, Hashimoto, & Volk, 2005; Lewis & Moghaddam, 2006; Lewis, Pierri, Volk, Melchitzky, & Woo, 1999; Weiner et al., 2001). Consistent with the importance of PFC-striatal processing in schizophrenia, animal studies have shown that prepulse inhibition of the startle response (a model for studying sensorimotor gating) depends on intact nucleus accumbens and prefrontal cortical circuitry (Bubser & Koch, 1994; Kodsi & Swerdlow, 1994; Reijmers, Vanderheyden, & Peeters, 1995; Swerdlow, Braff, Masten, & Geyer, 1990; Swerdlow et al., 1995). Taken together, the data suggest that altered processing within motivational cortico-basal ganglia-cortical circuitry is central to schizophrenia. Within this circuitry,
numerous changes have taken place, and identifying the primary deficit seems to be something of a chicken-and-egg question. However, the balance of all these neurochemical and neuroanatomical changes presumably explains the complex features of schizophrenia.

Parkinson’s Disease

Parkinson’s disease (PD) is a neurodegenerative disorder that is characterized by motor symptoms, such as resting tremor, rigidity, and bradykinesia (Carpenter, Allum, Honegger, Adkin, & Bloem, 2004; Gelb, Oliver, & Gilman, 1999; Marsden, 1994). In addition to motor disturbances, it is now recognized that PD is associated with cognitive impairment, and in extreme cases dementia (Brown & Marsden, 1984; Owen, 2004). The primary pathology in PD is degeneration of the dopamine-containing cells of the SNc that project to the striatum (Bernheimer, Birkmayer, Hornykiewicz, Jellinger, & Seitelberger, 1973; Damier, Hirsch, Agid, & Graybiel, 1999; Graybiel, Hirsch, & Agid, 1990; Hirsch, Graybiel, & Agid, 1988; Hornykiewicz & Kish, 1987), with less involvement of VTA dopamine neurons (Fearnley & Lees, 1991; Gibb & Lees, 1989). Dopaminergic degeneration is earliest and most severe in neurons projecting to the dorsolateral (motor) striatum, but progresses ventromedially through the striatum to associative and limbic areas (Damier et al., 1999; Fearnley & Lees, 1990; Graybiel et al., 1990). The principal loss of DA within the motor loop of the basal ganglia is consistent with the primary pathology consisting of motor deficits, and the degree of DA loss in the striatum has been correlated with the severity of motor symptoms.
However, PD is associated with several neuroadaptive changes throughout motivational circuitry, including supersensitivity of postsynaptic D2 DA receptors in the striatum, degeneration of serotonin and acetylcholine neurons projecting to the striatum, and loss of DA in the PFC (Birkmayer, Danielczyk, Neumayer, & Riederer, 1975; Calabresi, Mercuri, Sancesario, & Bernardi, 1993; Jellinger, 1991; Jellinger, 1998; Kostrzewa, Kostrzewa, Nowak, Kostrzewa, & Brus, 2004; Scatton, Javoy-Agid, Rouquier, Dubois, & Agid, 1983; Scatton, Rouquier, Javoy-Agid, & Agid, 1982). Thus, the complete constellation of symptoms associated with PD presumably results both from alterations in the normal flow of information through cortico-striato-cortical circuitry (resulting from striatal dopamine depletion) and from the concomitant neuroadaptations within the distributed motivational network. Loss of DA within the striatum has differential effects on different efferent striatal pathways, and it is the competing balance of these pathways that regulates thalamic and cortical
excitatory activity, and consequently behavior. We previously discussed two efferent pathways from the striatum: the direct and indirect pathways (see earlier discussion). A third “hyperdirect” pathway has also been suggested (Nambu, Tokuno, & Takada, 2002). This pathway bypasses the striatum and runs from the cortex, through the subthalamic nucleus to the GPi to thalamus and back to cortex. While activity of the direct pathway results in thalamic and cortical activation, activity of the indirect or hyperdirect pathways has the opposite effect—increased inhibition of the thalamus and less activation of cortex. Circuit-based explanations have suggested that PD results from decreased activity of the direct pathway relative to either the indirect or hyperdirect pathway, resulting in a reduction in thalamic and cortical activity (Albin et al., 1989; DeLong, 1990; Leblois, Boraud, Meissner, Bergman, & Hansel, 2006; Nambu, 2005; Nambu et al., 2002). DA, via activation of D1 receptors, activates the direct pathway, but via activation of D2 receptors inhibits activity of the indirect pathway (Gerfen, 2000; Mallet, Ballion, Le Moine, & Gonon, 2006; O’Connor, 1998). Following DA denervation, decreased dopamine stimulation of the direct pathway results in reduced GABAergic inhibition of SNr/GPi, thereby disinhibiting the GABA projection to the thalamus and resulting in decreased cortical activation by basal ganglia circuitry. DA denervation is hypothesized to have less influence on the indirect pathway and no effect on the hyperdirect pathway because it bypasses the striatum. Thus, the symptoms of PD arise from a relative increase in the influence of indirect or hyperdirect pathways on thalamic and cortical excitation. Cortical dysfunction within different functional loops is presumed to underlie the various functional deficits of PD, like akinesia and executive function deficits. 
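The opposing influence of these pathways on cortex follows from counting inhibitory links. The sketch below makes that bookkeeping explicit. It rests on several assumptions that are standard in textbook accounts but are not spelled out in this passage: each projection is coded simply as excitatory (+1) or inhibitory (-1), the indirect pathway is taken to run striatum to GPe to subthalamic nucleus (STN) to GPi/SNr, and the net effect of driving a pathway is approximated as the product of the signs along it. It is an illustration of sign logic only, not a quantitative circuit model.

```python
from math import prod

# Each step is (source, target, sign): +1 = excitatory, -1 = inhibitory (GABAergic).
# The step lists and signs are simplifying assumptions for illustration.
PATHWAYS = {
    "direct": [
        ("striatum", "GPi/SNr", -1),
        ("GPi/SNr", "thalamus", -1),
        ("thalamus", "cortex", +1),
    ],
    "indirect": [
        ("striatum", "GPe", -1),
        ("GPe", "STN", -1),
        ("STN", "GPi/SNr", +1),
        ("GPi/SNr", "thalamus", -1),
        ("thalamus", "cortex", +1),
    ],
    "hyperdirect": [
        ("cortex", "STN", +1),
        ("STN", "GPi/SNr", +1),
        ("GPi/SNr", "thalamus", -1),
        ("thalamus", "cortex", +1),
    ],
}


def net_effect_on_cortex(name):
    """Return +1 if activating the pathway's origin excites cortex, -1 if it suppresses it."""
    return prod(sign for _, _, sign in PATHWAYS[name])


for name in PATHWAYS:
    print(name, net_effect_on_cortex(name))
```

Under these assumptions the direct pathway contains two inhibitory links and therefore disinhibits the thalamus, whereas the indirect and hyperdirect pathways each contain an odd number of inhibitory links and therefore reduce thalamo-cortical drive.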
Because the loops are integrated (see earlier discussion), dysfunction of one cortico-basal ganglia-cortical loop has the potential to disrupt information processing within other loops as well. This integrative aspect of motivational circuitry explains how cognitive deficits arise, even early in the course of PD when DA denervation is restricted predominantly to motor striatum.

Huntington’s Disease

Huntington’s disease (HD) is a genetic neurodegenerative disorder that, like PD, is characterized clinically by prominent motor dysfunction. In this case, however, movements are described as jerky, random, and uncontrollable. HD is also associated with cognitive deficits including deficiencies in executive function (planning, cognitive flexibility, abstract thinking, rule acquisition, initiating appropriate actions, and inhibiting inappropriate
actions) and psychiatric symptoms, which may include blunted affect, aggression, or compulsivity (Albin et al., 1989; Folstein, Folstein, & McHugh, 1979; Martin, 1984; Martin & Gusella, 1986; Wilson & Garron, 1979). Huntington’s disease is caused by a trinucleotide (CAG) repeat expansion in the gene coding for the huntingtin protein. The exact mechanism by which the mutated form of the protein leads to Huntington’s disease is not known, but it is clear that it leads to cell death in medium spiny neurons of the striatum, although some degeneration in other brain regions, including the cortex, has also been reported (Ferrante, Kowall, Richardson, Bird, & Martin, 1986; Hedreen & Folstein, 1995; Kowall, Ferrante, & Martin, 1987; Li, 1999; Penney & Young, 1986). Striatal degeneration is progressive with respect to topography, with the earliest and most pronounced degeneration occurring in associative striatum and damage to motor and finally limbic striatum appearing in more advanced stages of the disease (Augood, Faull, Love, & Emson, 1996; Ferrante et al., 1986; Kowall et al., 1987; Vonsattel & DiFiglia, 1998). Striatal degeneration is also progressive with respect to target. Early stages of HD are characterized by loss of GABAergic projections to the GPe and SN, with GPi projections only affected in later stages of the disease (Albin et al., 1992; Albin, Reiner, Anderson, Penney, & Young, 1990; Albin, Young, et al., 1990; Pearson, Heathfield, & Reynolds, 1990; Reiner et al., 1988; Richfield & Herkenham, 1994; Sapp et al., 1995). Symptoms of HD are generally attributed to frontostriatal dysfunction resulting from striatal degeneration. Most frequently implicated is degeneration of the indirect striatal pathway to the GPe (Albin et al., 1989; Hallett, 1993; Kipps et al., 2005; Penney & Young, 1986; Wakai, Takahashi, & Hashizume, 1993).
Degeneration of the indirect pathway leads to underactivity of the subthalamic nucleus, and subsequent underactivity of the GPi/SNr, thus removing inhibitory control over the thalamus. Overactivity of the thalamus is hypothesized to result in chorea, which is the hallmark motor abnormality of HD. Chorea is described as intrusion of undesirable motor programs into the normal flow of behavior. Thus, the primary motor dysfunction in HD seems to result from a failure of striatal circuitry to appropriately inhibit undesired behaviors, resulting in jerky and undesired movements. Consistent with the finding that the initial neurodegeneration in HD occurs in associative striatum, cognitive symptoms frequently predate the appearance of motor symptoms in HD (Butters, Albert, & Sax, 1979; Kirkwood et al., 2000; Lemiere, Decruyenaere, Evers-Kiebooms, Vandenbussche, & Dom, 2004; Stout et al., 2007; Wilson & Garron, 1979). Motor symptoms in early stages (prior to degeneration of motor striatum) presumably occur via the integrative
aspects of basal ganglia circuitry, with the associative loop positioned to influence the activity of the motor loop (as discussed previously, see also Joel, 2001). In this respect, given the hypothesized overactivity of the thalamus in HD, thalamo-cortical interactions that integrate associative information into motor circuitry might be particularly important.
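The chain of reasoning given above for chorea (loss of indirect-pathway striatal output, underactivity of the subthalamic nucleus and GPi/SNr, and consequent thalamic overactivity) can be followed step by step in a small sign-propagation sketch. This is only a qualitative illustration under the classical rate-model simplification: the connection signs (striatum to GPe inhibitory, GPe to STN inhibitory, STN to GPi/SNr excitatory, GPi/SNr to thalamus inhibitory) are assumed from that model, and the names `INDIRECT_PATHWAY`, `propagate`, and `describe` are hypothetical labels introduced here, not terms from the chapter.

```python
# Qualitative sketch of the classical indirect pathway (illustrative assumption,
# not a quantitative model). Each link is (source, target, sign):
#   -1 = inhibitory projection, +1 = excitatory projection.
INDIRECT_PATHWAY = [
    ("striatum_indirect", "GPe", -1),
    ("GPe", "STN", -1),
    ("STN", "GPi_SNr", +1),
    ("GPi_SNr", "thalamus", -1),
]


def propagate(initial_change, pathway=INDIRECT_PATHWAY):
    """Propagate a qualitative activity change (+1 up, -1 down) along the chain."""
    changes = {pathway[0][0]: initial_change}
    current = initial_change
    for _source, target, sign in pathway:
        # A drop in an inhibitory input raises the target; a drop in an
        # excitatory input lowers it. Multiplying the signs captures both cases.
        current = current * sign
        changes[target] = current
    return changes


def describe(changes):
    """Translate numeric signs into the wording used in the text."""
    labels = {1: "overactive", -1: "underactive", 0: "unchanged"}
    return {node: labels[value] for node, value in changes.items()}


# Degeneration of indirect-pathway striatal neurons is modeled as reduced output (-1).
hd = describe(propagate(-1))
for node, state in hd.items():
    print(f"{node}: {state}")
```

Under these assumed signs, reduced striatal output propagates to an underactive STN and GPi/SNr and an overactive thalamus, which is the pattern the text links to chorea. The sketch ignores the direct pathway, dopaminergic modulation, and the cross-loop interactions emphasized elsewhere in the chapter.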
SUMMARY

The fact that there are separate, but interactive, subcircuits within the motivational circuitry suggests that they have separable functions in the production of goal-directed behaviors. Limbic structures, like the VTA, basolateral amygdala, hippocampus, and medial prefrontal cortex, are more intimately connected with the NA and VP, while motor structures, like the primary motor cortex, SN, and the pedunculopontine nucleus, are more intimately connected with the dorsolateral striatum and globus pallidus. In between (functionally and anatomically) lies associative circuitry. Such segregation leads to the suggestion that the limbic loop is involved in learning about motivationally relevant stimuli and subsequently integrating incoming information about such stimuli when they are presented; the associative loop is involved in executive functions like strategic planning, initiating appropriate actions, and inhibiting inappropriate actions; the motor loop is involved in sending information about well-learned responses to motor systems. The production of behavior requires activation of the motor cortex and pyramidal motor systems, which are most intimately connected with the motor loop. While the existence of parallel circuit processing within motivational circuitry is an important feature, an equally important attribute is the integrative nature of the circuitry. It is difficult to imagine that behavior could be smoothly produced in the face of changing environmental cues (either interoceptive or exteroceptive) without motivational circuitry being able to integrate environmental stimuli, relative to an individual’s past experience and current motivational state, plan, and then execute appropriate behavioral responses. It is hypothesized that this integration occurs via striatal-midbrain and cortico-thalamic interactions.
These interactions are rectified such that limbic and associative information can be transferred through the circuitry and affect ongoing motor processing and goal-directed behavior. The integrative nature of processing within motivational circuitry is evidenced by the disturbance at multiple functional levels that occurs in neuropsychiatric disorders where processing within motivational circuitry is disrupted.
REFERENCES

Abercrombie, E. D., Keefe, K. A., DiFrischia, D. S., & Zigmond, M. J. (1989). Differential effect of stress on in vivo dopamine release in striatum, nucleus accumbens, and medial frontal cortex. Journal of Neurochemistry, 52, 1655–1658.
Bailey, K. R., & Mair, R. G. (2006). The role of striatum in initiation and execution of learned action sequences in rats. Journal of Neuroscience, 26, 1016–1025.
Aggleton, J. P., Burton, M. J., & Passingham, R. E. (1980). Cortical and subcortical afferents to the amygdala of the rhesus monkey (macaca mulatta). Brain Research, 190, 347–368.
Baker, S. N., Spinks, R., Jackson, A., & Lemon, R. N. (2001). Synchronization in monkey motor cortex during a precision grip task: Pt. I. Task-dependent modulation in single-unit synchrony. Journal of Neurophysiology, 85, 869–885.
Aghajanian, G. K., & Marek, G. J. (2000). Serotonin model of schizophrenia: Emerging role of glutamate mechanisms. Brain Research: Brain Research Reviews, 31(2/3), 302–312.
Albin, R. L., Reiner, A., Anderson, K. D., Dure, L. S. T., Handelin, B., Balfour, R., et al. (1992). Preferential loss of striato-external pallidal projection neurons in presymptomatic Huntington’s disease. Annals of Neurology, 31, 425–430.
Albin, R. L., Reiner, A., Anderson, K. D., Penney, J. B., & Young, A. B. (1990). Striatal and nigral neuron subpopulations in rigid Huntington’s disease: Implications for the functional anatomy of chorea and rigidity-akinesia. Annals of Neurology, 27, 357–365.
Albin, R. L., Young, A. B., & Penney, J. B. (1989). The functional anatomy of basal ganglia disorders. Trends in Neurosciences, 12, 366–375.
Albin, R. L., Young, A. B., Penney, J. B., Handelin, B., Balfour, R., Anderson, K. D., et al. (1990). Abnormalities of striatal projection neurons and n-methyl-d-aspartate receptors in presymptomatic Huntington’s disease. New England Journal of Medicine, 322, 1293–1298.
Alexander, G. E., & Crutcher, M. D. (1990). Functional architecture of basal ganglia circuits: Neural substrates of parallel processing. Trends in Neurosciences, 13, 266–271.
Alexander, G. E., & DeLong, M. R. (1985). Microstimulation of the primate neostriatum: Pt. II. Somatotopic organization of striatal microexcitable zones and their relation to neuronal response properties. Journal of Neurophysiology, 53, 1417–1430.
Alexander, G. E., DeLong, M. R., & Strick, P. L. (1986). Parallel organization of functionally segregated circuits linking basal ganglia and cortex. Annual Review of Neuroscience, 9, 357–381.
Alheid, G. F. (2003). Extended amygdala and basal forebrain. Annals of the New York Academy of Sciences, 985, 185–205.
Anaya-Martinez, V., Martinez-Marcos, A., Martinez-Fong, D., Aceves, J., & Erlij, D. (2006). Substantia nigra compacta neurons that innervate the reticular thalamic nucleus in the rat also project to striatum or globus pallidus: Implications for abnormal motor behavior. Neuroscience, 143, 477–486.
Anden, N. E., Fuxe, K., Hamberger, B., & Hokfelt, T. (1966). A quantitative study on the nigro-neostriatal dopamine neuron system in the rat. Acta Physiologica Scandinavica, 67, 306–312.
Andrews, J., Wang, L., Csernansky, J. G., Gado, M. H., & Barch, D. M. (2006). Abnormalities of thalamic activation and cognition in schizophrenia. American Journal of Psychiatry, 163, 463–469.
Apicella, P. (2002). Tonically active neurons in the primate striatum and their role in the processing of information about motivationally relevant events. European Journal of Neuroscience, 16, 2017–2026.
Arbuthnott, G. W., Ingham, C. A., & Wickens, J. R. (1998). Modulation by dopamine of rat corticostriatal input. Advances in Pharmacology, 42, 733–736.
Arikuni, T., & Kubota, K. (1986). The organization of prefrontocaudate projections and their laminar origin in the macaque monkey: A retrograde study using hrp-gel. Journal of Comparative Neurology, 244, 492–510.
Augood, S. J., Faull, R. L., Love, D. R., & Emson, P. C. (1996). Reduction in enkephalin and substance p messenger rna in the striatum of early grade Huntington’s disease: A detailed cellular in situ hybridization study. Neuroscience, 72, 1023–1036.
Badre, D., & Wagner, A. D. (2004). Selection, integration, and conflict monitoring: Assessing the nature and generality of prefrontal cognitive control mechanisms. Neuron, 41, 473–487.
Barbas, H. (2000). Connections underlying the synthesis of cognition, memory, and emotion in primate prefrontal cortices. Brain Research Bulletin, 52, 319–330.
Bazhenov, M., Timofeev, I., Steriade, M., & Sejnowski, T. J. (1998). Cellular and network models for intrathalamic augmenting responses during 10-hz stimulation. Journal of Neurophysiology, 79, 2730–2748.
Beckstead, R. M. (1979). An autoradiographic examination of corticocortical and subcortical projections of the mediodorsal-projection (prefrontal) cortex in the rat. Journal of Comparative Neurology, 184, 43–62.
Beckstead, R. M. (1983). A pallidostriatal projection in the cat and monkey. Brain Research Bulletin, 11, 629–632.
Beckstead, R. M., Domesick, V. B., & Nauta, W. J. (1979). Efferent connections of the substantia nigra and ventral tegmental area in the rat. Brain Research, 175, 191–217.
Beckstead, R. M., & Kersey, K. S. (1985). Immunohistochemical demonstration of differential substance p-, met-enkephalin-, and glutamic-acid-decarboxylase-containing cell body and axon distributions in the corpus striatum of the cat. Journal of Comparative Neurology, 232, 481–498.
Beninger, R. J., & Herz, R. S. (1986). Pimozide blocks establishment but not expression of cocaine-produced environment-specific conditioning. Life Sciences, 38, 1425–1431.
Beninger, R. J., & Miller, R. (1998). Dopamine d1-like receptors and reward-related incentive learning. Neuroscience and Biobehavioral Reviews, 22, 335–345.
Berendse, H. W., Galis-de Graaf, Y., & Groenewegen, H. J. (1992). Topographical organization and relationship with ventral striatal compartments of prefrontal corticostriatal projections in the rat. Journal of Comparative Neurology, 316, 314–347.
Bernheimer, H., Birkmayer, W., Hornykiewicz, O., Jellinger, K., & Seitelberger, F. (1973). Brain dopamine and the syndromes of Parkinson and Huntington: Clinical, morphological and neurochemical correlations. Journal of the Neurological Sciences, 20, 415–455.
Besson, M. J., Graybiel, A. M., & Quinn, B. (1990). Co-expression of neuropeptides in the cat’s striatum: An immunohistochemical study of substance p, dynorphin b and enkephalin. Neuroscience, 39, 33–58.
Bevan, M. D., Smith, A. D., & Bolam, J. P. (1996). The substantia nigra as a site of synaptic integration of functionally diverse information arising from the ventral pallidum and the globus pallidus in the rat. Neuroscience, 75, 5–12.
Birkmayer, W., Danielczyk, W., Neumayer, E., & Riederer, P. (1975). Dopaminergic supersensitivity in Parkinsonism. Advances in Neurology, 9, 121–129.
Boecker, H., Dagher, A., Ceballos-Baumann, A. O., Passingham, R. E., Samuel, M., Friston, K. J., et al. (1998). Role of the human rostral supplementary motor area and the basal ganglia in motor sequence control: Investigations with h2 15o pet. Journal of Neurophysiology, 79, 1070–1080.
Bolam, J. P., Hanley, J. J., Booth, P. A., & Bevan, M. D. (2000). Synaptic organisation of the basal ganglia. Journal of Anatomy, 196(Pt. 4), 527–542.
Botvinick, M., Nystrom, L. E., Fissell, K., Carter, C. S., & Cohen, J. D. (1999, November 11). Conflict monitoring versus selection-for-action in anterior cingulate cortex. Nature, 402, 179–181.
Bowers, W., Hamilton, M., Zacharko, R. M., & Anisman, H. (1985). Differential effects of pimozide on response-rate and choice accuracy in a self-stimulation paradigm in mice. Pharmacology, Biochemistry, and Behavior, 22, 521–526.
Christie, M. J., Summers, R. J., Stephenson, J. A., Cook, C. J., & Beart, P. M. (1987). Excitatory amino acid projections to the nucleus accumbens septi in the rat: A retrograde transport study utilizing d[3h]aspartate and [3h]gaba. Neuroscience, 22, 425–439.
Bozarth, M. A., & Wise, R. A. (1981). Heroin reward is dependent on a dopaminergic substrate. Life Sciences, 29, 1881–1886.
Churchill, L., Dilts, R. P., & Kalivas, P. W. (1990). Changes in gamma-aminobutyric acid, mu-opioid and neurotensin receptors in the accumbens-pallidal projection after discrete quinolinic acid lesions in the nucleus accumbens. Brain Research, 511, 41–54.
Breier, A., Schreiber, J. L., Dyer, J., & Pickar, D. (1991). National institute of mental health longitudinal study of chronic schizophrenia: Prognosis and predictors of outcome. Archives of General Psychiatry, 48, 239–246. Brinkman, J., & Kuypers, H. G. (1973). Cerebral control of contralateral and ipsilateral arm, hand and finger movements in the split-brain rhesus monkey. Brain, 96, 653–674. Brown, R. G., & Marsden, C. D. (1984). How common is dementia in Parkinson’s disease? Lancet, 2, 1262–1265. Bubser, M., & Koch, M. (1994). Prepulse inhibition of the acoustic startle response of rats is reduced by 6-hydroxydopamine lesions of the medial prefrontal cortex. Psychopharmacology, 113(3–4), 487–492. Butters, N., Albert, M. S., & Sax, D. (1979). Investigations of the memory disorders of patients with Huntington’s disease. Advances in Neurology, 23, 203–213. Calabresi, P., Mercuri, N. B., Sancesario, G., & Bernardi, G. (1993). Electrophysiology of dopamine-denervated striatal neurons: Implications for Parkinson’s disease. Brain, 116(Pt. 2), 433–452. Carlsson, A. (1988). The current status of the dopamine hypothesis of schizophrenia. Neuropsychopharmacology, 1, 179–186. Carlsson, A., Hansson, L. O., Waters, N., & Carlsson, M. L. (1999). A glutamatergic deficiency model of schizophrenia. British Journal of Psychiatry: Supplement(37), 2–6.
Churchill, L., & Kalivas, P. W. (1994). A topographically organized gamma-aminobutyric acid projection from the ventral pallidum to the nucleus accumbens in the rat. Journal of Comparative Neurology, 345, 579–595. Cipolloni, P. B., & Pandya, D. N. (1999). Cortical connections of the frontoparietal opercular areas in the rhesus monkey. Journal of Comparative Neurology, 403, 431–458. Conde, F., Maire-Lepoivre, E., Audinat, E., & Crepel, F. (1995). Afferent connections of the medial frontal cortex of the rat: Pt. II. Cortical and subcortical afferents. Journal of Comparative Neurology, 352, 567–593. Corbett, D., & Wise, R. A. (1980). Intracranial self-stimulation in relation to the ascending dopaminergic systems of the midbrain: A moveable electrode mapping study. Brain Research, 185, 1–15. Cullen, T. J., Walker, M. A., Eastwood, S. L., Esiri, M. M., Harrison, P. J., & Crow, T. J. (2006). Anomalies of asymmetry of pyramidal cell density and structure in dorsolateral prefrontal cortex in schizophrenia. British Journal of Psychiatry, 188, 26–31. Damier, P., Hirsch, E. C., Agid, Y., & Graybiel, A. M. (1999). The substantia nigra of the human brain: Pt. II. Patterns of loss of dopamine-containing neurons in Parkinson’s disease. Brain, 122(Pt. 8), 1437–1448.
Carlsson, A., Waters, N., & Carlsson, M. L. (1999). Neurotransmitter interactions in schizophrenia-therapeutic implications. European Archives of Psychiatry and Clinical Neuroscience, 249(Suppl. 4), 37–43.
Dao-Castellana, M. H., Paillere-Martinot, M. L., Hantraye, P., Attar-Levy, D., Remy, P., Crouzel, C., et al. (1997). Presynaptic dopaminergic function in the striatum of schizophrenic patients. Schizophrenia Research, 23, 167–174.
Carmichael, S. T., & Price, J. L. (1995). Sensory and premotor connections of the orbital and medial prefrontal cortex of macaque monkeys. Journal of Comparative Neurology, 363, 642–664.
DeLong, M. R. (1990). Primate models of movement disorders of basal ganglia origin. Trends in Neurosciences, 13, 281–285.
Carmona, A., Catalina-Herrera, C. J., & Jimenez-Castellanos, J. (1991). Nigrocaudate and nigroputaminal projections in the monkey. Acta Anatomica, 141, 145–150. Carpenter, M. G., Allum, J. H., Honegger, F., Adkin, A. L., & Bloem, B. R. (2004). Postural abnormalities to multidirectional stance perturbations in Parkinson’s disease. Journal of Neurology, Neurosurgery, and Psychiatry, 75, 1245–1254. Carr, D. B., & Sesack, S. R. (1999). Terminals from the rat prefrontal cortex synapse on mesoaccumbens vta neurons. Annals of the New York Academy of Sciences, 877, 676–678. Carr, D. B., & Sesack, S. R. (2000a). Gaba-containing neurons in the rat ventral tegmental area project to the prefrontal cortex. Synapse, 38, 114–123. Carr, D. B., & Sesack, S. R. (2000b). Projections from the rat prefrontal cortex to the ventral tegmental area: Target specificity in the synaptic associations with mesoaccumbens and mesocortical neurons. Journal of Neuroscience, 20, 3864–3873. Chevalier, G., & Deniau, J. M. (1990). Disinhibition as a basic process in the expression of striatal functions. Trends in Neurosciences, 13, 277–280.
DeLong, M. R., Alexander, G. E., Mitchell, S. J., & Richardson, R. T. (1986). The contribution of basal ganglia to limb control. Progress in Brain Research, 64, 161–174. De Olmos, J. S., & Heimer, L. (1999). The concepts of the ventral striatopallidal system and extended amygdala. Annals of the New York Academy of Sciences, 877, 1–32. Deschenes, M., Veinante, P., & Zhang, Z. W. (1998). The organization of corticothalamic projections: Reciprocity versus parity. Brain Research: Brain Research Reviews, 28, 286–308. DeVito, J. L., & Anderson, M. E. (1982). An autoradiographic study of efferent connections of the globus pallidus in macaca mulatta. Experimental Brain Research, 46, 107–117. De Wit, H., & Wise, R. A. (1977). Blockade of cocaine reinforcement in rats with the dopamine receptor blocker pimozide, but not with the noradrenergic blockers phentolamine or phenoxybenzamine. Canadian Journal of Psychology, 31, 195–203. Di Chiara, G., Acquas, E., Tanda, G., & Cadoni, C. (1993). Drugs of abuse: Biochemical surrogates of specific aspects of natural reward? Biochemical Society Symposium, 59, 65–81. Doherty, M. D., & Gratton, A. (1992). High-speed chronoamperometric measurements of mesolimbic and nigrostriatal dopamine release associated with repeated daily stress. Brain Research, 586, 295–302.
Chikama, M., McFarland, N. R., Amaral, D. G., & Haber, S. N. (1997). Insular cortical projections to functional regions of the striatum correlate with cortical cytoarchitectonic organization in the primate. Journal of Neuroscience, 17, 9686–9705.
Dum, R. P., & Strick, P. L. (1991). The origin of corticospinal projections from the premotor areas in the frontal lobe. Journal of Neuroscience, 11, 667–689.
Chouinard, P. A., Leonard, G., & Paus, T. (2005). Role of the primary motor and dorsal premotor cortices in the anticipation of forces during object lifting. Journal of Neuroscience, 25, 2277–2284.
Dum, R. P., & Strick, P. L. (1993). Cingulate motor areas. In B. A. Vogt & M. Gabriel (Eds.), Neurobiology of cingulate cortex and limbic thalamus (pp. 415–441). Boston: Birkhäuser.
Dum, R. P., & Strick, P. L. (2002). Motor areas in the frontal lobe of the primate. Physiology and Behavior, 77(4–5), 677–682.
Ettenberg, A. (1989). Dopamine, neuroleptics and reinforced behavior. Neuroscience and Biobehavioral Reviews, 13(2–3), 105–111.
Ettenberg, A., & Horvitz, J. C. (1990). Pimozide prevents the response-reinstating effects of water reinforcement in rats. Pharmacology, Biochemistry, and Behavior, 37, 465–469.
Ettenberg, A., & McFarland, K. (2003). Effects of haloperidol on cue-induced autonomic and behavioral indices of heroin reward and motivation. Psychopharmacology, 168(1–2), 139–145.
Ettenberg, A., Pettit, H. O., Bloom, F. E., & Koob, G. F. (1982). Heroin and cocaine intravenous self-administration in rats: Mediation by separate neural systems. Psychopharmacology, 78, 204–209.
Evenden, J. L., & Robbins, T. W. (1983). Dissociable effects of d-amphetamine, chlordiazepoxide and alpha-flupenthixol on choice and rate measures of reinforcement in the rat. Psychopharmacology, 79(2–3), 180–186.
Everitt, B. J., & Robbins, T. W. (1992). Amygdala-ventral striatal interactions and reward-related processes. In J. P. Aggleton (Ed.), The amygdala: Neurobiological aspects of emotion, memory, and mental dysfunction (pp. 401–429). New York: Wiley-Liss.
Fouriezos, G., & Wise, R. A. (1976). Pimozide-induced extinction of intracranial self-stimulation: Response patterns rule out motor or performance deficits. Brain Research, 103, 377–380. Franklin, K. B. (1978). Catecholamines and self-stimulation: Reward and performances effects dissociated. Pharmacology, Biochemistry, and Behavior, 9, 813–820. Franklin, K. B., & McCoy, S. N. (1979). Pimozide-induced extinction in rats: Stimulus control of responding rules out motor deficit. Pharmacology, Biochemistry, and Behavior, 11, 71–75. Fritsch, G., & Hitzig, E. (1870). Uber die electrische erregbarkeit des grosshirns. Archiv für Anatomie, Physiologie, und Wissenschaftliche Medicin, 37, 300–332. Fudge, J. L., & Haber, S. N. (2000). The central nucleus of the amygdala projection to dopamine subpopulations in primates. Neuroscience, 97, 479–494. Fuster, J. M. (2000a). Executive frontal functions. Experimental Brain Research, 133, 66–70. Fuster, J. M. (2000b). Prefrontal neurons in networks of executive memory. Brain Research Bulletin, 52, 331–336.
Fallon, J. H., & Leslie, F. M. (1986). Distribution of dynorphin and enkephalin peptides in the rat brain. Journal of Comparative Neurology, 249, 293–336.
Fuxe, K., Hokfelt, T., Johansson, O., Lidbrink, P., & Ljungdahl, A. (1974). The origin of the dopamine nerve terminals in limbic and frontal cortex: Evidence for meso-cortico dopamine neurons. Brain Research, 82, 349–355.
Fallon, J. H., & Moore, R. Y. (1978). Catecholamine innervation of the basal forebrain: Pt. IV. Topography of the dopamine projection to the basal forebrain and neostriatum. Journal of Comparative Neurology, 180, 545–580.
Gallistel, C. R., Boytim, M., Gomita, Y., & Klebanoff, L. (1982). Does pimozide block the reinforcing effect of brain stimulation? Pharmacology, Biochemistry, and Behavior, 17, 769–781.
Fearnley, J. M., & Lees, A. J. (1990). Striatonigral degeneration. A clinicopathological study. Brain, 113(Pt. 6), 1823–1842.
Geary, N., & Smith, G. P. (1985). Pimozide decreases the positive reinforcing effect of sham fed sucrose in the rat. Pharmacology, Biochemistry, and Behavior, 22, 787–790.
Fearnley, J. M., & Lees, A. J. (1991). Ageing and Parkinson’s disease: Substantia nigra regional selectivity. Brain, 114(Pt. 5), 2283–2301. Ferrante, R. J., Kowall, N. W., Richardson, E. P., Jr., Bird, E. D., & Martin, J. B. (1986). Topography of enkephalin, substance p and acetylcholinesterase staining in Huntington’s disease striatum. Neuroscience Letters, 71, 283–288. Ferry, A. T., Ongur, D., An, X., & Price, J. L. (2000). Prefrontal cortical projections to the striatum in macaque monkeys: Evidence for an organization related to prefrontal networks. Journal of Comparative Neurology, 425, 447–470. Fibiger, H. C., & Phillips, A. G. (1986). Reward, motivation, cognition: Psychobiology of mesotelencephalic dopamine systems. Handbook of Physiology, 4, 647–675. Fibiger, H. C., & Phillips, A. G. (1988). Mesocorticolimbic dopamine systems and reward. Annals of the New York Academy of Sciences, 537, 206–215. Flaherty, A. W., & Graybiel, A. M. (1994). Input-output organization of the sensorimotor striatum in the squirrel monkey. Journal of Neuroscience, 14, 599–610. Flaherty, A. W., & Graybiel, A. M. (1995). Motor and somatosensory corticostriatal projection magnifications in the squirrel monkey. Journal of Neurophysiology, 74, 2638–2648. Floresco, S. B., Magyar, O., Ghods-Sharifi, S., Vexelman, C., & Tse, M. T. (2006). Multiple dopamine receptor subtypes in the medial prefrontal cortex of the rat regulate set-shifting. Neuropsychopharmacology, 31, 297–309. Folstein, S. E., Folstein, M. F., & McHugh, P. R. (1979). Psychiatric syndromes in Huntington’s disease. Advances in Neurology, 23, 281–289. Fonnum, F., Storm-Mathisen, J., & Divac, I. (1981). Biochemical evidence for glutamate as neurotransmitter in corticostriatal and corticothalamic fibres in rat brain. Neuroscience, 6, 863–873. Fouriezos, G., Hansson, P., & Wise, R. A. (1978). Neuroleptic-induced attenuation of brain stimulation reward in rats. Journal of Comparative and Physiological Psychology, 92, 661–671.
Gelb, D. J., Oliver, E., & Gilman, S. (1999). Diagnostic criteria for Parkinson disease. Archives of Neurology, 56, 33–39. Gentilucci, M., Fogassi, L., Luppino, G., Matelli, M., Camarda, R., & Rizzolatti, G. (1989). Somatotopic representation in inferior area 6 of the macaque monkey. Brain, Behavior and Evolution, 33(2/3), 118–121. Gerber, G. J., Sing, J., & Wise, R. A. (1981). Pimozide attenuates lever pressing for water reinforcement in rats. Pharmacology, Biochemistry, and Behavior, 14, 201–205. Gerfen, C. R. (1992). The neostriatal mosaic: Multiple levels of compartmental organization. Journal of Neural Transmission (Suppl. 36), 43–59. Gerfen, C. R. (2000). Molecular effects of dopamine on striatal-projection pathways. Trends in Neurosciences, 23(Suppl. 10), S64–S70. Gerfen, C. R., Engber, T. M., Mahan, L. C., Susel, Z., Chase, T. N., Monsma, F. J., Jr., et al. (1990, December 7). D1 and d2 dopamine receptor-regulated gene expression of striatonigral and striatopallidal neurons. Science, 250, 1429–1432. Gibb, W. R., & Lees, A. J. (1989). The significance of the lewy body in the diagnosis of idiopathic Parkinson’s disease. Neuropathology and Applied Neurobiology, 15, 27–44. Godschalk, M., Mitz, A. R., van Duin, B., & van der Burg, H. (1995). Somatotopy of monkey premotor cortex examined with microstimulation. Neuroscience Research, 23, 269–279. Goldman, P. S., & Nauta, W. J. (1977). An intricately patterned prefrontocaudate projection in the rhesus monkey. Journal of Comparative Neurology, 72, 369–386. Goldman-Rakic, P. S. (1996). The prefrontal landscape: Implications of functional architecture for understanding human mentation and the central executive. Philosophical Transactions of the Royal Society of London, B: Biological Sciences, 351, 1445–1453. Gray, J. A., & Roth, B. L. (2007). Molecular targets for treating cognitive dysfunction in schizophrenia. Schizophrenia Bulletin, 33, 1100–1119. Graybiel, A. M. (1991). 
Basal ganglia: Input, neural activity, and relation to the cortex. Current Opinion in Neurobiology, 1, 644–651.
Graybiel, A. M., Hirsch, E. C., & Agid, Y. (1990). The nigrostriatal system in Parkinson’s disease. Advances in Neurology, 53, 17–29.
Grillner, S. (1990). Neurobiology of vertebrate motor behavior: From flexion reflexes to manipulative movements. In G. M. Edelman, W. E. Gall, & W. M. Cowan (Eds.), Signal and sense (pp. 187–208). New York: Wiley-Liss.
Groenewegen, H. J. (1988). Organization of the afferent connections of the mediodorsal thalamic nucleus in the rat, related to the mediodorsal-prefrontal topography. Neuroscience, 24, 379–431.
Groenewegen, H. J., Berendse, H. W., Wolters, J. G., & Lohman, A. H. (1990). The anatomical relationship of the prefrontal cortex with the striatopallidal system, the thalamus and the amygdala: Evidence for a parallel organization. Progress in Brain Research, 85, 95–116; discussion 116–118.
Groenewegen, H. J., Vermeulen-Vander Zee, E., te Kortschot, A., & Witter, M. P. (1987). Organization of the projections from the subiculum to the ventral striatum in the rat: A study using anterograde transport of phaseolus vulgaris leucoagglutinin. Neuroscience, 23, 103–120.
Haber, S. N. (2003). The primate basal ganglia: Parallel and integrative networks. Journal of Chemical Neuroanatomy, 26, 317–330.
Haber, S. N., & Fudge, J. L. (1997). The primate substantia nigra and vta: Integrative circuitry and function. Critical Reviews in Neurobiology, 11, 323–342.
Haber, S. N., Fudge, J. L., & McFarland, N. R. (2000). Striatonigrostriatal pathways in primates form an ascending spiral from the shell to the dorsolateral striatum. Journal of Neuroscience, 20, 2369–2382.
Haber, S. N., Groenewegen, H. J., Grove, E. A., & Nauta, W. J. (1985). Efferent connections of the ventral pallidum: Evidence of a dual striato pallidofugal pathway. Journal of Comparative Neurology, 235, 322–335.
Haber, S. N., Kunishio, K., Mizobuchi, M., & Lynd-Balta, E. (1995). The orbital and medial prefrontal circuit through the primate basal ganglia. Journal of Neuroscience, 15(7, Pt. 1), 4851–4867.
Heffner, R. S., & Masterton, R. B. (1983). The role of the corticospinal tract in the evolution of human digital dexterity. Brain, Behavior and Evolution, 23(3–4), 165–183.
Heimer, L., Alheid, G. F., de Olmos, J. S., Groenewegen, H. J., Haber, S. N., Harlan, R. E., et al. (1997). The accumbens: Beyond the core-shell dichotomy. Journal of Neuropsychiatry and Clinical Neurosciences, 9, 354–381.
Heimer, L., Alheid, G. F., & Zahm, D. S. (1993). Basal forebrain organization: An anatomical framework for motor aspects of drive and motivation. In P. W. Kalivas & C. D. Barnes (Eds.), Limbic motor circuits and neuropsychiatry (pp. 1–32). Boca Raton, FL: CRC Press.
Heimer, L., Switzer, R. D., & Van Hoesen, G. W. (1982). Ventral striatum and ventral pallidum: Components of the motor system? TINS, 5, 83–87.
Heimer, L., & Van Hoesen, G. W. (2006). The limbic lobe and its output channels: Implications for emotional functions and adaptive behavior. Neuroscience and Biobehavioral Reviews, 30, 126–147.
Heimer, L., Zahm, D. S., Churchill, L., Kalivas, P. W., & Wohltmann, C. (1991). Specificity in the projection patterns of accumbal core and shell in the rat. Neuroscience, 41, 89–125.
Hernandez-Lopez, S., Bargas, J., Surmeier, D. J., Reyes, A., & Galarraga, E. (1997). D1 receptor activation enhances evoked discharge in neostriatal medium spiny neurons by modulating an l-type ca2+ conductance. Journal of Neuroscience, 17, 3334–3342.
Hersch, S. M., Ciliax, B. J., Gutekunst, C. A., Rees, H. D., Heilman, C. J., Yung, K. K., et al. (1995). Electron microscopic analysis of d1 and d2 dopamine receptor proteins in the dorsal striatum and their synaptic relationships with motor corticostriatal afferents. Journal of Neuroscience, 15(7, Pt. 2), 5222–5237.
Herzog, A. G., & Van Hoesen, G. W. (1976). Temporal neocortical afferent connections to the amygdala in the rhesus monkey. Brain Research, 115, 57–69.
Hietala, J., & Syvalahti, E. (1996). Dopamine in schizophrenia. Annals of Medicine, 28, 557–561.
Haber, S. N., Lynd, E., Klein, C., & Groenewegen, H. J. (1990). Topographic organization of the ventral striatal efferent projections in the rhesus monkey: An anterograde tracing study. Journal of Comparative Neurology, 293, 282–298.
Hietala, J., Syvalahti, E., Vuorio, K., Nagren, K., Lehikoinen, P., Ruotsalainen, U., et al. (1994). Striatal d2 dopamine receptor characteristics in neuroleptic-naive schizophrenic patients studied with positron emission tomography. Archives of General Psychiatry, 51, 116–123.
Haber, S. N., Lynd-Balta, E., & Mitchell, S. J. (1993). The organization of the descending ventral pallidal projections in the monkey. Journal of Comparative Neurology, 329, 111–128.
Hikosaka, O. (2007). Basal ganglia mechanisms of reward-oriented eye movement. Annals of the New York Academy of Sciences, 1104, 229–249.
Hallett, M. (1993). Physiology of basal ganglia disorders: An overview. Canadian Journal of Neurological Sciences, 20, 177–183.
Hirsch, E., Graybiel, A. M., & Agid, Y. A. (1988, July 28). Melanized dopaminergic neurons are differentially susceptible to degeneration in Parkinson’s disease. Nature, 334, 345–348.
Halsband, U., & Passingham, R. E. (1982). The role of premotor and parietal cortex in the direction of action. Brain Research, 240, 368–372.
Holland, P. C., & Gallagher, M. (2004). Amygdala-frontal interactions and reward expectancy. Current Opinion in Neurobiology, 14, 148–155.
Halsband, U., & Passingham, R. E. (1985). Premotor cortex and the conditions for movement in monkeys (macaca fascicularis). Behavioral Brain Research, 18, 269–277.
Hollerman, J. R., Tremblay, L., & Schultz, W. (2000). Involvement of basal ganglia and orbitofrontal cortex in goal-directed behavior. Progress in Brain Research, 126, 193–215.
Hazrati, L. N., & Parent, A. (1992). The striatopallidal projection displays a high degree of anatomical specificity in the primate. Brain Research, 592(1–2), 213–227.
Hoover, J. E., & Strick, P. L. (1993, February 5). Multiple output channels in the basal ganglia. Science, 259, 819–821.
He, S. Q., Dum, R. P., & Strick, P. L. (1993). Topographic organization of corticospinal projections from the frontal lobe: Motor areas on the lateral surface of the hemisphere. Journal of Neuroscience, 13, 952–980. He, S. Q., Dum, R. P., & Strick, P. L. (1995). Topographic organization of corticospinal projections from the frontal lobe: Motor areas on the medial surface of the hemisphere. Journal of Neuroscience, 15(5, Pt. 1), 3284–3306. Hedreen, J. C., & DeLong, M. R. (1991). Organization of striatopallidal, striatonigral, and nigrostriatal projections in the macaque. Journal of Comparative Neurology, 304, 569–595. Hedreen, J. C., & Folstein, S. E. (1995). Early loss of neostriatal striosome neurons in Huntington’s disease. Journal of Neuropathology and Experimental Neurology, 54, 105–120.
Hornykiewicz, O., & Kish, S. J. (1987). Biochemical pathophysiology of Parkinson’s disease. Advances in Neurology, 45, 19–34. Horvitz, J. C., & Ettenberg, A. (1991). Conditioned incentive properties of a food-paired conditioned stimulus remain intact during dopamine receptor blockade. Behavioral Neuroscience, 105, 536–541. Jellinger, K. A. (1991). Pathology of Parkinson’s disease: Changes other than the nigrostriatal pathway. Molecular and Chemical Neuropathology, 14, 153–197. Jellinger, K. A. (1998). Neuropathology of movement disorders. Neurosurgery Clinics of North America, 9, 237–262. Jentsch, J. D., & Roth, R. H. (1999). The neuropsychopharmacology of phencyclidine: From NMDA receptor hypofunction to the dopamine hypothesis of schizophrenia. Neuropsychopharmacology, 20, 201–225.
Joel, D. (2001). Open interconnected model of basal ganglia-thalamocortical circuitry and its relevance to the clinical syndrome of Huntington’s disease. Movement Disorders, 16, 407–423.
Kiyatkin, E. A. (1995). Functional significance of mesolimbic dopamine. Neuroscience and Biobehavioral Reviews, 19, 573–598.
Joel, D., & Weiner, I. (1994). The organization of the basal ganglia-thalamocortical circuits: Open interconnected rather than closed segregated. Neuroscience, 63, 363–379.
Kiyatkin, E. A., & Rebec, G. V. (1999). Striatal neuronal activity and responsiveness to dopamine and glutamate after selective blockade of D1 and D2 dopamine receptors in freely moving rats. Journal of Neuroscience, 19, 3594–3609.
Joel, D., & Weiner, I. (1997). The connections of the primate subthalamic nucleus: Indirect pathways and the open-interconnected scheme of basal ganglia-thalamocortical circuitry. Brain Research: Brain Research Reviews, 23(1–2), 62–78.
Klitenick, M. A., Deutch, A. Y., Churchill, L., & Kalivas, P. W. (1992). Topography and functional role of dopaminergic projections from the ventral mesencephalic tegmentum to the ventral pallidum. Neuroscience, 50, 371–386.
Jones, B., & Mishkin, M. (1972). Limbic lesions and the problem of stimulus: Reinforcement associations. Experimental Neurology, 36, 362–377. Jones, E. G. (1985). The thalamus. New York: Plenum Press.
Kluver, H., & Bucy, P. C. (1939). Preliminary analysis of the functions of temporal lobe in monkeys. Archives of Neurology and Psychiatry, 42, 979–1000.
Joseph, M. H., Frith, C. D., & Waddington, J. L. (1979). Dopaminergic mechanisms and cognitive deficit in schizophrenia. A neurobiological model. Psychopharmacology, 63, 273–280.
Kodsi, M. H., & Swerdlow, N. R. (1994). Quinolinic acid lesions of the ventral striatum reduce sensorimotor gating of acoustic startle in rats. Brain Research, 643(1–2), 59–65.
Juckel, G., Schlagenhauf, F., Koslowski, M., Filonov, D., Wustenberg, T., Villringer, A., et al. (2006). Dysfunction of ventral striatal reward prediction in schizophrenic patients treated with typical, not atypical, neuroleptics. Psychopharmacology, 187, 222–228.
Kornetsky, C., & Esposito, R. U. (1979). Euphorigenic drugs: Effects on the reward pathways of the brain. Federation Proceedings, 38, 2473–2476.
Kalivas, P. W., Churchill, L., & Klitenick, M. A. (1993). GABA and enkephalin projection from the nucleus accumbens and ventral pallidum to the ventral tegmental area. Neuroscience, 57, 1047–1060. Kalivas, P. W., Churchill, L., & Romanides, A. (1999). Involvement of the pallidal-thalamocortical circuit in adaptive behavior. Annals of the New York Academy of Sciences, 877, 64–70. Kawaguchi, Y., Wilson, C. J., & Emson, P. C. (1990). Projection subtypes of rat neostriatal matrix cells revealed by intracellular injection of biocytin. Journal of Neuroscience, 10, 3421–3438. Keefe, R. S., Mohs, R. C., Losonczy, M. F., Davidson, M., Silverman, J. M., Kendler, K. S., et al. (1987). Characteristics of very poor outcome schizophrenia. American Journal of Psychiatry, 144, 889–895. Kelley, A. E. (2004). Ventral striatal control of appetitive motivation: Role in ingestive behavior and reward-related learning. Neuroscience and Biobehavioral Reviews, 27, 765–776. Kelley, A. E., Baldo, B. A., Pratt, W. E., & Will, M. J. (2005). Corticostriatal-hypothalamic circuitry and food motivation: Integration of energy, action and reward. Physiology and Behavior, 86, 773–795. Kelley, A. E., & Domesick, V. B. (1982). The distribution of the projection from the hippocampal formation to the nucleus accumbens in the rat: An anterograde- and retrograde-horseradish peroxidase study. Neuroscience, 7, 2321–2335. Kelley, A. E., Domesick, V. B., & Nauta, W. J. (1982). The amygdalostriatal projection in the rat: An anatomical study by anterograde and retrograde tracing methods. Neuroscience, 7, 615–630. Kemether, E. M., Buchsbaum, M. S., Byne, W., Hazlett, E. A., Haznedar, M., Brickman, A. M., et al. (2003). Magnetic resonance imaging of mediodorsal, pulvinar, and centromedian nuclei of the thalamus in patients with schizophrenia. Archives of General Psychiatry, 60, 983–991.
Kostrzewa, R. M., Kostrzewa, J. P., Nowak, P., Kostrzewa, R. A., & Brus, R. (2004). Dopamine D2 agonist priming in intact and dopamine-lesioned rats. Neurotoxicity Research, 6, 457–462. Kowall, N. W., Ferrante, R. J., & Martin, A. B. (1987). Patterns of cell loss in Huntington’s disease. Trends in Neurosciences, 10, 24–29. Kumakura, Y., Cumming, P., Vernaleken, I., Buchholz, H. G., Siessmeier, T., Heinz, A., et al. (2007). Elevated [18F]fluorodopamine turnover in brain of patients with schizophrenia: An [18F]fluorodopa/positron emission tomography study. Journal of Neuroscience, 27, 8080–8087. Kunzle, H. (1975). Bilateral projections from precentral motor cortex to the putamen and other parts of the basal ganglia: An autoradiographic study in Macaca fascicularis. Brain Research, 88, 195–209. Kuo, J. S., & Carpenter, M. B. (1973). Organization of pallidothalamic projections in the rhesus monkey. Journal of Comparative Neurology, 151, 201–236. Kurata, K., & Hoffman, D. S. (1994). Differential effects of muscimol microinjection into dorsal and ventral aspects of the premotor cortex of monkeys. Journal of Neurophysiology, 71, 1151–1164. Kuroda, M., Murakami, K., Kishi, K., & Price, J. L. (1995). Thalamocortical synapses between axons from the mediodorsal thalamic nucleus and pyramidal cells in the prelimbic cortex of the rat. Journal of Comparative Neurology, 356, 143–151. Kuroda, M., Murakami, K., Shinkai, M., Ojima, H., & Kishi, K. (1995). Electron microscopic evidence that axon terminals from the mediodorsal thalamic nucleus make direct synaptic contacts with callosal cells in the prelimbic cortex of the rat. Brain Research, 677, 348–353. Laurens, K. R., Kiehl, K. A., & Liddle, P. F. (2005). A supramodal limbic-paralimbic-neocortical network supports goal-directed stimulus processing. Human Brain Mapping, 24, 35–49.
Kim, R., Nakano, K., Jayaraman, A., & Carpenter, M. B. (1976). Projections of the globus pallidus and adjacent structures: An autoradiographic study in the monkey. Journal of Comparative Neurology, 169, 263–290.
Lavoie, B., Smith, Y., & Parent, A. (1989). Dopaminergic innervation of the basal ganglia in the squirrel monkey as revealed by tyrosine hydroxylase immunohistochemistry. Journal of Comparative Neurology, 289, 36–52.
Kipps, C. M., Duggins, A. J., Mahant, N., Gomes, L., Ashburner, J., & McCusker, E. A. (2005). Progression of structural neuropathology in preclinical Huntington’s disease: A tensor based morphometry study. Journal of Neurology, Neurosurgery, and Psychiatry, 76, 650–655.
Lawrie, S. M., McIntosh, A. M., Hall, J., Owens, D. G., & Johnstone, E. C. (2008). Brain structure and function changes during the development of schizophrenia: The evidence from studies of subjects at increased genetic risk. Schizophrenia Bulletin, 34, 330–340.
Kirkwood, S. C., Siemers, E., Hodes, M. E., Conneally, P. M., Christian, J. C., & Foroud, T. (2000). Subtle changes among presymptomatic carriers of the Huntington’s disease gene. Journal of Neurology, Neurosurgery, and Psychiatry, 69, 773–779.
Leblois, A., Boraud, T., Meissner, W., Bergman, H., & Hansel, D. (2006). Competition between feedback loops underlies normal and pathological dynamics in the basal ganglia. Journal of Neuroscience, 26, 3567–3583.
Kita, H., Tokuno, H., & Nambu, A. (1999). Monkey globus pallidus external segment neurons projecting to the neostriatum. NeuroReport, 10, 1467–1472.
Lehericy, S., Bardinet, E., Tremblay, L., Van de Moortele, P. F., Pochon, J. B., Dormont, D., et al. (2006). Motor control in basal ganglia circuits using fMRI and brain atlas approaches. Cerebral Cortex, 16, 149–161.
Lemiere, J., Decruyenaere, M., Evers-Kiebooms, G., Vandenbussche, E., & Dom, R. (2004). Cognitive changes in patients with Huntington’s disease (HD) and asymptomatic carriers of the HD mutation: A longitudinal follow-up study. Journal of Neurology, 251, 935–942. Le Moal, M., & Simon, H. (1991). Mesocorticolimbic dopaminergic network: Functional and regulatory roles. Physiological Reviews, 71, 155–234. Le Moine, C., & Bloch, B. (1995). D1 and D2 dopamine receptor gene expression in the rat striatum: Sensitive cRNA probes demonstrate prominent segregation of D1 and D2 mRNAs in distinct neuronal populations of the dorsal and ventral striatum. Journal of Comparative Neurology, 355, 418–426. Lemon, R. N., Mantel, G. W. H., & Muir, R. B. (1986). Corticospinal facilitation of hand muscles during voluntary movement in the conscious monkey. Journal of Physiology, 381, 497–527. Lewis, D. A., Hashimoto, T., & Volk, D. W. (2005). Cortical inhibitory neurons and schizophrenia. Nature Reviews: Neuroscience, 6, 312–324. Lewis, D. A., & Moghaddam, B. (2006). Cognitive dysfunction in schizophrenia: Convergence of gamma-aminobutyric acid and glutamate alterations. Archives of Neurology, 63, 1372–1376. Lewis, D. A., Pierri, J. N., Volk, D. W., Melchitzky, D. S., & Woo, T. U. (1999). Altered GABA neurotransmission and prefrontal cortical dysfunction in schizophrenia. Biological Psychiatry, 46, 616–626. Leyton, A. S. F., & Sherrington, C. S. (1917). Observations on the excitable cortex of the chimpanzee, orang-utan and gorilla. Quarterly Journal of Experimental Physiology, 11, 135–222.
McFarland, K., & Ettenberg, A. (1995). Haloperidol differentially affects reinforcement and motivational processes in rats running an alley for intravenous heroin. Psychopharmacology, 122, 346–350. McFarland, K., & Ettenberg, A. (1997). Reinstatement of drug-seeking behavior produced by heroin-predictive environmental stimuli. Psychopharmacology, 131, 86–92. McFarland, K., & Ettenberg, A. (1998). Haloperidol does not affect motivational processes in an operant runway model of food-seeking behavior. Behavioral Neuroscience, 112, 630–635. McFarland, K., & Ettenberg, A. (1999). Haloperidol does not attenuate conditioned place preferences or locomotor activation produced by food- or heroin-predictive discriminative cues. Pharmacology, Biochemistry, and Behavior, 62, 631–641. McFarland, K., & Kalivas, P. W. (2003). Motivational systems. In M. Gallagher & R. J. Nelson (Eds.), Handbook of psychology (Vol. 3, pp. 379–404). Hoboken, NJ: Wiley. McFarland, N. R., & Haber, S. N. (2000). Convergent inputs from thalamic motor nuclei and frontal cortical areas to the dorsal striatum in the primate. Journal of Neuroscience, 20, 3798–3813. McFarland, N. R., & Haber, S. N. (2002). Thalamic relay nuclei of the basal ganglia form both reciprocal and nonreciprocal cortical connections, linking multiple frontal cortical areas. Journal of Neuroscience, 22, 8117–8132. Meltzer, H. Y., & Stahl, S. M. (1976). The dopamine hypothesis of schizophrenia: A review. Schizophrenia Bulletin, 2, 19–76.
Li, X. J. (1999). The early cellular pathology of Huntington’s disease. Molecular Neurobiology, 20(2–3), 111–124.
Meredith, G. E., Baldo, B. A., Andrzejewski, M. E., & Kelley, A. E. (2008). The structural basis for mapping behavior onto the ventral striatum and its subdivisions. Brain Structure and Function, 213, 17–27.
Liles, S. L., & Updyke, B. V. (1985). Projection of the digit and wrist area of precentral gyrus to the putamen: Relation between topography and physiological properties of neurons in the putamen. Brain Research, 339, 245–255.
Middleton, F. A., & Strick, P. L. (2001). A revised neuroanatomy of frontal-subcortical circuits. In D. G. Lichter & J. Cummings (Eds.), Frontal-subcortical circuits in psychiatric and neurological disorders (pp. 44–58). New York: Guilford Press.
Louilot, A., Le Moal, M., & Simon, H. (1986). Differential reactivity of dopaminergic neurons in the nucleus accumbens in response to different behavioral situations: An in vivo voltammetric study in free moving rats. Brain Research, 397, 395–400.
Mitchell, D. G., Rhodes, R. A., Pine, D. S., & Blair, R. J. (2008). The contribution of ventrolateral and dorsolateral prefrontal cortex to response reversal. Behavioural Brain Research, 187, 80–87.
Lovibond, P. F. (1980). Effects of long- and variable-duration signals for food on activity, instrumental responding, and eating. Learning and Motivation, 11, 164.
Mitelman, S. A., Byne, W., Kemether, E. M., Hazlett, E. A., & Buchsbaum, M. S. (2005). Metabolic disconnection between the mediodorsal nucleus of the thalamus and cortical Brodmann’s areas of the left hemisphere in schizophrenia. American Journal of Psychiatry, 162, 1733–1735.
Lynd-Balta, E., & Haber, S. N. (1994a). The organization of midbrain projections to the striatum in the primate: Sensorimotor-related striatum versus ventral striatum. Neuroscience, 59, 625–640.
Mitz, A. R., & Wise, S. P. (1987). The somatotopic organization of the supplementary motor area: Intracortical microstimulation mapping. Journal of Neuroscience, 7, 1010–1021.
Lynd-Balta, E., & Haber, S. N. (1994b). Primate striatonigral projections: A comparison of the sensorimotor-related striatum and the ventral striatum. Journal of Comparative Neurology, 345, 562–578.
Mogenson, G. J., Ciriello, J., Garland, J., & Wu, M. (1987). Ventral pallidum projections to mediodorsal nucleus of the thalamus: An anatomical and electrophysiological investigation in the rat. Brain Research, 404(1–2), 221–230.
Lyness, W. H., Friedle, N. M., & Moore, K. E. (1979). Destruction of dopaminergic nerve terminals in nucleus accumbens: Effect on d-amphetamine self-administration. Pharmacology, Biochemistry, and Behavior, 11, 553–556. Mallet, N., Ballion, B., Le Moine, C., & Gonon, F. (2006). Cortical inputs and GABA interneurons imbalance projection neurons in the striatum of Parkinsonian rats. Journal of Neuroscience, 26, 3875–3884. Marsden, C. D. (1994). Parkinson’s disease. Journal of Neurology, Neurosurgery, and Psychiatry, 57, 672–681. Martin, J. B. (1984). Huntington’s disease: New approaches to an old problem. The Robert Wartenberg lecture. Neurology, 34, 1059–1072.
Mogenson, G. J., Jones, D. L., & Yim, C. Y. (1980). From motivation to action: Functional interface between the limbic system and the motor system. Progress in Neurobiology, 14(2–3), 69–97. Morecraft, R. J., Cipolloni, P. B., Stilwell-Morecraft, K. S., Gedney, M. T., & Pandya, D. N. (2004). Cytoarchitecture and cortical connections of the posterior cingulate and adjacent somatosensory fields in the rhesus monkey. Journal of Comparative Neurology, 469, 37–69. Nambu, A. (2005). A new approach to understand the pathophysiology of Parkinson’s disease. Journal of Neurology, 252 (Suppl. 4), IV1–IV4.
Martin, J. B., & Gusella, J. F. (1986). Huntington’s disease: Pathogenesis and management. New England Journal of Medicine, 315, 1267–1276.
Nambu, A., Tokuno, H., & Takada, M. (2002). Functional significance of the cortico-subthalamo-pallidal ‘hyperdirect’ pathway. Neuroscience Research, 43, 111–117.
Mason, S. T., Beninger, R. J., Fibiger, H. C., & Phillips, A. G. (1980). Pimozide-induced suppression of responding: Evidence against a block of food reward. Pharmacology, Biochemistry, and Behavior, 12, 917–923.
Napier, T. C., Mitrovic, I., Churchill, L., Klitenick, M. A., Lu, X. Y., & Kalivas, P. W. (1995). Substance P in the ventral pallidum: Projection from the ventral striatum, and electrophysiological and behavioral consequences of pallidal substance P. Neuroscience, 69, 59–70.
Nauta, W. J. (1961). Fiber degeneration following lesions of the amygdaloid complex in the monkey. Journal of Anatomy, 95, 515–531. Novais-Santos, S., Gee, J., Shah, M., Troiani, V., Work, M., & Grossman, M. (2007). Resolving sentence ambiguity with planning and working memory resources: Evidence from fMRI. Neuroimage, 37, 361–378. Nudo, R. J., & Masterton, R. B. (1990). Descending pathways to the spinal cord: Pt. IV. Some factors related to the amount of cortex devoted to the corticospinal tract. Journal of Comparative Neurology, 296, 584–597. O’Connor, W. T. (1998). Functional neuroanatomy of the basal ganglia as studied by dual-probe microdialysis. Nuclear Medicine and Biology, 25, 743–746. O’Donnell, P., & Grace, A. A. (1995). Synaptic interactions among excitatory afferents to nucleus accumbens neurons: Hippocampal gating of prefrontal cortical input. Journal of Neuroscience, 15(5, Pt. 1), 3622–3639. O’Donnell, P., & Grace, A. A. (1996). Dopaminergic reduction of excitability in nucleus accumbens neurons recorded in vitro. Neuropsychopharmacology, 15, 87–97. O’Donnell, P., Greene, J., Pabello, N., Lewis, B. L., & Grace, A. A. (1999). Modulation of cell firing in the nucleus accumbens. Annals of the New York Academy of Sciences, 877, 157–175. Olds, J., & Milner, P. (1954). Positive reinforcement produced by electrical stimulation of septal area and other regions of rat brain. Journal of Comparative and Physiological Psychology, 47, 419–427. Olive, M. F., Bertolucci, M., Evans, C. J., & Maidment, N. T. (1995). Microdialysis reveals a morphine-induced increase in pallidal opioid peptide release. NeuroReport, 6, 1093–1096. Owen, A. M. (2004). Cognitive dysfunction in Parkinson’s disease: The role of frontostriatal circuitry. Neuroscientist, 10, 525–537. Parent, A., Bouchard, C., & Smith, Y. (1984). The striatopallidal and striatonigral projections: Two distinct fiber systems in primate. Brain Research, 303, 385–390.
Petrides, M., & Pandya, D. N. (1999). Dorsolateral prefrontal cortex: Comparative cytoarchitectonic analysis in the human and the macaque brain and corticocortical connection patterns. European Journal of Neuroscience, 11, 1011–1036. Picard, N., & Strick, P. L. (1996). Motor areas of the medial wall: A review of their location and functional activation. Cerebral Cortex, 6, 342–353. Picard, N., & Strick, P. L. (2001). Imaging the premotor areas. Current Opinion in Neurobiology, 11, 663–672. Rajakumar, N., Elisevich, K., & Flumerfelt, B. A. (1994). The pallidostriatal projection in the rat: A recurrent inhibitory loop? Brain Research, 651(1/2), 332–336. Ray, J. P., Russchen, F. T., Fuller, T. A., & Price, J. L. (1992). Sources of presumptive glutamatergic/aspartatergic afferents to the mediodorsal nucleus of the thalamus in the rat. Journal of Comparative Neurology, 320, 435–456. Redgrave, P., & Dean, P. (1981). Intracranial self-stimulation. British Medical Bulletin, 37, 141–146. Reijmers, L. G., Vanderheyden, P. M., & Peeters, B. W. (1995). Changes in prepulse inhibition after local administration of NMDA receptor ligands in the core region of the rat nucleus accumbens. European Journal of Pharmacology, 272(2–3), 131–138. Reiner, A., Albin, R. L., Anderson, K. D., D’Amato, C. J., Penney, J. B., & Young, A. B. (1988). Differential loss of striatal projection neurons in Huntington disease. Proceedings of the National Academy of Sciences, USA, 85, 5733–5737. Revheim, N., Schechter, I., Kim, D., Silipo, G., Allingham, B., Butler, P., et al. (2006). Neurocognitive and symptom correlates of daily problem-solving skills in schizophrenia. Schizophrenia Research, 83, 237–245. Richfield, E. K., & Herkenham, M. (1994). Selective vulnerability in Huntington’s disease: Preferential loss of cannabinoid receptors in lateral globus pallidus. Annals of Neurology, 36, 577–584.
Parent, A., & Hazrati, L. N. (1994). Multiple striatal representation in primate substantia nigra. Journal of Comparative Neurology, 344, 305–320.
Robbins, T. W. (2003). Dopamine and cognition. Current Opinion in Neurology, 16(Suppl. 2), S1–S2.
Parent, A., & Hazrati, L. N. (1995). Functional anatomy of the basal ganglia: Pt. I. The cortico-basal ganglia-thalamo-cortical loop. Brain Research: Brain Research Reviews, 20, 91–127.
Robbins, T. W., Cador, M., Taylor, J. R., & Everitt, B. J. (1989). Limbic-striatal interactions in reward-related processes. Neuroscience and Biobehavioral Reviews, 13(2–3), 155–162.
Passingham, R. E., Perry, H., & Wilkinson, F. (1978). Failure to develop a precision grip in monkeys with unilateral neocortical lesions made in infancy. Brain Research, 145, 410–414.
Roberts, D. C., Corcoran, M. E., & Fibiger, H. C. (1977). On the role of ascending catecholaminergic systems in intravenous self-administration of cocaine. Pharmacology, Biochemistry, and Behavior, 6, 615–620.
Passingham, R. E., Perry, V. H., & Wilkinson, F. (1983). The long-term effects of removal of sensorimotor cortex in infant and adult rhesus monkeys. Brain, 106(Pt. 3), 675–705. Passingham, R. E., Stephan, K. E., & Kotter, R. (2002). The anatomical basis of functional localization in the cortex. Nature Reviews: Neuroscience, 3, 606–616. Pearson, S. J., Heathfield, K. W., & Reynolds, G. P. (1990). Pallidal GABA and chorea in Huntington’s disease. Journal of Neural Transmission: General Section, 81, 241–246. Penfield, W., & Boldrey, E. (1937). Somatic motor and sensory representation in the cerebral cortex of man as studied by electrical brain stimulation. Brain, 60, 389–443. Penfield, W., & Rasmussen, T. (1952). The cerebral cortex of man. New York: Macmillan. Penney, J. B., Jr., & Young, A. B. (1986). Striatal inhomogeneities and basal ganglia function. Movement Disorders, 1, 3–15. Percheron, G., & Filion, M. (1991). Parallel processing in the basal ganglia: Up to a point. Trends in Neurosciences, 14, 55–59. Percheron, G., Francois, C., Yelnik, J., & Fenelon, G. (1989). The primate nigro-striato-pallido-nigral system: Not a mere loop. In A. R. Crossman & M. A. Sambrook (Eds.), Neural mechanisms in disorders of movements (pp. 103–109). London: John Libbey.
Robinson, T. E., & Berridge, K. C. (2000). The psychology and neurobiology of addiction: An incentive-sensitization view. Addiction, 95(Suppl. 2), S91–S117. Rolls, E. T., & Baylis, L. L. (1994). Gustatory, olfactory, and visual convergence within the primate orbitofrontal cortex. Journal of Neuroscience, 14, 5437–5452. Rolls, E. T., Rolls, B. J., Kelly, P. H., Shaw, S. G., Wood, R. J., & Dale, R. (1974). The relative attenuation of self-stimulation, eating and drinking produced by dopamine-receptor blockade. Psychopharmacologia, 38, 219–230. Rosenkranz, J. A., & Grace, A. A. (1999). Modulation of basolateral amygdala neuronal firing and afferent drive by dopamine receptor activation in vivo. Journal of Neuroscience, 19, 11027–11039. Rosenkranz, J. A., & Grace, A. A. (2001). Dopamine attenuates prefrontal cortical suppression of sensory inputs to the basolateral amygdala of rats. Journal of Neuroscience, 21, 4090–4103. Rubin, P., Hemmingsen, R., Holm, S., Moller-Madsen, S., Hertel, C., Povlsen, U. J., et al. (1994). Relationship between brain structure and function in disorders of the schizophrenic spectrum: Single positron emission computerized tomography, computerized tomography and psychopathology of first episodes. Acta Psychiatrica Scandinavica, 90, 281–289.
Salamone, J. D. (1994). The involvement of nucleus accumbens dopamine in appetitive and aversive motivation. Behavioural Brain Research, 61, 117–133.
Snyder, S. H. (1976). The dopamine hypothesis of schizophrenia: Focus on the dopamine receptor. American Journal of Psychiatry, 133, 197–202.
Salamone, J. D., & Correa, M. (2002). Motivational views of reinforcement: Implications for understanding the behavioral functions of nucleus accumbens dopamine. Behavioural Brain Research, 137(1–2), 3–25.
Spencer, H. J. (1976). Antagonism of cortical excitation of striatal neurons by glutamic acid diethyl ester: Evidence for glutamic acid as an excitatory transmitter in the rat striatum. Brain Research, 102, 91–101.
Sapp, E., Ge, P., Aizawa, H., Bird, E., Penney, J., Young, A. B., et al. (1995). Evidence for a preferential loss of enkephalin immunoreactivity in the external globus pallidus in low grade Huntington’s disease using high resolution image analysis. Neuroscience, 64, 397–404.
Stellar, J. R., & Corbett, D. (1989). Regional neuroleptic microinjections indicate a role for nucleus accumbens in lateral hypothalamic self-stimulation reward. Brain Research, 477(1–2), 126–143.
Scatton, B., Javoy-Agid, F., Rouquier, L., Dubois, B., & Agid, Y. (1983). Reduction of cortical dopamine, noradrenaline, serotonin and their metabolites in Parkinson’s disease. Brain Research, 275, 321–328.
Stellar, J. R., Kelley, A. E., & Corbett, D. (1983). Effects of peripheral and central dopamine blockade on lateral hypothalamic self-stimulation: Evidence for both reward and motor deficits. Pharmacology, Biochemistry, and Behavior, 18, 433–442.
Scatton, B., Rouquier, L., Javoy-Agid, F., & Agid, Y. (1982). Dopamine deficiency in the cerebral cortex in Parkinson disease. Neurology, 32, 1039–1040.
Stoof, J. C., & Kebabian, J. W. (1981, November 26). Opposing roles for D-1 and D-2 dopamine receptors in efflux of cyclic AMP from rat neostriatum. Nature, 294, 366–368.
Schlosser, R. G., Nenadic, I., Wagner, G., Gullmar, D., von Consbruch, K., Kohler, S., et al. (2007). White matter abnormalities and brain activation in schizophrenia: A combined DTI and fMRI study. Schizophrenia Research, 89(1–3), 1–11.
Stout, J. C., Weaver, M., Solomon, A. C., Queller, S., Hui, S., Johnson, S. A., et al. (2007). Are cognitive changes progressive in prediagnostic HD? Cognitive and Behavioral Neurology, 20, 212–218.
Schneider, L. H., Davis, J. D., Watson, C. A., & Smith, G. P. (1990). Similar effect of raclopride and reduced sucrose concentration on the microstructure of sucrose sham feeding. European Journal of Pharmacology, 186, 61–70. Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1–27. Schultz, W. (2002). Getting formal with dopamine and reward. Neuron, 36, 241–263. Schultz, W. (2006). Behavioral theories and the neurophysiology of reward. Annual Review of Psychology, 57, 87–115. Schultz, W., Apicella, P., & Ljungberg, T. (1993). Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. Journal of Neuroscience, 13, 900–913. Schultz, W., Tremblay, L., & Hollerman, J. R. (2000). Reward processing in primate orbitofrontal cortex and basal ganglia. Cerebral Cortex, 10, 272–284. Selemon, L. D., & Goldman-Rakic, P. S. (1990). Topographic intermingling of striatonigral and striatopallidal neurons in the rhesus monkey. Journal of Comparative Neurology, 297, 359–376. Sesack, S. R., Deutch, A. Y., Roth, R. H., & Bunney, B. S. (1989). Topographical organization of the efferent projections of the medial prefrontal cortex in the rat: An anterograde tract-tracing study with Phaseolus vulgaris leucoagglutinin. Journal of Comparative Neurology, 290, 213–242. Sesack, S. R., & Pickel, V. M. (1990). In the rat medial nucleus accumbens, hippocampal and catecholaminergic terminals converge on spiny neurons and are in apposition to each other. Brain Research, 527, 266–279.
Swanson, L. W. (1982). The projections of the ventral tegmental area and adjacent regions: A combined fluorescent retrograde tracer and immunofluorescence study in the rat. Brain Research Bulletin, 9(1–6), 321–353. Swerdlow, N. R., Braff, D. L., Geyer, M. A., & Koob, G. F. (1986). Central dopamine hyperactivity in rats mimics abnormal acoustic startle response in schizophrenics. Biological Psychiatry, 21, 23–33. Swerdlow, N. R., Braff, D. L., Masten, V. L., & Geyer, M. A. (1990). Schizophrenic-like sensorimotor gating abnormalities in rats following dopamine infusion into the nucleus accumbens. Psychopharmacology, 101, 414–420. Swerdlow, N. R., Lipska, B. K., Weinberger, D. R., Braff, D. L., Jaskiw, G. E., & Geyer, M. A. (1995). Increased sensitivity to the sensorimotor gating-disruptive effects of apomorphine after lesions of medial prefrontal cortex or ventral hippocampus in adult rats. Psychopharmacology, 122, 27–34. Szabo, J. (1980). Organization of the ascending striatal afferents in monkeys. Journal of Comparative Neurology, 189, 307–321. Timofeev, I., Grenier, F., & Steriade, M. (1998). Spike-wave complexes and fast components of cortically generated seizures: Pt. IV. Paroxysmal fast runs in cortical and thalamic neurons. Journal of Neurophysiology, 80, 1495–1513. Tisch, S., Silberstein, P., Limousin-Dowsey, P., & Jahanshahi, M. (2004). The basal ganglia: Anatomy, physiology, and pharmacology. Psychiatric Clinics of North America, 27, 757–799. Tombaugh, T. N., Szostak, C., & Mills, P. (1983). Failure of pimozide to disrupt the acquisition of light-dark and spatial discrimination problems. Psychopharmacology, 79(2/3), 161–168.
Sherman, S. M., & Guillery, R. W. (1996). Functional organization of thalamocortical relays. Journal of Neurophysiology, 76, 1367–1395.
Tombaugh, T. N., Tombaugh, J., & Anisman, H. (1979). Effects of dopamine receptor blockade on alimentary behaviors: Home cage food consumption, magazine training, operant acquisition, and performance. Psychopharmacology, 66, 219–225.
Simonyan, K., & Jurgens, U. (2002). Cortico-cortical projections of the motorcortical larynx area in the rhesus monkey. Brain Research, 949(1/2), 23–31.
Vertes, R. P. (2006). Interactions among the medial prefrontal cortex, hippocampus and midline thalamus in emotional and cognitive processing in the rat. Neuroscience, 142, 1–20.
Smiley, J. F., & Goldman-Rakic, P. S. (1993). Heterogeneous targets of dopamine synapses in monkey prefrontal cortex demonstrated by serial section electron microscopy: A laminar analysis using the silver-enhanced diaminobenzidine sulfide (SEDS) immunolabeling technique. Cerebral Cortex, 3, 223–238.
Vives, F., & Mogenson, G. J. (1985). Electrophysiological evidence that the mediodorsal nucleus of the thalamus is a relay between the ventral pallidum and the medial prefrontal cortex in the rat. Brain Research, 344, 329–337.
Smith, Y., & Bolam, J. P. (1990). The output neurones and the dopaminergic neurones of the substantia nigra receive a gaba-containing input from the globus pallidus in the rat. Journal of Comparative Neurology, 296, 47–64.
Vonsattel, J. P., & DiFiglia, M. (1998). Huntington disease. Journal of Neuropathology and Experimental Neurology, 57, 369–384. Voorn, P., Vanderschuren, L. J., Groenewegen, H. J., Robbins, T. W., & Pennartz, C. M. (2004). Putting a spin on the dorsal-ventral divide of the striatum. Trends in Neurosciences, 27, 468–474.
Wakai, M., Takahashi, A., & Hashizume, Y. (1993). A histometrical study on the globus pallidus in Huntington’s disease. Journal of the Neurological Sciences, 119, 18–27.
Yang, C. R., Seamans, J. K., & Gorelova, N. (1999). Developing a neuronal model for the pathophysiology of schizophrenia based on the nature of electrophysiological actions of dopamine in the prefrontal cortex. Neuropsychopharmacology, 21, 161–194.
Weiner, D. M., Burstein, E. S., Nash, N., Croston, G. E., Currier, E. A., Vanover, K. E., et al. (2001). 5-hydroxytryptamine2a receptor inverse agonists as antipsychotics. Journal of Pharmacology and Experimental Therapeutics, 299, 268–276.
Yim, C. Y., & Mogenson, G. J. (1988). Neuromodulatory action of dopamine in the nucleus accumbens: An in vivo intracellular study. Neuroscience, 26, 403–415.
Weiskrantz, L. (1956). Behavioral changes associated with the ablation of the amygdaloid complex in monkeys. Journal of Comparative and Physiological Psychology, 49, 381–391.
Yokel, R. A., & Wise, R. A. (1976). Attenuation of intravenous amphetamine reinforcement by central dopamine blockade in rats. Psychopharmacology, 48, 311–318.
Wilson, R. S., & Garron, D. C. (1979). Cognitive and affective aspects of Huntington’s disease. Advances in Neurology, 23, 193–201.
Young, A. M., Ahier, R. G., Upton, R. L., Joseph, M. H., & Gray, J. A. (1998). Increased extracellular dopamine in the nucleus accumbens of the rat during associative learning of neutral stimuli. Neuroscience, 83, 1175–1183.
Wise, R. A. (1978). Neuroleptic attenuation of intracranial self-stimulation: Reward or performance deficits? Life Sciences, 22, 535–542. Wise, R. A. (1982). Neuroleptics and operant behavior: The anhedonia hypothesis. Behavior Brain Science, 39–88. Wise, R. A. (2004). Dopamine, learning and motivation. Nature Reviews: Neuroscience, 5, 483–494. Wise, R. A., & Rompre, P. P. (1989). Brain dopamine and reward. Annual Review of Psychology, 40, 191–225. Wise, R. A., Spindler, J., deWit, H., & Gerberg, G. J. (1978, July 21). Neuroleptic-induced “anhedonia” in rats: Pimozide blocks reward quality of food. Science, 201, 262–264. Wise, R. A., Spindler, J., & Legault, L. (1978). Major attenuation of food reward with performance-sparing doses of pimozide in the rat. Canadian Journal of Psychology, 32, 77–85. Woodward, D. J., Chang, J. Y., Janak, P., Azarov, A., & Anstrom, K. (1999). Mesolimbic neuronal activity across behavioral states. Annals of the New York Academy of Sciences, 877, 91–112. Wright, C. I., Beijer, A. V., & Groenewegen, H. J. (1996). Basal amygdaloid complex afferents to the rat nucleus accumbens are compartmentally organized. Journal of Neuroscience, 16, 1877–1893. Yang, C. R., & Seamans, J. K. (1996). Dopamine d1 receptor actions in layers v-vi rat prefrontal cortex neurons in vitro: Modulation of dendriticsomatic signal integration. Journal of Neuroscience, 16, 1922–1935.
Young, A. M., Joseph, M. H., & Gray, J. A. (1993). Latent inhibition of conditioned dopamine release in rat nucleus accumbens. Neuroscience, 54, 5–9. Zahm, D. S. (1989). The ventral striatopallidal parts of the basal ganglia in the rat: Pt. II. Compartmentation of ventral pallidal efferents. Neuroscience, 30, 33–50. Zahm, D. S. (1999). Functional-anatomical implications of the nucleus accumbens core and shell subterritories. Annals of the New York Academy of Sciences, 877, 113–128. Zahm, D. S., & Brog, J. S. (1992). On the significance of subterritories in the “accumbens” part of the rat ventral striatum. Neuroscience, 50, 751–767. Zahm, D. S., & Heimer, L. (1988). Ventral striatopallidal parts of the basal ganglia in the rat: Pt. I. Neurochemical compartmentation as reflected by the distributions of neurotensin and substance p immunoreactivity. Journal of Comparative Neurology, 272, 516–535. Zahm, D. S., & Heimer, L. (1990). Two transpallidal pathways originating in the rat nucleus accumbens. Journal of Comparative Neurology, 302, 437–446. Zahm, D. S., Williams, E., & Wohltmann, C. (1996). Ventral striatopallidothalamic projection: Pt. IV. Relative involvements of neurochemically distinct subterritories in the ventral pallidum and adjacent parts of the rostroventral forebrain. Journal of Comparative Neurology, 364, 340–362.
Chapter 23
Neural Perspectives on Activation and Arousal
NICHOLAS D. SCHIFF AND DONALD W. PFAFF
Most of the work on biologically or psychologically motivated behaviors in animals and humans has been done using specific forms of motivational conditions such as hunger, thirst, sex, or fear. A different line of thought and experimentation, however, has concentrated on the fundamental ability and, indeed, the need of animals and humans to initiate general activity (reviewed in Cofer & Appley, 1964). Apparently, spontaneous activity is a universal phenomenon. In addition, animals in various states of need often show greater activity, perhaps in the service of finding the corresponding incentive object to reduce drive. Whether there is an independent motivational system demanding activity has been debated without a clear conclusion. However, it is notable that biologists can breed animals to achieve substrains that have high levels of activity versus low. Cofer and Appley recognize that in some cases high levels of activity, for example during food deprivation, are not achieved in vacuo, but instead they are achieved by virtue of increased sensitivity to sensory stimulation.

A closely related question is whether there is a separate need for exploration, in other words, a need for environmental stimulation. Early theorists such as Donald Hebb (1955) worked on the assumption that many exploratory movements by animals would have the effect of increasing one kind of stimulation and decreasing another kind, or, perhaps, simply achieving more stimulation in general. We recognize the feeling of restlessness as a result of boredom, often perceived as unpleasant and even stressful. Thus, the need for stimulation and activity makes intuitive sense.

Elizabeth Duffy, prominent in this field by virtue of her empirical work as well as her theoretical writing (1962), stated that “for performances of many kinds, there appears to be an optimal level of activation at which the performance reaches its greatest excellence” (p. 158), and we note that this optimum point varies according to the task and according to the individual. Duffy (1962) defines the activation of behavior not in terms of cortical arousal or autonomic arousal but instead in terms of the extent of usage of metabolic energy by the individual during behavioral activity or behavioral response (pp. 17–18).

Pfaff (2006) hypothesized that underlying the activation of behavior is a set of CNS mechanisms that support generalized arousal. An animal or human with higher generalized arousal (a) is more responsive to sensory stimuli in all modalities, (b) emits more voluntary motor activity, and (c) is more reactive emotionally. This definition is precise, complete, and yields quantitative, physical measures. In a meta-analysis of five experiments directed toward measurements of mouse behaviors reflecting arousal, principal components analysis indicated that about one-third of the variance was due to a generalized arousal factor (Garey et al., 2003).

Our generalized arousal concept maps onto established neurology of consciousness and its disorders (Plum & Posner, 1982; Posner, Saper, Schiff, & Plum, 2007). Some of the modern approaches to disorders of consciousness are reviewed briefly later in this chapter. Likewise, circadian rhythms of behavioral activity, including but not limited to the sleep/wake cycle, must depend on fluctuations of generalized CNS arousal.

The relation of our arousal concept to stress is more complex. Arousal is a valence-free force that supplies the energy of stress responses. However, there is an asymmetry between these two neurobiological concepts. Whereas all stressful stimuli must be arousing, not all arousing stimuli are stressful (Pfaff, Martin, & Ribeiro, 2007). We note that the slopes of avoidance functions are sometimes higher than the slopes of approach functions, so that greater arousal would yield greater negative affective activation, but this inequality in behavioral result does not gainsay the concept that generalized arousal itself is without valence.

Finally, it is clear that generalized arousal is necessary to provide the strength and persistence of emotional response related to the primitive, physiological side of libido, the aspect of Freud’s libido that has been conserved from the animal into the human brain (Pfaff, 1999).
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
This chapter on arousal and the activation of behavior seems timely because of what we perceive as a sea change in neuroscience. During the twentieth century almost exclusive emphasis was placed on the specificity of response to sensory stimuli and on the regulation of specific, simple reflex responses. Thus, Hubel and Wiesel studied the visual system, while Mountcastle studied cortical responses to somatosensory stimulation. Meanwhile, neurophysiologists such as John Eccles and David P. C. Lloyd charted the circuitry and synaptology underlying specific spinal reflex responses. Now, however, there is a much greater emphasis on changes of CNS state (sleep/wake, mood, motivational states, and so on) that, in fact, tend to be of much greater medical importance. First, we refer briefly to a literature that describes some of the mechanisms underlying generalized CNS arousal, a literature that depends heavily on animal research. Second, we cover modern neurological experience with efforts to bring patients who lack normal arousal and self-awareness to a point where they can initiate purposeful behaviors. Finally, we identify some of the key issues for new research.

NEUROANATOMICAL, PHYSIOLOGICAL, GENOMIC MECHANISMS

Decades of work have gone into the elucidation of the neuroanatomical pathways and neurophysiological mechanisms related to arousal and stress. They have been well reviewed (e.g., Pfaff, 2006) and will be treated here only briefly. Mechanisms for generalized arousal of the CNS are fairly well known at the neuroanatomical level. Their most important features emphasize multiplicity and redundancy of ascending arousal pathways in such a way as to prevent failure. Five major neurochemically distinct systems all work together to increase arousal. They use norepinephrine, dopamine, serotonin, acetylcholine, and histamine as transmitters. They all begin in the brain stem and converge in the thalamus or in the basal forebrain. They overlap and cooperate.
Their very multiplicity ensures against failure. Four sensory systems feed ascending arousal pathways in a straightforward fashion. These clearly show how vestibular, somatosensory, and auditory stimuli, as well as taste stimuli on the tongue, could arouse an animal or human being. Pain mechanisms further dramatize how a vastly amplified somatosensory signal from the skin or the viscera can wake up and alert an individual. Moreover, pain pathways and sexual cutaneous signals overlap and share the ability to cause states of high arousal. In contrast, electrical impulses triggered by odor stimuli enter the brain through tracts in the basal forebrain, and project to a primary receiving zone which itself is connected with high degrees of arousal,
during both sex and fear: the amygdala. Visual stimuli impact CNS arousal pathways both through the outer layers of the superior colliculus and through the reticular and medial cell groups of the thalamus. An important point, to reiterate, is that these various arousal-related transmitter systems and sensory signals converge. Whether in the basal forebrain or in the medial thalamus, a strong signal for cortical arousal is generated and must be distributed broadly in the cerebral cortex to command the attention of a wide variety of higher-level perceptual processors and motor control cell groups. In terms of the general principles illustrated by CNS arousal pathways, it is eminently clear that arousal mechanisms are bilateral (B). Unilateral damage in the animal brain or human brain has little effect on generalized arousal or consciousness. Second, they are bidirectional (B). In addition to the classical aminergic ascending pathways just mentioned, there are crucial descending pathways (e.g., vasopressin, histamine, and orexin). Third, these pathways have been conserved across a variety of species, including humans; they are universal (U). Finally, these pathways always potentiate an animal’s or human’s behavioral responsivity (response potentiation, RP). One of us (DP) has informally described this formulation as BBURP theory, schematically illustrated in Figure 23.1. These responses may be active approach responses, but in the case of fearful or stressful inputs they are avoidance responses.
[Figure 23.1 schematic; labeled structures: OLF, POA, PVN, T-C, BF]
Figure 23.1 Bilaterally symmetric, bidirectional (descending as well as ascending) pathways regulating CNS arousal. Note: We propose that these systems are universal among mammalian brains and that they are always necessary for response potentiation, whether approach responses or avoidance responses. More phylogenetically ancient systems work through the basal forebrain (BF), whereas thalamo-cortical (T-C) systems evolved later. Descending systems sketched here as examples include those consequent to olfactory (OLF) stimulation, and those from the preoptic area (POA) and the paraventricular nucleus (PVN) of the hypothalamus.
A current hypothesis states that the most elementary, primitive, and universal set of cells that initiate the activation of behavior have their cell bodies in the medullary reticular formation and influence the generalized arousal of the CNS through bifurcating axons that ascend the brain stem (for cortical arousal) and descend to the cord (for autonomic arousal). Without adequate levels of this elementary, primitive arousal, it is impossible to be alert. Without adequate alertness, it is impossible to pay attention. Older concepts of autonomic arousal pictured two extreme states: total dominance by sympathetic autonomic mechanisms versus total dominance by parasympathetic mechanisms. The two autonomic systems were thought always to oppose each other. We now understand that the situation is much more complicated than that. In several cases, for example, erection and ejaculation by the male, and control over the anal sphincter in both sexes, temporally patterned coordination between the two systems achieves the desired physiological result. Berntson, Cacioppo, and Quigley (1991) reviewed the relevant literature and concluded that various autonomic states are best described in a two-dimensional space (rather than lying along a single continuum). This concept allowed them to account for a larger percentage of the variance in psychophysiological studies than had previously been achieved. Correspondingly, it suggests a greater sophistication and complexity of medullary neuronal integration than had been supposed because nerve cell groups in, for example, the dorsomotor nucleus of the vagus and rostral ventrolateral cells of the medulla would have to be well coordinated. We note that in several experimental situations individual mechanisms and measures of arousal are not highly correlated with each other. On the one hand, this fact emphasizes that global CNS arousal mechanisms are not simply monolithic.
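The contrast drawn above between a single autonomic continuum and the two-dimensional description attributed to Berntson, Cacioppo, and Quigley (1991) can be illustrated with a minimal sketch. The class name, the "net balance" formula, and the activation values are illustrative assumptions and are not taken from that review; the only point is that two clearly different states can collapse onto the same position of a one-dimensional scale.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutonomicState:
    """Hypothetical autonomic state with two independent axes (arbitrary units)."""
    sympathetic: float
    parasympathetic: float

    def net_balance(self) -> float:
        # Assumed stand-in for the older single-continuum view:
        # positive = sympathetic dominance, negative = parasympathetic dominance.
        return self.sympathetic - self.parasympathetic


# Two made-up states: low activity on both axes versus high activity on both.
quiet = AutonomicState(sympathetic=0.25, parasympathetic=0.25)
coactivated = AutonomicState(sympathetic=0.75, parasympathetic=0.75)

if __name__ == "__main__":
    # Both states land on the same point of the one-dimensional scale...
    print(quiet.net_balance(), coactivated.net_balance())
    # ...although they differ in the two-dimensional description.
    print(quiet == coactivated)
```

In this toy case the one-dimensional summary cannot separate the two states, which is the sense in which a second dimension can account for variance that a single continuum misses.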
Different ascending monoaminergic systems, for example, contribute differently to arousal states. Noradrenergic pathways certainly support increased attention; dopaminergic pathways support directed motor acts toward salient stimuli; and serotonergic systems are intimately related to emotional regulation. On the other hand, we have elsewhere (Pfaff, 2006) envisioned this lack of correlation in the following way: By analogy to the movements of different members of an athletic team, different components of CNS arousal mechanisms may be coordinated in an adaptive fashion even when they are not regulated in a manner identical to each other. In neurophysiological terms, cells involved in generalized arousal of the CNS would be expected to respond to a variety of stimuli in several sensory modalities. During electrophysiological recordings from reticular and raphe neurons in the medulla, such neurons have been found (Hubscher & Johnson, 2002; Leung & Mason, 1998, 1999;
Martin, Pavlides, & Pfaff, unpublished data). Moving anterior in the brain stem, certain pontomedullary reticular neurons (Peterson, Anderson, & Filion, 1974) as well as the omnipause neurons (Phillips, Ling, & Fuchs, 1999) recorded in the pons, also fit the requirement that cells be multimodal in their range of sensitivities and have firing rates correlated with arousal and visual attention. In the midbrain, Horvitz, Stewart, and Jacobs (1997), recording from dopaminergic neurons, revealed responses that were correlated with the activation of behavior, that is, the initiation of motor responses directed toward salient stimuli. These and many other reports supply the neurophysiological basis of generalized arousal responses.

Functional Genomics

Data are accumulating rapidly with respect to neurochemical and genomic mechanisms for both arousal and stress. A large number of genes, more than 120, participate in regulating generalized CNS arousal. The large number is due to the inclusion of genes encoding synthetic enzymes, receptors (for serotonin, alone, there are 14), transporters, and catabolic enzymes for both the relevant neurotransmitters and neuropeptides; both those increasing and those decreasing arousal (Pfaff, 2006). As might be expected, sex hormones are involved in CNS arousal. Disruption of the gene encoding estrogen receptor alpha severely reduced arousal measures in female mice, compared to their wildtype littermate controls. Disruption of the gene for estrogen receptor beta, a likely gene duplication product, had no significant effect (Garey et al., 2003). There are additional implications of having so many genes controlling arousal mechanisms. The heterogeneity among the genes involved presumably provides for great flexibility of response. The very multiplicity yields the possibility of large numbers of meaningful patterns of gene expression.
In a neuroendocrine context, we have shown that one never could understand gene/behavior relations on a one-by-one basis. Moving beyond Beadle and Tatum’s concept from their work with the fungus Neurospora (their classical one gene/one enzyme concept), we reached the conclusion that different patterns of gene expression yield different patterns of sociosexual behaviors (Pfaff et al., 2002).
EVOLUTION OF CNS AROUSAL PATHWAYS

Certain aspects of the literature on ascending CNS arousal pathways have highlighted distinctions among them. In particular, some workers would protest that pathways activating the basal forebrain cholinergic neurons are the only
important ones, while others would argue that thalamocortical systems are the most important. We propose an evolutionary approach to these questions, emphasizing two main principles. First, as brain stem reticular mechanisms evolved, both in their internal structure and in their connectivities with the forebrain, new layers of capacities for aroused, alert, and even conscious behaviors likewise emerged. More primitive capacities governing autonomic arousal and the most rudimentary form of attention to simple stimuli are required for later developed, higher capacities, but not vice versa. This neuroanatomical and functional idea can also be expressed in the form of an equation:
Figure 23.2 Human brain from left side. Labeled structures: cerebral cortex, thalamus, cerebellum, basal forebrain, hypothalamus, spinal cord.

$$A = F_g(A_g) \times \left[ F_1(A_{s_1}) + F_2(A_{s_2}) + \cdots + F_n(A_{s_n}) \right] \tag{23.1}$$
where the overall state of CNS arousal, A, is a function of the most primitive arousal force, generalized arousal (Ag), multiplied by the effects of several specific states of arousal (As). This equation represents a mathematical statement of a hypothesis. We must determine experimentally how different sources of arousal interact in order to produce an overall CNS arousal state. However, one thing is clear: In this equation, if generalized arousal goes to zero, no behavioral response will take place. Second, there is no need to set basal forebrain and thalamocortical systems against one another in a false competition for the designation of “most important.” Instead, we propose that the older ascending CNS arousal pathway uses the basal forebrain route. For example, Jones (2003) carefully charted axons ascending from the lower brain stem reticular formation that followed a “low road” into the medial forebrain bundle, continuing into the basal forebrain “where fibers were visible in the lateral preoptic area, substantia innominata, and the nuclei of the horizontal and vertical limbs of the diagonal band.” Clearly, such a system is in place to affect the activity of basal forebrain cholinergic neurons, whose influences on cerebral cortical activity are powerful and subtle (Xiang, Huguenard, & Prince, 1998). In addition, ascending aminergic systems have projections through the medial forebrain bundle, some axons of which reach the cerebral cortex. Of course, the more recently evolved ascending CNS arousal pathway would involve the thalamus. For example, in Jones and Yang’s (1985) neuroanatomical work, axons from neurons in the hindbrain reticular formation took a “high road.” They “passed into the internal medullary lamina of the thalamus and left collaterals in the intralaminar nuclei (parafascicular, paracentral, centrolateral, and centromedial), and midline nuclei (rhomboid and reuniens).” In mice with reduced
Note: Looking at a sketch of the human brain from the left side, one can see that the more ancient sets of arousal-facilitating axons, which travel the medial forebrain bundle of the hypothalamus to innervate the basal forebrain, have been preserved. Especially striking, however, is the elaboration of projections from the medial and intralaminar thalamus. Stimulation of these thalamo-cortical systems can increase arousal in mice and in human patients.
arousal responses consequent to anoxia, electrical stimulation of the midline thalamus can activate greater responses (Arietta-Cruz and Pfaff, unpublished data). The distinction between the more ancient (“low road”) and the more recently evolved (“high road”) pathways is easily schematized on a drawing of the human brain (Figure 23.2).
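The hypothesis stated in Equation 23.1 can be made concrete with a small numerical sketch. The chapter specifies only that a generalized-arousal term multiplies the effects of the specific arousal states, and that the functional forms must be determined experimentally. The sketch therefore makes two labeled assumptions: the specific terms are combined by addition, and every transfer function F is the identity (so that F_g(0) = 0). The function names and input values are hypothetical.

```python
from typing import Callable, Optional, Sequence

Transfer = Callable[[float], float]


def identity(x: float) -> float:
    """Placeholder transfer function (assumption: the true F is unknown)."""
    return x


def overall_arousal(
    a_g: float,
    a_specific: Sequence[float],
    f_g: Transfer = identity,
    f_specific: Optional[Sequence[Transfer]] = None,
) -> float:
    """Return F_g(A_g) times the combined specific-arousal terms.

    Combining the specific terms by summation is an assumption made
    for illustration only.
    """
    if f_specific is None:
        f_specific = [identity] * len(a_specific)
    if len(f_specific) != len(a_specific):
        raise ValueError("need one transfer function per specific arousal state")
    specific_total = sum(f(a) for f, a in zip(f_specific, a_specific))
    return f_g(a_g) * specific_total


if __name__ == "__main__":
    # Hypothetical, unitless inputs chosen only for illustration.
    print(overall_arousal(2.0, [0.5, 1.5]))
    # With generalized arousal at zero, the product is zero regardless
    # of how strong the specific arousal states are.
    print(overall_arousal(0.0, [0.5, 1.5]))
```

Under these assumptions the multiplicative structure reproduces the one property the text treats as certain: when generalized arousal is zero, overall arousal is zero no matter how large the specific terms are. A purely additive model would not have this property.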
CLINICAL STUDY: FOREBRAIN AROUSAL REGULATION MECHANISMS AND NEUROLOGICAL DISORDERS OF CONSCIOUSNESS

Arousal Regulation Mechanisms

In the human brain, arousal regulation appears to be strongly dependent on the integrity of prefrontal and frontal lobe systems that have descending projections to brain stem and basal forebrain neurons identified as the primary arousal systems. Even mild brain trauma to the frontal lobes can be associated with decreased vigilance and fatigue, while cognitive control of behaviors that tax attentional and working memory resources requires engagement of distributed neuronal systems within the frontal lobe (Knight & Stuss, 2003). The key intermediary structure mediating effects across the cortex and basal ganglia of adjustments in arousal level is the central thalamus (Steriade & Glenn, 1982; reviewed in Schiff & Purpura, 2002). Neurons within the central thalamus (thalamic intralaminar nuclei, and related paralaminar
association nuclei including median dorsalis, ventral anterior, and ventral lateral) are uniquely specialized in terms of their wide point-to-point connections across cortical areas, their innervation of the upper (supragranular) layers of the cerebral cortex contacting dendritic elements of both input and output neurons within the cortical column, and their strong activating inputs to striatal neurons of the basal ganglia (Lacey, Bolam, & Magill, 2007; Purpura & Schiff, 1997). As illustrated in Figure 23.2, in the human brain the brain stem and basal forebrain arousal systems project widely to the cerebral cortex and most heavily to these central thalamic neurons, which play an essential role in arousal and activation of the forebrain per se. The intrinsic organization of these arousal regulation pathways in the human brain is also suggested by the effect of selective injuries that may produce global disorders of consciousness (Schiff & Plum, 2000). Subcortical injuries associated with global disorders of consciousness prominently involve the central thalamus (the intralaminar nuclei and paralaminar regions of the thalamus), the caudate nuclei of the striatum, basal forebrain, and the mesencephalic reticular formation and upper pontine brain stem reticular regions. The pioneering work of Morison and Dempsey (1942) and Moruzzi and Magoun (1949) assigned the mesencephalic reticular formation and the thalamic intralaminar nuclei the role of mediating arousal and setting the stage for sensory processing in higher integrative brain functions. Moruzzi and Magoun’s experiments showed that electrical stimulation of these mesodiencephalic structures produced electroencephalographic (EEG) desynchronization and behavioral arousal in anesthetized animals.
This classical view of forebrain arousal has given way to an understanding that the overall shift of spectral content reflected in the EEG and the increased behavioral activity level associated with a higher level of arousal are dependent on the output of cholinergic, serotoninergic, adrenergic, and histaminergic nuclei located predominantly in the brain stem, basal forebrain, and posterior hypothalamus (Marrocco, Witte, & Davidson, 1994; McCormick, 1994; Steriade, 1997). Forebrain arousal is now viewed in terms of global modulations of the thalamocortical system that define specific functional states (McCormick, 1994; Steriade & Llinas, 1988). Although several studies have sought to determine how necessary or sufficient for arousal different neuronal groups may be, no study has provided compelling evidence that any single group is indispensable (see Steriade, 1997). Berntson, Shafi, and Sarter (2002) used an immunotoxin to lesion corticopetal cholinergic neurons while sparing septo-hippocampal neurons and found that, compared to control animals, there was a significant reduction in the spectral power of high-frequency EEG activity that is
typical of the aroused alert state. Rather than affecting the portion of the day occupied by sleeping or waking behaviors, these lesions reduced high-frequency activity across all stages, sleeping and waking. Working with both alert and anaesthetized cats, Skinner and Yingling (1977) pioneered a series of experiments that examined the integrative physiology of the mesencephalic reticular formation, the reticular thalamic nucleus, and the medial thalamic-mesial frontal cortical systems (including the thalamic intralaminar nuclei and related thalamic association nuclei). These investigators proposed that gating of attention was achieved by medial thalamo-frontal cortical and mesencephalic reticular formation modulation of strong inhibition of thalamic relay nuclei by the GABAergic neurons of the reticular thalamic nucleus. Human functional neuroimaging studies demonstrate coactivation of the upper midbrain and thalamic regions, consistent with this anatomical model, during attentional processing (Kinomura, Larsson, Gulyas, & Roland, 1996) and transitions to wakefulness (Balkin et al., 2002). In these studies, increased regional blood flow in the pontine/mesencephalic reticular formation and central thalamus is correlated with increased activations of prefrontal, frontal, parietal, and primary sensory cortices during both periods of increased vigilance and awakening.

Neurological Disorders of Consciousness

Perhaps unsurprisingly in light of the previous discussion, only relatively large bilateral injuries to the dorsal regions of the upper pons, midbrain, or central regions of the thalamus can produce unconsciousness in humans on the basis of a small structural brain injury. Coma (an unresponsive brain state with no cyclical variation in arousal as judged behaviorally and by EEG content) arising from focal injuries is usually quite brief, lasting only hours or a few days, reflecting the multiplicity and pluripotentiality of the arousal pathways.
Brain stem injuries producing coma are concentrated bilaterally in the rostral pons and dorsal midbrain, regions containing the cholinergic neurons and other monoaminergic afferents from the brain stem arousal systems projecting to the basal forebrain, central thalamus, and wide territories of the cerebral cortex. Bilateral focal injury to the central thalamus alone can induce coma when the injuries involve both sides of the brain, often including damage to the upper midbrain. Most of the time, however, coma is the result of widespread damage to the brain from trauma or anoxic/hypoxic/ischemic injury (Posner et al., 2007). What is not well known is that the same central thalamic structures index the recovery from such multifocal brain injuries. Recovery following coma may or may not be complete, with some
patients remaining in vegetative state, minimally conscious state, and other related clinical syndromes. These different conditions are often confused with coma, but each is distinct in clinical presentation and natural history following coma and may reflect the functional integrity of both the central thalamus and related frontal systems. Patterns of structural injuries producing vegetative state (a condition with cyclic eye opening and closure but no behavioral responsiveness) overlap with those producing coma. Autopsy studies of patients remaining permanently unconscious show overwhelming loss of thalamic neurons, with extensive and specific loss of central thalamic neurons, particularly the neurons of the thalamic intralaminar nuclei and closely adjacent components of thalamic association nuclei (Adams, Graham, & Jennett, 2000). Minimally conscious state is differentiated from vegetative state by unequivocal but inconsistent evidence of awareness of self or the environment that may range from isolated tracking of objects in the visual field to high-level responses such as intermittent verbalization and communication (Giacino & Whyte, 2005). Pathological studies reveal many different underlying structural pathologies with a consistent feature of loss of central thalamic neurons, particularly in the rostral intralaminar nuclei (Maxwell, MacKinnon, Smith, McIntosh, & Graham, 2006). Patients may remain in a minimally conscious state yet retain recruitable large-scale cerebral networks that appear to be underactivated (Schiff et al., 2005). In a single-subject study of a patient who remained in a minimally conscious state for 6 years, electrical brain stimulation of the central thalamus restored a variety of integrative behaviors including spoken language, attentive behavior, motor control, and eating (Schiff et al., 2007).
Localization of electrical activity in the EEG suggested modulation of midline frontal systems during stimulation, consistent with reactivation of the frontal-central thalamic arousal regulation circuit.
SUMMARY Among a large number of outstanding questions about CNS arousal and the activation of behavior, we list just a few here as a brief summary of where we are and where we need to be. First, is our evolutionary theory about more ancient basal forebrain pathways and more recently developed thalamocortical pathways really correct, or are some medial thalamic mechanisms equally primitive? In any case, how do these two types of pathways interact? Do their effects on cortical arousal add? Multiply? Or do they interfere with each other?
Second, large numbers of genes are seen as participating in the regulation of arousal (Pfaff, 2006). How do they interact with environmental determinants of purposeful, motivated behaviors? Do their separate identities portend adjunct neurochemical therapies that could accompany and enhance the effects of deep brain stimulation as demonstrated in the Schiff et al. (2007) paper? Third, regarding such deep brain stimulation, can we improve on standard stimulation parameters (e.g., fixed numbers of pulses per second) by discovering and exploiting the nonlinear dynamics of CNS arousal systems? For example, it has been proposed that arousal-related neurons in the brain stem are susceptible to chaotic dynamics, but that they operate close to a phase transition that they cross when coming under the control of well-organized motor control mechanisms (Pfaff & Banavar, 2007). Fourth, as data are collected in coming years, how will Equation 23.1 have to be modified? The mathematical structure of CNS arousal is ready for investigation.
Clinical Treatments
Among the many problems of diagnosing and treating patients at various levels of vegetative states, three could be highlighted here in a logical sequence. First, in minimally conscious state patients who will recover, are there neurophysiological signatures that can be detected and that are early harbingers of recovery? Second, why do some patients have those neurophysiological signatures and not others? Third, can knowledge of such signatures be used in clinical trials of deep brain stimulation with the purpose of facilitating recovery?
REFERENCES Adams, J. H., Graham, D. I., & Jennett, B. (2000). The neuropathology of the vegetative state after acute insult. Brain, 123, 1327–1338. Balkin, T. J., Braun, A. R., Wesensten, N. J., Jeffries, K., Varga, M., Baldwin, P., et al. (2002). The process of awakening: A PET study of regional brain activity patterns mediating the re-establishment of alertness and consciousness. Brain, 125(Pt. 10), 2308–2319. Berntson, G. G., Cacioppo, J., & Quigley, K. (1991). Autonomic determinism: The modes of autonomic control, the doctrine of autonomic space and the laws of autonomic constraint. Psychological Review, 98, 459–487. Berntson, G. G., Shafi, R., & Sarter, M. (2002). Specific contributions of the basal forebrain corticopetal cholinergic system to electroencephalographic activity and sleep/waking behaviour. European Journal of Neuroscience, 16, 2453–2461. Cofer, C., & Appley, M. (1964). Motivation: Theory and research. New York: Wiley. Duffy, E. (1962). Activation and behavior. New York: Wiley. Garey, J., Goodwillie, A., Frohlich, J., Morgan, M., Gustafsson, J.-A., Smithies, O., et al. (2003). Genetic contributions to generalized
arousal of brain and behavior. Proceedings of the National Academy of Sciences, USA, 100, 11019–11022.
Giacino, J. T., & Whyte, J. (2005). The vegetative state and minimally conscious state: Current knowledge and remaining questions. Journal of Head Trauma Rehabilitation, 20(1), 30–50. Hebb, D. (1955). Drives and the CNS (conceptual nervous system). Psychological Review, 62, 243–254. Horvitz, J., Stewart, T., & Jacobs, B. (1997). Burst activity of ventral tegmental dopamine neurons is elicited by sensory stimuli in the awake cat. Brain Research, 759, 251–258. Hubscher, C., & Johnson, R. (2002). Inputs from spinal and vagal sources converge on individual medullary reticular neurons. Society for Neuroscience Abstracts, 28. Jones, B. E. (2003). Arousal systems. Frontiers in Bioscience, 8, 438–451. Kinomura, S., Larssen, J., Gulyas, B., & Roland, P. E. (1996, January 26). Activation by attention of the human reticular formation and thalamic intralaminar nuclei. Science, 271, 512–515. Knight, R., & Stuss, D. (2003). Principle of frontal lobe function. Oxford: Oxford University Press. Lacey, C. J., Bolam, J. P., & Magill, P. J. (2007). Novel and distinct operational principles of intralaminar thalamic neurons and their striatal projections. Journal of Neuroscience, 27, 4374–4384. Leung, C., & Mason, P. (1998). Physiological survey of medullary raphe and magnocellular reticular neurons in the anesthetized rat. Journal of Neurophysiology, 80, 1630–1646. Leung, C., & Mason, P. (1999). Physiological properties of raphe magnus neurons during sleep and waking. Journal of Neurophysiology, 81, 584–595. Marrocco, R. T., Witte, E., & Davidson, M. C. (1994). Arousal systems. Current Opinion in Neurobiology, 4, 166–170. Maxwell, W. L., MacKinnon, M. A., Smith, D. H., McIntosh, T. K., & Graham, D. I. (2006). Thalamic nuclei after human blunt head injury. Journal of Neuropathology and Experimental Neurology, 65, 478–488. McCormick, D. A. (1994). Neurotransmitter actions in the thalamus and cerebral cortex and their role in neuromodulation of thalamocortical activity. Progress in Neurobiology, 39, 337–388. Morison, R. S., & Dempsey, E. W. (1942). A study of thalamo-cortical relationships. American Journal of Physiology, 135, 281–292. Moruzzi, G., & Magoun, H. W. (1949). Brainstem reticular formation and activation of the EEG. Electroencephalography and Clinical Neurophysiology, 1, 455–473. Peterson, B., Anderson, M., & Filion, M. (1974). Responses of pontomedullary reticular neurons to cortical, tectal and cutaneous stimuli. Experimental Brain Research, 21, 19–44. Pfaff, D. (1999). Drive. Cambridge, MA: MIT Press. Pfaff, D. (2006). Brain arousal and information theory. Cambridge, MA: Harvard University Press. Pfaff, D., & Banavar, J. (2007). A theoretical framework for CNS arousal. BioEssays, 29, 803–810. Pfaff, D., Martin, E., & Ribeiro, A. (2007). Relations between mechanisms of CNS arousal and mechanisms of stress. Stress, 10, 316–325.
Pfaff, D., Ogawa, S., Kia, K., Vasudevan, N., Krebs, C., Frohlich, J., et al. (2002). Genetic mechanisms in controls over female reproductive behaviors. In D. W. Pfaff, A. P. Arnold, A. M. Etgen, S. E. Fahrbach, & R. T. Rubin (Eds.), Hormones, brain and behavior (pp. 441–510). San Diego, CA: Academic Press/Elsevier. Phillips, J., Ling, L., & Fuchs, A. (1999). Action of the brainstem saccade generator during horizontal gaze shifts: Pt. I. Discharge patterns of omnidirectional pause neurons. Journal of Neurophysiology, 81, 1284–1295. Plum, F., & Posner, J. B. (1982). The diagnosis of stupor and coma (3rd ed.). Philadelphia: Davis. Posner, J., Saper, C., Schiff, N., & Plum, F. (2007). Plum and Posner’s diagnosis of stupor and coma (4th ed.). Oxford: Oxford University Press. Purpura, K. P., & Schiff, N. D. (1997). The thalamic intralaminar nuclei: Role in visual awareness. Neuroscientist, 3, 8–14. Schiff, N. D., Giacino, J. T., Kalmar, K., Victor, J. D., Baker, K., Gerber, M., et al. (2007, August 2). Behavioral improvements with thalamic stimulation after severe traumatic brain injury. Nature, 448, 600–603. Schiff, N. D., & Plum, F. (2000). The role of arousal and ‘gating’ systems in the neurology of impaired consciousness. Journal of Clinical Neurophysiology, 17, 438–452. Schiff, N. D., & Purpura, K. P. (2002). Towards a neurophysiological basis for cognitive neuromodulation. Thalamus and Related Systems, 2(1), 55–69. Schiff, N. D., Rodriguez-Moreno, D., Kamal, A., Kim, K. H., Giacino, J., Plum, F., et al. (2005). FMRI reveals large-scale network activation in minimally conscious patients. Neurology, 64, 514–523. Skinner, J. E., & Yingling, C. D. (1977). Central gating mechanisms that regulate event-related potentials and behavior. In J. E. Desmedt (Ed.), Progress in clinical neurophysiology: Attention, voluntary contraction and event-related cerebral potentials (Vol. 1, pp. 30–69). Basel: Karger. Steriade, M. (1997). Thalamic substrates of disturbances in states of vigilance and consciousness in humans. In M. Steriade, E. Jones, & D. McCormick (Eds.), Thalamus (pp. 721–742). Amsterdam: Elsevier. Steriade, M., & Glenn, L. L. (1982). Neocortical and caudate projections of intralaminar thalamic neurons and their synaptic excitation from midbrain reticular core. Journal of Neurophysiology, 48, 352–371. Steriade, M., & Llinas, R. R. (1988). The functional states of the thalamus and the associated neuronal interplay. Physiological Reviews, 68, 649–742. Xiang, Z., Huguenard, R., & Prince, D. (1998, August 14). Cholinergic switching within neocortical inhibitory networks. Science, 281, 985–989.
Chapter 24
Sleep and Waking Across the Life Span
RETO HUBER AND GIULIO TONONI
A PRIMER ON SLEEP PHYSIOLOGY
Sleep can be defined behaviorally as a state of reduced responsiveness to the environment that is readily reversible. By this definition, sleep appears to be a rather universal phenomenon, being present in most if not all species investigated, from Drosophila melanogaster to humans (Borbély & Achermann, 2000; Shaw, Cirelli, Greenspan, & Tononi, 2000; Tobler, 2000). The introduction of continuous recordings of brain electrical activity (electroencephalogram, EEG) during sleep and wakefulness (Berger, 1929) has greatly enriched the study of sleep. Thus, rapid eye movement (REM) sleep was recognized as a specific state different from non-REM (NREM) sleep (Aserinsky & Kleitman, 1953). These two kinds of sleep are present in both mammals and birds (Dement & Kleitman, 1957). In humans, sleep is studied for clinical and research purposes by combining behavioral observations with electrophysiological recordings. The EEG records synchronous synaptic activity from millions of neurons underlying electrodes applied to the scalp (Figure 24.1). The electrooculogram (EOG), which is recorded from electrodes attached to the skin near the eyes, detects small electrical fields generated by eye movements. The electromyogram (EMG), which is generally recorded from electrodes attached to the chin, is used to detect sustained (tonic) and episodic (phasic) changes in muscle activity that correlate with changes in behavioral state. In the course of the night, the EEG, EOG, and EMG patterns undergo coordinated changes that are used to distinguish among different sleep stages. A brief description of the major vigilance states follows:
Figure 24.1 Vigilance states. Note. 12-s electroencephalogram (EEG), electrooculogram (EOG), and electromyogram (EMG) traces are plotted for the three vigilance states: wakefulness, NREM sleep, and REM sleep (panels: Wakefulness, NREM Sleep Stage 2, NREM Sleep Stage 3, REM Sleep; time axis in seconds, 0 to 12). NREM sleep is subdivided into four substates, of which stages 2 and 3 are illustrated. Stage 3 is also called slow-wave sleep. The typical EEG features of human stage 2 sleep, spindles and K-complexes, are highlighted by arrows. EEG calibration marks correspond to 100 µV²/0.25 Hz.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
• Wakefulness: Wakefulness is reflected in the EEG by low-voltage, fast-frequency activities, also called a desynchronized or activated EEG. When eyes close in preparation for sleep, EEG alpha activity (8 to 13 Hz) becomes prominent, particularly in occipital regions. Such alpha activity is thought to correspond to an “idling” rhythm in visual areas. The waking EOG reveals frequent voluntary eye movements and eyeblinks. The EMG reveals tonic muscle activity with additional phasic activity related to voluntary movements. In wakefulness, most cortical neurons are steadily depolarized close to their firing threshold and
thus are ready to respond to stimuli. The steady depolarization is enabled by the release of acetylcholine and other neuromodulators, which close leakage potassium channels on the membrane of cortical neurons and force positively charged potassium ions to stay inside the cell. Neurons show irregular spiking patterns (Steriade, McCormick, & Sejnowski, 1993). The activity of other neuromodulatory systems, such as the noradrenergic system, enables the occurrence of plastic changes (Cirelli & Tononi, 2000, 2004).
• NREM Sleep: Falling asleep is a gradual phenomenon of progressive disconnection from the environment: we stop responding to stimuli and, to the extent that we remain conscious, our experiences become largely independent of the current environment. This disconnection appears to be important because we make several behavioral adjustments to bring it about: we seek a quiet environment, find a comfortable position, and close our eyes. However, people (especially small children) can fall asleep in noisy environments and in uncomfortable positions; in the laboratory people have even slept with their eyes taped open. Everything else being equal, the threshold for responding to peripheral stimuli gradually increases with the succession of NREM sleep stages and remains high during REM sleep (Rechtschaffen, Hauri, & Zeitlin, 1966; Williams, Hammack, Daly, Dement, & Lubin, 1964). Sleep is usually entered through a transitional state, stage 1 (N1), characterized by loss of alpha activity and the appearance of a low-voltage mixed-frequency EEG pattern with prominent theta activity (3 to 7 Hz). Eye movements become slow and rolling, and muscle tone relaxes. Although there is decreased awareness of sensory stimuli, a subject in N1 may deny having been asleep. Motor activity may persist for a number of seconds during N1. Occasionally, individuals experience sudden muscle contractions (hypnic jerks), sometimes accompanied by a sense of falling and dreamlike imagery.
Individuals deprived of sleep often have “microsleep” episodes that consist of brief (5 to 10 seconds) bouts of stage 1 sleep; these episodes can have serious consequences in situations that demand constant attention, such as driving a car. After a few minutes in N1, people usually progress to stage 2 (N2), followed, especially at the beginning of the night, by a period of stage 3 (N3). N2 qualifies fully as sleep because people are partially disconnected from the environment, meaning that they do not respond to the events around them; their arousal threshold is increased. If stimuli are strong enough to wake them, people in N2 will confirm that they were asleep. During N2, the EEG shows prominent sleep spindles, brief sequences of waves at around 12 to 15 Hz.
During N3, the process of awakening is drawn out, and subjects often remain confused for some time. This change in arousal threshold is accompanied by a dramatic change in the EEG, which shows high-voltage, slow-frequency waves at around 1 to 2 Hz, which is why this stage is also known as slow-wave sleep. During NREM sleep, the transition from the low-voltage, fast-activity EEG observed during wakefulness to the characteristic EEG of NREM sleep is due to the occurrence of brief periods of hyperpolarization, also called down states, in thalamocortical and cortical neurons. Down states arise when activating input from ascending cholinergic and other neuromodulatory pathways is reduced (for reviews, see Llinas & Steriade, 2006; McCormick & Bal, 1997; Steriade et al., 1993), which leads primarily to an increase in leakage potassium conductances (McCormick & Pape, 1990). The resulting slow oscillation is found in virtually every cortical neuron and is synchronized across much of the cortical mantle by cortico-cortical connections, which is why the EEG records high-voltage, low-frequency waves. Human EEG recordings using 256 channels have revealed that the slow oscillation behaves as a traveling wave that sweeps across a large portion of the cerebral cortex (Massimini, Huber, Ferrarelli, Hill, & Tononi, 2004). Sleep spindles occur during the depolarized phase of the slow oscillation and are generated in thalamic circuits involving the reticular thalamic nucleus as a consequence of cortical firing (Steriade et al., 1993). Imaging studies show that brain metabolism and blood flow are diffusely reduced during NREM sleep as compared to wakefulness (Braun et al., 1997), probably due to the repeated occurrence of down states characterized by synaptic silence.
• REM Sleep: The eyes move little during NREM sleep, whereas REM sleep is characterized by bursts of typical rapid eye movements (Aserinsky & Kleitman, 1953).
During REM sleep, the EEG shows low-voltage, fast activity similar to wakefulness, which is why it is also referred to as paradoxical sleep (Jouvet, 1962, 1965, 1998). Also during this phase, muscles are paralyzed except for occasional brief jerks. REM sleep is almost invariably accompanied by dreaming, though some mental activity also occurs during NREM sleep, especially during lighter stages and the morning hours (Hobson, Pace-Schott, & Stickgold, 2000). Neuroimaging studies show increased activity compared to NREM sleep; however, as in NREM sleep, activity in frontoparietal association cortices is reduced compared to wakefulness. This reduced activity in association cortices may explain why, during NREM and REM sleep, our thoughts are less logical.
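The bullets above attribute waking depolarization to the closing of leakage potassium channels, and NREM down states to their reopening. The direction of that effect can be illustrated with a minimal sketch, assuming a membrane with only two conductances (a potassium leak and a fixed depolarizing sodium conductance) whose steady-state potential is their conductance-weighted average. The function name, the reversal potentials (-90 mV and +55 mV), and the relative conductance values are illustrative assumptions, not values taken from the chapter.

```python
def resting_potential(g_k_leak, g_na=1.0, e_k=-90.0, e_na=55.0):
    """Steady-state membrane potential (mV) of a two-conductance membrane.

    V = (g_K * E_K + g_Na * E_Na) / (g_K + g_Na)
    All default values are illustrative assumptions, not measurements.
    """
    return (g_k_leak * e_k + g_na * e_na) / (g_k_leak + g_na)


# Hypothetical relative conductances: leak channels mostly open
# (low neuromodulator levels) versus partly closed by acetylcholine
# and other neuromodulators (waking).
v_leak_open = resting_potential(g_k_leak=20.0)
v_leak_closed = resting_potential(g_k_leak=8.0)

print(f"leak K+ channels open:   {v_leak_open:.1f} mV")
print(f"leak K+ channels closed: {v_leak_closed:.1f} mV")
```

In this toy membrane, lowering the potassium leak conductance moves the potential away from the potassium reversal potential, that is, toward firing threshold. Real cortical neurons have many more conductances, so the numbers carry no quantitative meaning.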
Sleep Architecture
During adulthood, roughly 75% of total sleep time is spent in NREM sleep and 25% in REM sleep. We alternate between NREM and REM sleep throughout the night with each sleep cycle lasting between 90 and 120 minutes (Figure 24.2). Slow-wave sleep is prominent early in the night, especially during the first sleep cycle, and diminishes as the night progresses. As slow-wave sleep wanes, periods of REM sleep lengthen. The proportion of time spent in each stage and the pattern of stages across the night is fairly consistent in normal adults. A healthy young adult will typically spend about 5% of the sleep period in N1 sleep, about 50% in N2 sleep, 20% to 25% in slow-wave sleep (N3), and 20% to 25% in REM sleep. These sleep cycles are an example of ultradian rhythms.
Brain Centers Regulating Wakefulness and Sleep
Though in the 1940s it was generally believed that sleep was a passive process, whereby a brain deprived of sensory input would fall asleep, it is clear from the well-regulated alternation of vigilance states that this is not the case. This point was nicely illustrated by an experiment in which the sensory afferents to an animal’s brain were blocked; nevertheless, the animal continued to exhibit cycles of sleep and waking (Moruzzi & Magoun, 1949). But what are the neural mechanisms controlling the alternation between sleep and waking? Two antagonistic sets of brain structures are responsible for orchestrating the regular alternation between wakefulness and sleep. The neuronal groups that promote wakefulness are located in the basal forebrain, posterior hypothalamus, and in the upper brain stem, whereas those promoting NREM sleep are located in the anterior hypothalamus and basal forebrain (Jones, 2003, 2005; McGinty et al., 2004; McGinty & Szymusiak, 2003; Saper, Scammell, & Lu, 2005; Szymusiak, Steininger, Alam, & McGinty, 2001). Other cellular groups in the dorsal part of the pons and in the medulla comprise the so-called REM
sleep generator (Figure 24.3; Basheer, Strecker, Thakkar, & McCarley, 2004; Chase & Morales, 1990; Jouvet, 1962, 1965, 1994; Kripke, Garfinkel, Wingard, Klauber, & Marler, 2002; Siegel, 2005). The circadian clock, centered on the suprachiasmatic nucleus of the hypothalamus (SCN), exerts an overall control on many of these brain areas, to ensure that sleep occurs at the appropriate time of the 24-hour light-dark cycle (Aston-Jones, 2005; Mistlberger, 2005; Saper, Lu, Chou, & Gooley, 2005; Zee & Manthena, 2007).
Maintenance of Wakefulness
Maintenance of wakefulness is dependent on several heterogeneous cell groups extending from the upper pons and midbrain (the so-called reticular activating system, RAS; Lindsley, Bowden, & Magoun, 1949; Moruzzi & Magoun, 1949), to the posterior hypothalamus and basal forebrain. These cell groups are strategically placed so that they can release, over wide regions of the brain, neuromodulators and neurotransmitters that produce EEG activation, such as acetylcholine, hypocretin, histamine, norepinephrine, and glutamate. The main mechanism by which these neuromodulators and neurotransmitters produce cortical activation is by closing leakage potassium channels on the cell membrane of cortical and thalamic neurons, thus keeping cells depolarized and ready to fire.
Falling Asleep
As we seek a quiet, dark place to fall asleep, and close our eyes, the activity of the waking-promoting neuronal groups is decreased due to reduced sensory input.
Figure 24.2 Hypnogram. Note. Time course of sleep stages during an 8-hour nocturnal sleep episode of a 23-year-old, healthy man (horizontal axis: hours, 0 to 8). Waking (W), REM sleep (R), and NREM sleep (N) are discriminated. NREM sleep is divided into three substates (N1, N2, N3). M is movement time. 20-s time resolution was used.
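A hypnogram such as the one in Figure 24.2 is simply a sequence of stage labels, one per scored epoch, so the stage proportions quoted under Sleep Architecture can be computed by counting epochs. The following is a minimal sketch of that bookkeeping. The function name and the example sequence are hypothetical, the example values are synthetic rather than taken from the recording in the figure, and excluding waking and movement time from total sleep time is an assumption made here.

```python
from collections import Counter

EPOCH_SECONDS = 20  # scoring resolution stated in the Figure 24.2 caption


def stage_summary(epochs, epoch_seconds=EPOCH_SECONDS):
    """Return {stage: (minutes, percent of total sleep time)}.

    `epochs` is a sequence of labels, one per scored epoch:
    "W" (waking), "R" (REM), "N1", "N2", "N3", or "M" (movement time).
    Waking and movement time are excluded from total sleep time
    (an assumption of this sketch).
    """
    counts = Counter(epochs)
    sleep_stages = ("N1", "N2", "N3", "R")
    sleep_epochs = sum(counts[s] for s in sleep_stages)
    summary = {}
    for stage in sleep_stages:
        minutes = counts[stage] * epoch_seconds / 60
        percent = 100 * counts[stage] / sleep_epochs if sleep_epochs else 0.0
        summary[stage] = (minutes, percent)
    return summary


# Synthetic example (not real data): 10 min awake, then one sleep cycle.
night = ["W"] * 30 + ["N1"] * 15 + ["N2"] * 150 + ["N3"] * 75 + ["R"] * 60
for stage, (minutes, percent) in stage_summary(night).items():
    print(f"{stage}: {minutes:.0f} min, {percent:.0f}% of sleep time")
```

The synthetic sequence was chosen so that its proportions fall in the typical ranges given in the text for a healthy young adult; a real night would contain several cycles with changing stage composition.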
Figure 24.3 Sleep centers. Note. The major brain areas involved in initiating and maintaining wakefulness (dark gray), NREM sleep (hatched), and REM sleep (light gray circle). Ach = acetylcholine; BF = basal forebrain; Cb = cerebellum; Cx = cortex; glu = glutamate; H = histamine; Hy = hypothalamus; Me = medulla; Mi = midbrain; NA = noradrenaline; OB = orbitofrontal cortex; ore = orexin/hypocretin; P = pons; T = thalamus.
In addition, several of these brain areas are actively inhibited by antagonistic neuronal populations located in the hypothalamus and basal forebrain, which become active at sleep onset. When the waking promoting neuronal groups become nearly silent, the decreasing levels of acetylcholine and other waking promoting neuromodulators and neurotransmitters lead to the opening of leak potassium channels in cortical and thalamic neurons, which become bistable and start exhibiting brief, recurring periods of hyperpolarization (down states). The importance of hypothalamic structures for sleep induction was recognized at the beginning of the twentieth century during an epidemic of a viral infection of the brain called encephalitis lethargica. von Economo concluded that if the infection destroyed the posterior hypothalamus, patients became lethargic, but if the anterior hypothalamus was lesioned, patients became severely insomniac (von Economo, 1930). Indeed, subsequent studies confirmed that cell groups within the anterior hypothalamus are involved in the initiation and maintenance of sleep. The ventrolateral preoptic area (VLPO) has been suggested as a possible sleep switch (Fuller, Gooley, & Saper, 2006; Sherin, Shiromani, McCarley, & Saper, 1996; Szymusiak, Alam, Steininger, & McGinty, 1998). However, many other neurons scattered through the anterior hypothalamus, for instance, in the median preoptic nucleus (Suntsova, Szymusiak, Alam, Guzman-Marin, & McGinty, 2002) and in the basal forebrain, also play a major role in initiating and maintaining sleep. These neurons tend to fire during sleep and stop firing during wakefulness. When they are active, many of them release GABA and the peptide galanin, and inhibit most waking-promoting areas, including cholinergic, noradrenergic, histaminergic, hypocretinergic, and serotonergic cells. 
In turn, the latter inhibit several sleep-promoting neuronal groups (McGinty et al., 2004; McGinty & Szymusiak, 2003; Saper, Scammell, et al., 2005; Szymusiak et al., 2001). This reciprocal inhibition provides state stability, in that each state reinforces itself as well as inhibits the opponent state.
Generation of REM Sleep
The REM sleep generator consists of pontine cholinergic cell groups (LDT and PPT) that we have already encountered as waking-promoting areas, and of nearby cell groups in the medial pontine reticular formation and in the medulla (Basheer et al., 2004; Chase & Morales, 1990; Jouvet, 1962, 1965, 1994; Kripke et al., 2002; Siegel, 2005). Lesions in these areas eliminate REM sleep without significantly disrupting NREM sleep. REM sleep can also be eliminated by certain antidepressants, especially monoamine oxidase inhibitors. As we have seen, pontine cholinergic neurons produce EEG activation by releasing acetylcholine to the
thalamus and to cholinergic and glutamatergic basal forebrain neurons that in turn activate the limbic system and cortex. However, whereas other waking-promoting neuronal groups, such as noradrenergic, histaminergic, hypocretinergic, and serotonergic neurons, are also active during wakefulness, they are inhibited during REM sleep. Other REM-active neurons in the dorsal pons are responsible for the tonic inhibition of muscle tone during REM sleep. Finally, neurons in the medial pontine reticular formation fire in bursts and produce phasic events of REM sleep, such as rapid eye movements and muscle twitches.
Molecular Correlates of Sleep and Wakefulness
It might seem unlikely that the mere change from wakefulness to sleep should lead to changes in the expression of genes in the brain, but this is actually what happens, and on a massive scale. Hundreds of gene transcripts (messenger RNAs) are expressed at higher levels in the waking brain, and a different set of transcripts is expressed at higher levels in sleep (Cirelli, Gutierrez, & Tononi, 2004). Many of these molecular changes are specific to the brain, since they do not occur in other tissues such as liver and muscle. Transcripts upregulated during wakefulness code for proteins that help the brain to face high energy demand, high synaptic excitatory transmission, high transcriptional activity, as well as the cellular stress that may derive from one or more of these processes. Moreover, wakefulness is associated with the increased expression of several genes that are involved in long-term potentiation of synaptic strength, such as P-CREB, Arc, NGFI-A, and BDNF (Cirelli, Pompeiano, & Tononi, 1996; Cirelli & Tononi, 2000). As has been seen, one reason these genes are expressed in wakefulness and not in sleep has to do with the release of norepinephrine, which is high during wakefulness, when animals make decisions and learn about the environment, but is low during sleep.
By contrast, the genes that increase their expression during sleep include several that may be involved in long-term depression of synaptic strength and possibly in synaptic consolidation (Cirelli et al., 2004). Other sleep-related genes favor protein synthesis, which is also increased in sleep. Finally, many sleep-related genes play a significant role in membrane trafficking and maintenance. Thus, these findings suggest that although sleep is a state of behavioral inactivity, it is associated not only with intense neural activity, but also with the increased expression of many genes that may favor specific cellular functions.
Sleep Quality and Sleep Homeostasis
It was discovered early on that arousal thresholds—measured, for example, as the duration of an acoustic stimulus required
to awaken a sleeping subject—are positively correlated with the amount of slow waves in the EEG of NREM sleep. It was also noticed that high-amplitude slow waves predominate in the first two hours of sleep and decrease thereafter (Blake & Gerard, 1937). It was later shown that the amount of slow-wave sleep is positively correlated with the duration of prior waking (Webb & Agnew, 1971), suggesting that this aspect of sleep is homeostatically regulated. The positive relationship between slow waves and the duration of wakefulness is best seen under the influence of sleep deprivation. If we are not allowed to sleep and are forced to stay awake longer than usual, sleep pressure mounts and soon becomes overwhelming. Thus, sleep is homeostatically regulated: The more we stay awake, the longer and more intensely we sleep afterwards. Arousal thresholds increase, there are fewer awakenings, and during NREM sleep the amplitude and prevalence of slow waves become much higher (see below).
Two-Process Model of Sleep Regulation
The two-process model of sleep regulation provides a conceptual framework that is frequently used in the interpretation of sleep studies. This model postulates that sleep propensity is determined by the interaction of a homeostatic process S and a circadian process C (Borbély, 1982; Figure 24.4).
Figure 24.4 The two-process model of sleep regulation. Note. A homeostatic process, process S, and a circadian process, process C, interact to generate the sleep-wake cycle (time axis alternates between waking and sleep, with clock times 7 and 23 marked). From “Timing of Human Sleep: Recovery Process Gated by a Circadian Pacemaker,” by S. Daan, D. G. M. Beersma, and A. A. Borbély, 1984, American Journal of Physiology, 246, p. R163. Reprinted with permission.
Process S increases during waking and decreases during sleep. An important advance has been the demonstration that Process S is reflected accurately by the amount of slow-wave activity (SWA, electroencephalographic [EEG] power in the low frequency range between 0.5 and 4.5 Hz) during NREM sleep (Borbély, 1982; Borbély & Achermann, 2000). As repeatedly shown in both humans and other mammals, SWA increases exponentially with the duration of prior wakefulness and decreases exponentially during sleep, thus reflecting the accumulation of sleep pressure during wakefulness and its release during sleep (Figure 24.5).
Therefore, the immediate history of sleep and waking determines the level of Process S.
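The dependence of Process S on the immediate history of sleep and waking can be made concrete with a minimal sketch. It assumes the commonly used form in which S rises during waking as a saturating exponential toward an upper asymptote and falls during sleep exponentially toward a lower one. The function names, time constants (18 h and 4 h), asymptotes, and starting value are illustrative assumptions and are not parameters given in this chapter.

```python
import math


def process_s_wake(s0, hours, tau_rise=18.0, upper=1.0):
    """Sleep pressure after `hours` awake, starting from s0.

    Saturating exponential rise toward `upper` (assumed form and values).
    """
    return upper - (upper - s0) * math.exp(-hours / tau_rise)


def process_s_sleep(s0, hours, tau_decay=4.0, lower=0.0):
    """Sleep pressure after `hours` asleep, starting from s0.

    Exponential decline toward `lower` (assumed form and values).
    """
    return lower + (s0 - lower) * math.exp(-hours / tau_decay)


s_morning = 0.2  # hypothetical level of Process S at wake-up

# Ordinary day (16 h awake) versus a skipped night (40 h awake).
s_normal = process_s_wake(s_morning, 16)
s_deprived = process_s_wake(s_morning, 40)
print(f"S at sleep onset after 16 h awake: {s_normal:.2f}")
print(f"S at sleep onset after 40 h awake: {s_deprived:.2f}")

# The same 8 h of sleep discharges both, but from different starting levels.
print(f"S after 8 h sleep (normal):   {process_s_sleep(s_normal, 8):.2f}")
print(f"S after 8 h sleep (deprived): {process_s_sleep(s_deprived, 8):.2f}")
```

Under these assumptions, longer waking yields a higher S at sleep onset, which in the model corresponds to more slow-wave activity at the start of recovery sleep.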
Figure 24.5 Slow-wave activity. Note. Time course of EEG slow-wave activity (power density in the 0.75 to 4.5 Hz frequency range; vertical scale mark 1000 µV², horizontal axis in hours, 0 to 8) during an 8-hour nocturnal sleep episode of a 23-year-old, healthy man. The solid line indicates the exponential decline of SWA during the night.
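An exponential decline like the solid line in Figure 24.5 can be estimated from SWA values by fitting a straight line to their logarithms. The sketch below shows one simple way to do this with ordinary least squares. It is not the fitting procedure used by the authors, the function name is hypothetical, and the input values are synthetic numbers generated from a known curve rather than data from the figure.

```python
import math


def fit_exponential_decay(times_h, swa):
    """Fit swa = a * exp(-t / tau) by least squares on log(swa).

    Returns (a, tau). All swa values must be positive, and the values
    must not be constant (the fitted slope would be zero).
    """
    n = len(times_h)
    logs = [math.log(v) for v in swa]
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic SWA values (arbitrary units) at the midpoints of five
# hypothetical NREM episodes, generated with a = 1000 and tau = 3 h.
times = [0.5, 2.0, 3.5, 5.0, 6.5]
values = [1000.0 * math.exp(-t / 3.0) for t in times]

a, tau = fit_exponential_decay(times, values)
print(f"initial level a = {a:.1f}, time constant tau = {tau:.2f} h")
```

Because the synthetic values lie exactly on an exponential, the fit recovers the generating parameters; with real recordings the points scatter around the curve and the estimate of the time constant carries uncertainty.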
In contrast, process C does not depend on the prior history of sleep but is generated by an intrinsic pacemaker located in the suprachiasmatic nuclei (SCN) of the hypothalamus. Process C is thought to modulate the timing of sleep episodes by enforcing an upper and a lower threshold so that whenever one of these thresholds is reached by process S a sleep episode is terminated or initiated. One important concept of the model is that NREM sleep loss can be recovered by an intensification of NREM sleep, reflected in an SWA increase, and not necessarily by an increase in duration. A second important concept is that the homeostatic and the circadian processes operate independently. This has been confirmed by sleep deprivation studies in SCN-lesioned rats. These animals no longer exhibit circadian modulation of sleep and wakefulness. Nevertheless, sleep deprivation still results in an increase of SWA (Mistlberger, Bergmann, Waldenar, & Rechtschaffen, 1983; Tobler, Borbély, & Groos, 1983; Trachsel, Edgar, Seidel, Heller, & Dement, 1992). The two-process model of sleep regulation has been tested under numerous experimental designs (Achermann, Dijk, Brunner, & Borbély, 1993; Daan et al., 1984) and in several mammalian species, including rats (Franken, Tobler, & Borbély, 1991), guinea pigs (Tobler, Franken, Trachsel, & Borbély, 1992), and mice (Huber, Deboer, & Tobler, 2000). In these studies, predictions of the time course of
Process S are based on a mathematical model of its dynamics (Achermann & Borbély, 2003). Such an approach allows a precise quantification of the dynamics of Process S and has been used to search for genes underlying the homeostatic regulation of sleep (Franken, Chollet, & Tafti, 2001). Functions of Sleep Why we sleep—wasting in mindless slumber a third of our life—is one of the most mysterious questions in biology— one that still eludes a satisfactory scientific answer. The simplest possibility would be that sleep is just a time filler, a way to avoid trouble at times of day (or night) during which it is not safe to look for food or mates. Depending on the species, both the amount and the quality of sleep might be adjusted so as to fit the ecological niche. However, such ecological hypothesis seems at odds with some key observations. First, sleep appears to be universal. All animal species studied so far sleep, from invertebrates such as fruit flies and bees to birds and mammals. Even animals who need continuous vigilance while swimming or flying, for example, certain dolphins and migrating birds, have developed alternating unihemispheric sleep rather than eliminating sleep altogether. If sleep were dispensable, one would think that in such cases it would have disappeared. Second, sleep is carefully regulated. As we have seen, the longer we stay awake, the more and the more intensely do we need to sleep. This homeostatic regulation of sleep appears too to be universal, not just in mammals and birds, but even in fruit flies. Usually, if something is regulated, it serves some important function. Third, lack of sleep has deleterious consequences, especially for the brain. In humans, for example, the most prominent effect of total sleep deprivation, and even of sleep restriction (for several nights), is cognitive impairment, with striking practical consequences (Bonnet & Arand, 2003; Dinges, 2006). 
Just consider that each year drowsy driving is responsible for at least 100,000 automobile crashes, 71,000 injuries, and 1,550 fatalities (Iber, Ancoli-Israel, Chesson, & Quan, 2007; Radun & Summala, 2004). A sleep-deprived person tends to take longer to respond to stimuli, particularly when tasks are monotonous and low in cognitive demands. However, sleep deprivation produces more than just decreased alertness. Tasks emphasizing higher cognitive functions, such as logical reasoning; encoding, decoding, and parsing complex sentences; complex subtraction; flexible thinking; and the ability to focus on a large number of goals simultaneously, are all significantly affected even after one night of sleep deprivation. Tasks requiring sustained attention, such as those including goal-directed activities, can be impaired
by even a few hours of sleep loss. For example, Barger et al. (2006) showed that medical interns made more frequent serious diagnostic errors when they worked frequent shifts of 24 hours or more than when they worked shorter shifts. There are also indications that sleep plays a role in metabolic and endocrine regulation (Spiegel, Leproult, & Van Cauter, 1999). For example, a recent study showed a close relationship between insulin sensitivity and the amount of slow-wave sleep (Tasali, Leproult, Ehrmann, & Van Cauter, 2008). And finally, unless sleep serves an important function, why should we engage every night in prolonged periods of immobility during which we are dangerously out of touch with the environment?
Sleep and Memory
In the past decade, numerous studies have appeared that seem to support a role for sleep in learning and memory. Specifically, a growing number of studies have demonstrated that sleep can enhance performance of tasks learned during prior wakefulness. This enhancement is not merely time-dependent, but specifically requires sleep, and is independent of circadian factors (Walker & Stickgold, 2004). Using a variety of behavioral paradigms, evidence of sleep-dependent memory enhancement has been found in humans and in nonhuman species such as cats, rats, mice, and zebra finches (Peigneux, Laureys, Delbeuck, & Maquet, 2001; Walker & Stickgold, 2004). Initial studies focused on a role for REM sleep (Karni, Tanne, Rubenstein, Askenasy, & Sagi, 1994), but more recent studies have emphasized the importance of NREM sleep (Gais & Born, 2004; Peigneux et al., 2004), of specific components within NREM sleep such as spindles (Gais, Molle, Helms, & Born, 2002; Rosanova & Ulrich, 2005) and slow waves (Czarnecki, Birtoli, & Ulrich, 2007; Huber, Hill, Ghilardi, Massimini, & Tononi, 2004; Schmidt et al., 2006), and of a combination of NREM and REM sleep (Mednick, Nakayama, & Stickgold, 2003; Stickgold, James, & Hobson, 2000).
Behavioral studies in humans and other species leave little doubt that sleep plays a critical role in learning and memory. How sleep might promote performance enhancement is not yet understood. An intriguing possibility is that the offline reactivation during sleep of circuits involved in learning during wakefulness, and perhaps the involvement of other, connected circuits, might promote memory consolidation. Several studies in animals have shown that, during NREM sleep after learning, there is an increased correlation in the firing of cells coactivated during learning tasks in prior waking, primarily in the hippocampus (Skaggs & McNaughton, 1996; Wilson & McNaughton, 1994). In humans, neuroimaging studies have shown that hippocampal areas that are activated during route learning in a virtual
town are likewise activated during subsequent NREM sleep (Peigneux et al., 2004). EEG studies have shown an increase in NREM spindle density after learning pairs of unrelated words as compared to a nonlearning task (Gais et al., 2002). Similar findings have been reported after learning a maze task (Meier-Koll, Bussmann, Schmidt, & Neuschwander, 1999). Finally, high-density EEG recordings show that a visuomotor learning task, compared to a control nonlearning task, produces an increase in SWA that is localized to the brain region (right parietal cortex) that is known to be involved in learning the task (Huber, Ghilardi, Massimini, & Tononi, 2004). Many unknowns remain, however. Whether sleep may favor the consolidation of newly established memories or the maintenance of older ones is not clear. The molecular correlates of such processes are still unclear. Molecular markers of memory acquisition are turned off during sleep, which may be advantageous given that the intense neural activity of sleep occurs while the animal is disconnected from the environment. Nevertheless, evidence exists that neural activity during NREM sleep may promote brain plasticity (Jha et al., 2005; Steriade, 1999), especially in developing animals (Frank, Issa, & Stryker, 2001). Sleep and Brain Restitution When we have been awake too long, we say we are tired, and after sleep we feel refreshed. Not surprisingly, the most intuitively compelling idea about the function of sleep is that sleep may restore some precious fuel or energy charge that was depleted during wakefulness. It is likely that sleep may reduce energy waste by enforcing body rest in animals with high metabolic rates (this is certainly what hibernation does by drastically reducing body and brain metabolism, shutting off brain activity, and reducing temperature). However, in humans the metabolic savings of spending the night asleep rather than quietly awake are no more than a slice of bread (Horne, 1980). 
Moreover, we also say we are tired after muscle exertions, yet most bodily organs can recover through quiet wakefulness and do not need sleep. The notable exception is the brain: If we do not sleep, even though we may remain immobile, we rapidly suffer cognitive impairment. Therefore, most researchers agree that sleep may be especially important for restoring the brain and provide something not afforded by quiet waking. However, there is great uncertainty when it comes to what might actually accumulate (or deplete) during waking and be restored during sleep. A long search for humoral factors that might accumulate in the brain during wakefulness has not been successful (Borbély & Tononi, 1998). One of the best-studied substances is adenosine, not surprising given the well-known anti-sleep
effect of the A1 antagonist caffeine (Basheer et al., 2004; Porkka-Heiskanen, Alanko, Kalinchuk, & Stenberg, 2002). Extracellular adenosine accumulates in the basal forebrain area during wakefulness, inhibiting cholinergic neurons and promoting sleep (Porkka-Heiskanen et al., 1997), although the importance of this feedback mechanism has recently been disputed (Blanco-Centurion et al., 2006). Also, in humans, extracellular adenosine does not seem to accumulate in several brain areas as a function of previous wakefulness (Zeitzer et al., 2006). Prostaglandin D2, another sleep-promoting substance, acts on the prostaglandin D (PGD) receptor and indirectly activates adenosine A2A-dependent pathways in the basal forebrain (Huang, Urade, & Hayaishi, 2007). However, neither A1 nor PGD receptor knockout mice have abnormal baseline sleep. Similarly, a number of lymphokines, such as interleukin-1 (IL-1) and tumor necrosis factor (TNF) alpha, modulate sleep. These effects are often species-specific and could be most relevant in the context of acute inflammation or infection. However, the TNF and IL-1 type I receptor knockouts have abnormal sleep, suggesting a role in baseline sleep regulation as well (Krueger, Obal, Fang, Kubota, & Taishi, 2001). As an alternative, it has been suggested that sleep may favor not so much the elimination of some toxic factors accumulated during wakefulness, but rather the replenishment of some important resource, for instance glycogen in glial stores (Benington & Heller, 1995). However, studies show that glycogen depletion may only occur in a few brain regions and only in certain strains of animals (Franken, Gip, Hagiwara, Ruby, & Heller, 2003, 2006; Gip, Hagiwara, Ruby, & Heller, 2002).
The molecular changes that take place between wakefulness and sleep suggest other possibilities as well (Cirelli et al., 2004): Sleep could counteract synaptic fatigue by favoring the replenishment of calcium in presynaptic stores, the replenishment of glutamate vesicles, the resting of mitochondria, the synthesis of proteins, or the trafficking and recycling of membranes. Unfortunately, most of these possibilities remain unexplored. Sleep and Synaptic Homeostasis Memory consolidation and brain restitution are important perspectives on the function of sleep that are not mutually exclusive. Recently, a comprehensive hypothesis concerning the function of NREM sleep has been advanced, the synaptic homeostasis hypothesis (Tononi & Cirelli, 2003, 2006; Figure 24.6). The hypothesis, which is broadly consistent with a large body of evidence, also makes specific suggestions concerning the mechanisms leading to the increase of SWA as a function of prior wakefulness.
Figure 24.6 Synaptic homeostasis hypothesis. Note. Synaptic strength increases during wakefulness (W; synaptic potentiation) and is downscaled during sleep (S; synaptic downscaling). The hypothesis proposes a close relationship between synaptic strength and sleep slow-wave activity. From “Sleep Function and Synaptic Homeostasis,” by G. Tononi and C. Cirelli, 2006, Sleep Medicine Reviews, 10, p. 50. Reprinted with permission.
The synaptic homeostasis hypothesis proposes that plastic processes occurring during wakefulness result in a net increase in synaptic strength in many brain circuits. The main function performed by sleep is to downscale synaptic strength to a baseline level that is energetically sustainable and beneficial for memory and performance. In other words, according to the synaptic homeostasis hypothesis, sleep is the price we have to pay for plasticity, and its goal is the homeostatic regulation of the total synaptic weight impinging on neurons. An appealing feature of the synaptic homeostasis hypothesis is that it reconciles the restorative, homeostatic function of sleep with its beneficial effects on learning and memory. The main points of the hypothesis are as follows. During wakefulness, we interact with the environment and acquire information about it. The EEG is activated, neurons are tonically depolarized and spontaneously active (Steriade et al., 1993), and the neuromodulatory milieu (e.g., a high level of noradrenaline [NA]; Cirelli & Tononi, 2004) favors the storage of information, which occurs largely through synaptic potentiation (Trachtenberg et al., 2002). This potentiation occurs when the firing of a presynaptic neuron is followed by the depolarization or firing of a postsynaptic neuron, and the neuromodulatory milieu signals the occurrence of salient events (Bliss & Collingridge, 1993; Bliss & Lomo, 1973). A key functional corollary of the hypothesis is that, due to the net increase in synaptic strength, waking plasticity has a cost in terms of energy requirements, space requirements, supplies of key cellular constituents, and progressively saturates our capacity to learn. When we go to sleep, we become virtually disconnected from the environment (Steriade et al., 1993). Changes in neuromodulatory milieu trigger slow oscillations, comprising depolarized
and hyperpolarized phases, which affect every neuron in the cortex and are reflected in the EEG as SWA (Steriade & Timofeev, 2003). The changed neuromodulatory milieu (e.g., low NA; Aston-Jones & Bloom, 1981; Cirelli & Tononi, 2004) also ensures that synaptic activity is not followed by synaptic potentiation, which makes adaptive sense given that synaptic activity during sleep is not driven by interactions with the environment. Since the average strength of synaptic interactions at the end of the wake period is high, neurons synchronize their firing better and the slow oscillations of early sleep are of high amplitude (Esser, Hill, & Tononi, 2007). The slow oscillations, however, are not just an epiphenomenon of increased synaptic strength, but would have a functional role. Specifically, the repeated sequences of depolarization and hyperpolarization would lead to the downscaling of the synapses impinging on each neuron (Turrigiano & Nelson, 2000, 2004), meaning that they all would decrease in strength proportionally. The reduced synaptic strength reduces the amplitude and synchronization of the slow oscillations, which is reflected in a reduced SWA in the sleep EEG. Because of the dampening of the slow oscillation, downscaling is progressively reduced, making the process self-limiting when synaptic strength reaches a baseline level. By returning total synaptic weight to an appropriate baseline level, sleep enforces synaptic homeostasis. Again, the key functional corollary is that synaptic homeostasis has benefits in terms of energy and space requirements, of the supply of key cellular constituents and, due to increased signal-to-noise ratios, in terms of learning and memory. Thus, when we wake up, neural circuits do preserve a trace of previous experiences, but are kept efficient at a recalibrated level of synaptic strength, and the cycle can begin again.
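The self-limiting character of this cycle can be made concrete with a toy simulation. The sketch below is only an illustration under strong assumptions, none of which come from the hypothesis as published: total synaptic strength is collapsed into a single number, potentiation during waking is linear in time, SWA is taken to be proportional to the excess of synaptic strength over a baseline, and the rate of downscaling is taken to be proportional to SWA. All parameter values and function names are hypothetical.

```python
def wake(strength, hours, potentiation_per_hour=0.02):
    """Net synaptic potentiation during waking (assumed linear in time)."""
    return strength + potentiation_per_hour * hours


def sleep(strength, hours, baseline=1.0, gain=1.0, downscale_rate=0.5, dt=0.1):
    """Downscale synaptic strength during sleep; return (strength, SWA trace).

    Assumptions of this toy model:
      SWA           = gain * (strength - baseline), never below zero
      d(strength)/dt = -downscale_rate * SWA
    """
    swa_trace = []
    steps = int(round(hours / dt))
    for _ in range(steps):
        swa = gain * max(strength - baseline, 0.0)
        swa_trace.append(swa)
        # Downscaling shrinks as SWA shrinks, so the process limits itself.
        strength -= downscale_rate * swa * dt
    return strength, swa_trace


if __name__ == "__main__":
    for awake_h in (16, 40):
        strength_at_onset = wake(1.0, awake_h)
        strength_at_waking, swa = sleep(strength_at_onset, 8)
        print(
            f"{awake_h} h awake: strength {strength_at_onset:.2f} -> {strength_at_waking:.3f}, "
            f"SWA first/last = {swa[0]:.2f}/{swa[-1]:.4f}"
        )
```

With these assumptions, longer waking produces higher SWA at sleep onset, SWA declines across the sleep episode, and synaptic strength approaches the baseline without falling below it. Because the model uses one aggregate value, it does not represent the proportional scaling of individual synapses.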
The synaptic homeostasis hypothesis is based on a large number of observations at many different levels, from molecular and cellular biology to systems neurophysiology and neuroimaging (for more details, see Tononi & Cirelli, 2003, 2006). The best electrophysiological and molecular evidence comes from a study in rats showing that wakefulness is associated with markers of cortical synaptic potentiation (e.g., increased number of synaptic AMPARs containing GluR1 subunits), whereas sleep is associated with markers of synaptic depression (e.g., dephosphorylation of synaptic GluR1; Vyazovskiy, Cirelli, Pfister-Genskow, Faraguna, & Tononi, 2008). Moreover, the slopes of cortical evoked potentials, reflecting cortical excitability, increased after wakefulness and decreased after sleep. Other electrophysiological and behavioral evidence support the hypothesis (Hairston et al., 2004; Huber et al., 2006; Huber, Ghilardi, et al., 2004; Huber, Tononi, & Cirelli, 2007), but there are alternative explanations, and critical tests still need to be performed.
Figure 24.7 Sleep duration. Note. Histogram of the reported sleep duration (hours, horizontal axis) against percent of people (vertical axis) for more than 1 million adult Americans from the Cancer Prevention Study I. From “Short and Long Sleep and Sleeping Pills: Is Increased Mortality Associated?” by D. F. Kripke, R. N. Simons, L. Garfinkel, and E. C. Hammond, 1979, Archives of General Psychiatry, 36, pp. 103–16. Adapted with permission.
Sleep Duration and Mortality
Large surveys show that the average sleep duration for adult humans is about 7 to 8 hours (Figure 24.7; Kripke, Simons, Garfinkel, & Hammond, 1979). However, sleep duration is widely distributed: some people get along perfectly with 5 hours while others need more than 10 hours. There are indications that true short sleepers live under higher sleep pressure than long sleepers (Aeschbach, Cajochen, Landolt, & Borbély, 1996; Aeschbach et al., 2001). Such studies point to a genetic determination of sleep duration (Linkowski, Kerkhofs, Hauspie, Susanne, & Mendlewicz, 1989; Partinen, Kaprio, Koskenvuo, Putkonen, & Langinvainio, 1983). In general, sleep duration can be changed by training only by a small amount (e.g., ±1 h) without an individual suffering the detrimental effects of chronic sleep deprivation (Banks & Dinges, 2007). Intriguingly, sleep duration is related to mortality. Epidemiologic studies have consistently shown that sleeping more than 8 hours per night is associated with increased mortality (Kripke et al., 2002; Youngstedt & Kripke, 2004). Youngstedt and Kripke state that “were long sleep the cause of all deaths with which it has been associated, it would be the fourth leading cause of death in the U.S.” (p. 160).
DEVELOPMENT OF THE SLEEP-WAKING CYCLE AND CHANGES DURING THE LIFE SPAN
Many aspects of sleep change during the life span. Next, we consider how sleep develops in utero, during childhood and adolescence, and how it changes with old age.
Rest-Activity Cycles in the Fetus
Real-time ultrasonography has made systematic and prolonged behavioral observations of the undisturbed human fetus possible. At 8 to 12 weeks, the human fetus displays episodic spontaneous movements, which are cycles of activity interspersed with periods of quiescence. These cycles increase in duration and become more regular from the mid-trimester onward (Dierker, Rosen, Pillay, & Sorokin, 1982). More specifically, the increasing synchronization of cyclic motor activity with periodic changes in heart rate and eye movements as gestation advances is considered a milestone in central nervous system development (Visser, Poelmann-Weesjes, Cohen, & Bekedam, 1987). At a gestational age of 16 to 20 weeks, a healthy fetus shows pronounced daily rhythms of the heartbeat and locomotor activity (Kintraia, Zarnadze, Kintraia, & Kashakashvili, 2005). The part of the brain that orchestrates such circadian rhythmicity is the suprachiasmatic nucleus (SCN) in the anterior hypothalamus. The SCN cells receive information about the presence and intensity of light in the environment via specialized retinal ganglion cells. The ganglion cells project to the SCN directly via the optic tract as well as via the intergeniculate leaflet and raphe nuclei. Because these connections are made late in human gestation, the developing fetus relies on maternal rhythms. Although information about humans is lacking, the fetal SCN of rodents is rhythmic in the absence of a functional input pathway (Reppert & Schwartz, 1984). The fetal SCN cells are exposed to maternal hormone rhythms (e.g., melatonin; Zemdegs, McMillen, Walker, Thorburn, & Nowak, 1988) and possibly to fetal pituitary hormone
rhythms, again reflecting the mother's response to the environment.
Newborns and Infants: From Arrhythmicity to the Appearance of Rhythms
At birth, humans are essentially arrhythmic, showing hardly any circadian organization of sleep and wakefulness (Figure 24.8; Swaab, Hofman, & Honnebier, 1990). This arrhythmicity is present in spite of the fact that cells in the SCN are likely already rhythmic (see Reppert & Schwartz, 1984), probably due to underdeveloped input/output pathways. The lack of rhythmicity is evidenced by the equal distribution of sleep across day and night. However, some evidence for day/night differences in sleep and waking might be found as early as the first days of life in some babies (Jenni, Deboer, & Achermann, 2006). In general, the lack of consolidated sleep is accompanied by the absence of hormonal and body temperature rhythmicity (Rivkees & Hao, 2000). Longitudinal studies have shown that these rhythms appear around 9 to 12 weeks of age in humans (Kennaway, Stamp, & Goble, 1992), and that the appearance of these rhythms is accompanied by the ability to sustain longer episodes of sleep and wakefulness (Kleitmann & Engelmann, 1953; Parmelee, 1961). This sleep consolidation is accompanied by an increase in sleep during the night and a decrease during the day (Iglowstein, Jenni, Molinari, & Largo, 2003). Besides the influence of the circadian system, the homeostatic regulation of sleep and daily parental activities, such as feeding, can exert an influence on the development of the 24-hour sleep-wake cycle.
Circadian Rhythmicity in Childhood and Adolescence The circadian timing system remains stable in the course of childhood. However, during puberty, distinct changes occur influencing the phase of the circadian timing system. Teenagers commonly show a prominent phase delay (Carskadon & Acebo, 2002). The mechanism of this phase delay is not entirely clear but may include a phase delay of the intrinsic circadian rhythm (Carskadon, Acebo, Richardson, Tate, & Seifer, 1997), a lengthening of the intrinsic period of the circadian clock (Carskadon, Acebo, & Jenni, 2004), and a heightened sensitivity to evening light or decreased sensitivity to morning light (Jenni & Carskadon, 2007). This change in the circadian rhythm in adolescents is often accompanied by difficulties falling asleep and/or daytime sleepiness (Carskadon et al., 1980) and might be best explained by the interaction of the circadian and homeostatic process of sleep regulation (see two process model; Borbély, 1982). Although it is generally accepted that the circadian and the homeostatic process of sleep regulation are independent processes, they interact in a complex way to control vigilance states and sleep timing. Thus, the increase of the homeostatic sleep pressure during wakefulness is opposed by the increasing circadian alertness in the course of the day. This interaction allows adults to maintain a constant level of vigilance throughout the waking period. In contrast, during sleep, the increasing circadian sleep tendency counteracts the declining homeostatic sleep pressure, thereby ensuring sleep maintenance. During adolescence, a delay in the circadian phase reorganizes the alignment of the two processes such that the
Figure 24.8 Sleep-wake pattern after birth. Note. Diurnal sleep-wake pattern during the first 425 days after birth of a healthy male infant, plotted as age (days) against time of day (h). Black and white areas represent sleep and waking, respectively, as recorded by daily sleep logs. Gray dots indicate feeding episodes. From “Sleep Behavior and Sleep Regulation from Infancy through Adolescence: Normative Aspects,” by O. G. Jenni and M. A. Carskadon, 2007, Sleep Medicine Clinics, p. 322.
increasing circadian sleep tendency as well as the circadian evening rise in alertness occur later, thus rendering even a well-slept adolescent sleepy in the morning hours and reluctant to fall asleep at night. In other words, adolescents assume the chronotype characteristic of night people (owls). The end of adolescence is associated with a change in chronotype whereby young adults assume an early-rising profile (larks; Roenneberg et al., 2004). The Delayed Sleep Phase Syndrome (DSPS) may be a distinct clinical entity or an extreme manifestation of phase delaying in adolescents. DSPS is typically diagnosed during the second decade of life or earlier (Thorpy, Korman, Spielman, & Glovinsky, 1988). Weitzman and colleagues
Figure 24.9 Sleep duration. Note. Percentiles (2nd to 98th) for total sleep duration (hours) per 24 hours from infancy to adolescence, plotted against age (months, then years). Sleep log data was obtained from 493 subjects of the Zurich Longitudinal Studies (Largo et al., 1996). From “Sleep Duration from Infancy to Adolescence: Reference Values and Generational Trends,” by I. Iglowstein, O. G. Jenni, L. Molinari, and R. H. Largo, 2003, Pediatrics, 111, p. 303. Reprinted with permission.
(1981) first characterized DSPS as a cluster of features including: a chronic inability to fall asleep and awaken at a desired clock time, consistency in reporting sleep times at later hours than other individuals, and otherwise normal sleep when measured by all-night polysomnography, if the delayed schedule is allowed. This disorder, with a prevalence of 0.1% to 3.1% (Wyatt, 2004), is significant because it has the highest prevalence during adolescence and school performance is often compromised. Changes in Sleep Duration On average, infants spend more than 50% of the time asleep (Figure 24.9). During the first year of life, total sleep duration remains relatively constant (Figure 24.9). However, the large variability of total sleep in infants decreases over time; during the first 1 to 2 years, the variability of sleep duration goes from 9 to 19 hours at 1 month to 11 to 16 hours at 2 years (Iglowstein et al., 2003). Early childhood is typically characterized by a decrease in sleep length, which at first is mainly the consequence of a reduction in daytime naps (Iglowstein et al., 2003). Most children stop napping between the ages of 3 and 5, though large cultural differences exist (Jenni & Carskadon, 2007). However, even after these ages, sleep duration continues to decrease, from an average of 14 hours in the first month to an average of 8 hours in 16-year-olds (Iglowstein et al., 2003; Klackenberg, 1982). The subsequent decline in sleep duration in the years from early adulthood into old age is moderate, 1 to 2 hours, and data collection methods may influence our knowledge of this period (Roffwarg, Muzio, & Dement, 1966). Several studies indicate that there are no significant gender differences in sleep duration during childhood (e.g., Klackenberg, 1982; Iglowstein et al., 2003).
Figure 24.10 Distribution of vigilance states across life. Note. Changes in total amounts of daily NREM and REM sleep from infancy to old age, shown as hours per day of waking, REM sleep, and NREM sleep against age (days, months, and years). From “Ontogenetic Development of the Human Sleep-Dream Cycle,” by H. P. Roffwarg, J. N. Muzio, and W. C. Dement, 1966, Science, 152, p. 608. Reprinted with permission.
Changes in Sleep Architecture Differences in total sleep duration across the life span are not the only changes that occur; there are also alterations in the proportion of sleep stages. In the first few months of life, infant sleep is divided evenly between NREM and REM sleep (Figure 24.10). From early childhood until adolescence, the proportion of REM sleep decreases, reaching an adult level of about 20% to 25% of nocturnal sleep (Jenni & Carskadon, 2007; Roffwarg et al., 1966). Furthermore, the composition of NREM sleep changes, in that SWS (NREM sleep stages N3) increases until puberty and subsequently shows an exponential decline during adolescence (Feinberg, 1982). The sequence of sleep stages when falling asleep also changes during infancy: When young infants fall asleep, the initial sleep stage is typically REM sleep. After 3 months, sleep onset REM periods are replaced by NREM periods, as is the case in adults (Jenni & Carskadon, 2007). In the first 6 months after birth, due to frequent muscle twitches and body jerks that break through muscle inhibition, REM sleep is also called active sleep. NREM sleep is referred to as quiet sleep. In newborns, quiet and active sleep are often disorganized and immature and thus called indeterminate or transitional sleep. However, by using fetal ultrasound, behavioral observations, and fetal heart rate monitoring, quiet and active sleep can be differentiated as early as 32 weeks gestation (Mulder, Visser, Bekedam, & Prechtl, 1987; Visser et al., 1987). Between 32 and 40 weeks of gestation, quiet sleep increases and indeterminate sleep decreases (Mulder et al., 1987). The alternation between NREM and REM sleep cycles, the ultradian sleep rhythm, is already present in newborns. The period of this ultradian rhythm gradually lengthens during childhood, from about 50 minutes in infancy, to 90 to 110 minutes around school age (Jenni & Carskadon, 2007). 
Large amplitude slow-wave sleep dominates the sleep cycles early at night and REM sleep the last part of the night. In early infancy, subsequent NREM sleep episodes may alternate between having high and low amounts of slow waves (Bes, Schulz, Navelet, & Salzarulo, 1991; Jenni, Borbély, & Achermann, 2004). The mechanism underlying this alternating pattern is unknown.
Homeostatic Response to Sleep Deprivation
In both animals and humans, the response to sleep deprivation early in development is an increase in sleep duration; changes in sleep intensity are observed only at a later stage. For example, when very young rats (P12) are sleep deprived, they compensate mainly by increasing sleep duration (Frank, Morrissette, & Heller, 1998). Twelve days later (P24), however, sleep deprivation results in an increase in sleep SWA, as is the case in adult animals, whereas sleep duration remains constant. Similar modifications in the homeostatic regulation of sleep are found during the human neonatal period. For example, selective or total sleep deprivation in human neonates leads to compensatory increases in NREM sleep duration only (Anders & Roffwarg, 1973; Thomas et al., 1996). Exactly when sleep deprivation produces increases in EEG slow-wave activity in human neonates is still unknown. A decline in SWA in the course of the night is first visible during the second postnatal month or even later (Bes et al., 1991). A decline in activity in the theta frequency range (4.5 to 7 Hz) during the night may represent the first indication of homeostatic regulation (Jenni et al., 2004).
Tolerance to Increased Sleep Pressure
In general, neonates are unable to maintain consolidated bouts of waking comparable to those typically observed in adults. In human neonates, short periods of sleep deprivation that have negligible effects in adults produce compensatory increases in sleep time and/or intensity during recovery (Anders & Roffwarg, 1973; Thomas et al., 1996). Young rats (P23) show an increase of SWA after only 2 hours of sleep deprivation that is just as large as that observed after 6 hours in postpubertal rats (P40) (Alfoldi, Tobler, & Borbély, 1990). These findings suggest either that the saturation level of sleep pressure is lower during infancy or that sleep pressure accumulates at a greater rate in infancy compared with adulthood. Indeed, modeling of the dynamics of sleep homeostasis according to the two-process model of sleep regulation suggests that the buildup of homeostatic sleep pressure during wakefulness is faster in prepubertal children compared with young adolescents (Jenni, Achermann, & Carskadon, 2005). In contrast, the decline of the homeostatic process is similar in both groups. The faster increase of sleep pressure in young children may reduce their ability to stay awake at the end of the day, suggesting that they live under higher sleep pressure.
Sleep Slow Waves and Brain Plasticity during Development
The amplitude of slow waves in the sleep EEG increases during childhood. Then, during puberty, there is a dramatic decline in the amplitude of sleep slow waves and SWA (Campbell, Darchia, Khaw, Higgins, & Feinberg, 2005; Feinberg, Higgins, Khaw, & Campbell, 2006; Jenni & Carskadon, 2004). While the factors underlying such changes are not clear, increasing evidence points to the importance of changes in neural plasticity. During early childhood, neurons grow bushier and establish more numerous
Moreover, the development of sleep homeostasis may be directly related to the appearance of synaptic mechanisms leading to long-term potentiation in association with waking exploratory activities. A developmental study found that sleep deprivation in rats is followed by an induction of cortical BDNF at postnatal day 24, the age when SWA showed a compensatory response, but not at postnatal day 16 or 20, the age when no SWA rebound occurred (Hairston et al., 2004). This finding is important because BDNF is necessary for the induction of synaptic potentiation (Aicardi et al., 2004; Barco et al., 2005), and the level of BDNF induction during waking activities is positively correlated with the amount of SWA during subsequent
Figure 24.12 Glucose consumption during childhood. [Plot not reproduced; axes: glucose utilization (LCMRglc, in µmol/min/100 g) versus age (years).]
Note. Glucose utilization was obtained by positron emission tomography (PET) and the tracer 2-deoxy-2-[18F]fluoro-D-glucose. From “A Critical Period of Brain Development: Studies of Cerebral Glucose Utilization with PET,” by H. T. Chugani, 1998, Preventive Medicine, 27, pp. 184–188. Adapted with permission.
connections to other cells (De Felipe, Marco, Fairen, & Jones, 1997). Moreover, axons initially explore areas much wider than their final targets (Gao, Yue, Cerretti, Dreyfus, & Zhou, 1999). Then, in the course of adolescence, more synapses are eliminated than formed (Zuo, Lin, Chang, & Gan, 2005), in part through activity-dependent processes (Hua & Smith, 2004). Synaptic pruning during adolescence is accompanied by a reorganization of neuronal connections whereby mistargeted axons and unused synapses are eliminated, and connectivity becomes more specific. The decrease of synaptic density during adolescence, which is reflected in changes in grey matter, proceeds asynchronously in different brain areas (Paus, 2005), in line with the maturation of specific cognitive functions (Shaw et al., 2006). As illustrated in Figure 24.11, changes in slow-wave amplitude are paralleled by changes in synaptic density (Feinberg, 1982; Huttenlocher & Dabholkar, 1997). This observation has been confirmed both in humans and in rats (Glantz, Gilmore, Hamer, Lieberman, & Jarskog, 2007; Nakamura, Kobayashi, Ohashi, & Ando, 1999). Moreover, glucose consumption shows a similar profile (Figure 24.12; Chugani, 1998), presumably due to the increased energy requirements associated with increased synaptic activity. As suggested by the synaptic homeostasis hypothesis (Tononi & Cirelli, 2006) and confirmed by computer simulations and experimental studies in both humans and rats, changes in synaptic efficacy can account for the observed changes in sleep slow waves (Esser et al., 2007; Riedner et al., 2007; Vyazovskiy, Riedner, Cirelli, & Tononi, 2007). Thus, sleep slow-wave activity could be taken as a reliable indicator of net changes in average synaptic density/ strength both in the course of the night (sleep homeostasis) and in the course of development.
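As a rough illustration of how slow-wave activity might be quantified, the sketch below estimates spectral power below 4 Hz (the definition of SWA given in this chapter's summary) for a single EEG epoch. It is a minimal sketch, not the method of any study cited here: the function names, the 0.5 Hz lower bound, the Hann window, the 64 Hz sampling rate, the 10-second epoch, and the synthetic signals are all assumptions made for illustration.

```python
import cmath
import math


def slow_wave_activity(eeg, fs, f_lo=0.5, f_hi=4.0):
    """Estimate power of `eeg` (a list of samples at `fs` Hz) in the band [f_lo, f_hi) Hz.

    The 0.5 Hz lower bound is an assumption; the chapter only specifies "< 4 Hz".
    The result is in squared signal units (for example, microvolts squared).
    """
    n = len(eeg)
    mean = sum(eeg) / n
    # A Hann window limits spectral leakage from the edges of the epoch.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    x = [(v - mean) * w for v, w in zip(eeg, window)]
    norm = fs * sum(w * w for w in window)
    df = fs / n  # frequency resolution in Hz
    total = 0.0
    for k in range(1, n // 2 + 1):
        if not (f_lo <= k * df < f_hi):
            continue
        # Direct DFT of one frequency bin (slow but dependency-free).
        coeff = sum(xi * cmath.exp(-2j * math.pi * k * i / n) for i, xi in enumerate(x))
        total += 2.0 * abs(coeff) ** 2 / norm  # factor 2: one-sided spectrum
    return total * df


def synthetic_eeg(slow_amp_uv, fs=64, seconds=10):
    """Hypothetical test signal: a 1 Hz slow wave plus a fixed 10 Hz component."""
    return [
        slow_amp_uv * math.sin(2 * math.pi * 1.0 * i / fs)
        + 20.0 * math.sin(2 * math.pi * 10.0 * i / fs)
        for i in range(fs * seconds)
    ]


# Larger slow waves give higher SWA; the 10 Hz component contributes almost nothing.
for amp in (25.0, 75.0):
    print(f"slow-wave amplitude {amp:5.1f} uV -> SWA {slow_wave_activity(synthetic_eeg(amp), 64):10.1f}")
```

Because band power scales with the square of amplitude, even a modest change in slow-wave amplitude produces a large change in SWA in this toy example, which is one reason the measure is sensitive to the developmental changes described above.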
Figure 24.11 Changes in slow-wave amplitude and synapse density across life. [Plot not reproduced; axes: slow-wave amplitude (µV) and synapse density (synapses/100 µm³) versus age (years).]
Note. The development of the amplitude of slow waves is reproduced from Feinberg (1982); the changes in synapse density are reproduced from Huttenlocher and Dabholkar (1997). For the slow-wave amplitude, each point represents the mean of 50 waves for one subject, selected as described in Feinberg et al. (1967). Synapse density was obtained from postmortem specimens of normal human brains.
sleep (Huber et al., 2007). In humans, the increase of BDNF mRNA levels in the dorsolateral prefrontal cortex during adolescence coincides with the time when the frontal cortex matures both structurally and functionally (Webster, Weickert, Herman, & Kleinman, 2002).

Changes in sleep parameters may not only reflect plastic processes during development, but may also play an active role in shaping such processes. Frank et al. (2001) showed that sleep greatly enhanced the synaptic changes induced by a preceding period of monocular deprivation, while wakefulness in complete darkness did not. In another set of experiments, the same group showed that ocular dominance plasticity was significantly reduced in cats whose visual cortices were reversibly silenced during sleep (Jha et al., 2005). These findings demonstrate that the mechanisms governing this form of plasticity require specific cortical activity during sleep. Dark-rearing of cats and mice produced a robust and reversible decrement of slow-wave electrical activity during sleep that was restricted to the visual cortex and impaired by gene-targeted reduction of NMDA receptor function (Miyamoto, Katagiri, & Hensch, 2003).

A role for sleep in brain plasticity during development is not necessarily limited to slow-wave sleep. REM sleep is more abundant during periods of rapid brain development and synaptic plasticity than at any other time of life (Frank & Heller, 1997; Jouvet-Mounier, Astic, & Lacote, 1970; Roffwarg et al., 1966; Shaw et al., 2000). Indeed, several experiments employing REM sleep deprivation in animals have suggested that REM sleep may play a key role in maturational processes (Blumberg & Lucas, 1996; Mirmiran et al., 1983; Shaffery et al., 1998; Shaffery, Sinton, Bissette, Roffwarg, & Marks, 2002).
For example, 1 week of REM sleep deprivation in immature rats prolonged the critical period for plasticity of the visual system and can alter the balance between inhibitory and excitatory mechanisms in the visual cortex (Shaffery, Lopez, Bissette, & Roffwarg, 2006; Shaffery et al., 2002). Furthermore, besides slow waves during NREM sleep, sleep spindles also seem to be involved in plastic processes (Sirota, Csicsvari, Buhl, & Buzsaki, 2003) and were suggested as an indicator of the severity of developmental disorders in mental retardation and abnormal maturational processes (De Gennaro & Ferrara, 2003). For example, mentally retarded children show a decreased spindle density as compared to normal full-term children (Shibagaki, Kiyono, & Watanabe, 1982).
Sleep and Cognitive Functioning during Development

If sleep does play a critical role in brain development and learning, then sleep disorders, sleep restriction, and sleep loss early in life should impair cognitive functioning. Some evidence in favor of such a relationship is becoming available. For example, a positive correlation between increased sleep/earlier bedtimes and higher school grades was found in a representative population of high school students (Wolfson & Carskadon, 1998). Moreover, actigraphy, an objective measure for evaluating sleep patterns, revealed that sleep fragmentation correlates significantly with daytime sleepiness, attentional deficits, and learning impairments (Sadeh, Raviv, & Gruber, 2000). Such effects seem to be more evident in younger children (Sadeh, Gruber, & Raviv, 2002). Finally, evidence for a link between sleep and cognition during childhood comes from the study of sleep-disordered breathing (SDB; O’Brien et al., 2004). Several studies have shown an association between SDB, attention deficits, excessive daytime sleepiness, cognitive impairment, and poor learning in children (Halbower & Mahone, 2006; Urschitz et al., 2003). In addition, habitual snoring is found in 1 in 10 primary school children (Urschitz et al., 2003) and is associated with sleep fragmentation (Halbower & Mahone, 2006). Children who snored habitually had at least twice the risk of performing poorly at school. Notably, this association became stronger with increased snoring frequency (Urschitz et al., 2003). The relationship between habitual snoring and poor academic performance did not appear to be mediated via intermittent hypoxia because it was not diminished after excluding children with intermittent hypoxia in an overnight study (Urschitz et al., 2003).

Finally, there is evidence for early-onset sleep disturbances in children with several neurological and psychiatric disorders. For example, based on parent reports, autistic children show a prevalence of 44% to 83% for sleep disorders (Gail Williams, Sears, & Allard, 2004). Another example is Fragile X syndrome (FraX), the most common inherited cause of mental retardation. FraX patients show abnormal dendritic spine morphology, with more long and thin spines. Fragile X boys have difficulty in sleep maintenance compared to control subjects (Gould et al., 2000), suggesting again a possible link between sleep, synapses, and cognitive function.

In conclusion, the evidence for an association between sleep and cognitive development is intriguing, but it remains largely correlative, and the long-term effects of primary disturbances of sleep are unknown. Moreover, we still do not know whether there are critical time windows of cortical development that are particularly sensitive to sleep disturbances.

Old Age and Sleep

It is well known that memory function in general, and declarative memory in particular, progressively declines after the age of 30 (Prull, Gabrieli, & Bunge, 2000). This age-related
Sleep Disorders

It is estimated that 15% to 35% of people suffer from sleep disturbances (Weyerer & Dilling, 1991), including breathing disorders such as sleep apnea (Banno & Kryger, 2007), abnormal motor behaviors such as restless legs syndrome (Montplaisir, 2004), insomnia due to hyperarousal (Roth, Roehrs, & Pies, 2007), and narcolepsy (Dauvilliers, Arnulf, & Mignot, 2007). The prevalence of sleep disorders changes across life. For example, the onset of insomnia seems to take place around puberty (Johnson, Roth, Schultz, & Breslau, 2006), and the percentage of people suffering from insomnia increases with age, from about 2% in adolescents to about 27% in persons 70+ years of age (Figure 24.13; Weyerer & Dilling, 1991). There are also clear gender differences, with females showing a much stronger increase in insomnia prevalence than males. An increase in insomnia prevalence is also observable in psychiatric
Figure 24.13 Insomnia prevalence across life span. [Plot not reproduced; axes: insomnia prevalence (%) for males, females, and total versus age group (years), from 15–19 to 70+.]
Note. Moderate to severe insomnia within last 7 days. For both sexes, the prevalence of insomnia increases with age. From “Prevalence and Treatment of Insomnia in the Community: Results from the Upper Bavarian Field Study,” by S. Weyerer and H. Dilling, 1991, Sleep, 14, 392–398. Adapted with permission.
decline in memory performance is accompanied by structural and functional changes in cortical and subcortical areas (Hedden & Gabrieli, 2004; Hof & Morrison, 2004). Once again, changes in brain structure reflecting a decrease in synaptic density are associated with a continuous decrease in sleep slow-wave activity after 30 years of age (Carrier, Land, Buysse, Kupfer, & Monk, 2001; Feinberg, 1982; Landolt, Dijk, Achermann, & Borbély, 1996). Moreover, the sleep-dependent enhancement of declarative memory that occurs in young subjects decreases during midlife, in line with a decrease in early nocturnal SWS (Backhaus et al., 2007). However, sleep homeostasis seems to be intact in old age: relative SWA changes in response to high sleep pressure (sleep deprivation) and low sleep pressure (naps) were similar in old and young subjects (Cajochen, Munch, Knoblauch, Blatter, & Wirz-Justice, 2006). Unlike SWS, REM sleep does not change much after age 30 (Van Cauter, Leproult, & Plat, 2000). Since the near-complete loss of REM sleep due to brain stem lesions (Vertes & Eastman, 2000) or to antidepressant treatment does not seem to have significant consequences for memory performance, it is possible that the connection between REM sleep and plasticity may be limited to early development. Finally, old age is associated with a weakening of circadian regulation, as suggested by the diminished secretion of melatonin, the major circadian signal, in old subjects (Cajochen et al., 2006) and by a degeneration of the SCN observed in human postmortem studies (Hofman & Swaab, 2006). A dysregulation of the circadian timing system is especially pernicious in Alzheimer’s disease because it impairs the maintenance of a normal sleep-wake cycle (Mishima et al., 1999), with further negative consequences on memory (Cole & Richards, 2005).
Figure 24.14 Age at onset of insomnia. [Plot not reproduced; axes: cumulative incidence for girls and boys versus age (years), from 1 to 15.]
Note. Cumulative incidence of insomnia by gender. Females reported a significantly older median age at onset of insomnia (age 12) than did males (age 10). From “Epidemiology of DSM-IV Insomnia in Adolescence: Lifetime Prevalence, Chronicity, and an Emergent Gender Difference,” by E. O. Johnson, T. Roth, L. Schultz, and N. Breslau, 2006, Pediatrics, 117, e247–e256. Adapted with permission.
disorders—about 50% of patients suffering from depressive disorders also suffer from insomnia (Weyerer & Dilling, 1991), although it is not yet clear whether chronic insomnia is causally related to the development of depression. Other sleep disturbances show an age-specific incidence. Thus, parasomnias like sleepwalking are common during childhood (Kales, Soldatos, & Kales, 1987; Mahowald & Rosen, 1990), adolescence is associated with the occurrence of narcolepsy and the delayed sleep phase syndrome (Crowley, Acebo, & Carskadon, 2007; Dauvilliers et al., 2007),
obstructive sleep apnea (Tabba & Johnson, 2006) normally occurs in adults, and REM sleep behavior disorders appear more commonly in older adults, often anticipating by many years the development of Parkinson’s disease (Mahowald, Schenck, & Bornemann, 2007). A large percentage of sleep disorders, especially those related to old age, can be attributed to poor sleep hygiene. Measures for improving sleep hygiene include maintaining a regular daily schedule of activities, using the bedroom for rest and sleep rather than conflict and worry, and improving the sleep environment by minimizing noise and disruptions. Also, regular exercise, not close to bedtime, has been shown to increase early night slow-wave sleep in normal sleepers. In the elderly, special instructions include education regarding the effects of age on sleep patterns, discouraging multiple naps, and suggesting daytime activities such as hobbies and special interests (Kamel & Gammack, 2006; Vgontzas & Kales, 1999).
SUMMARY

Sleep is an active process that is tightly regulated. Thus, every night we cycle through a seemingly predefined series of discrete states (NREM and REM sleep), each with its characteristic activity pattern. Sleep need is also regulated and depends to a large extent on how long we stay awake. Moreover, the longer we stay awake the more intense sleep becomes, which is reflected by the homeostatic regulation of slow-wave activity (SWA), a quantification of the low-frequency (<4 Hz) EEG components. A recent hypothesis about the function of sleep now considers that there is a cellular need for sleep that is triggered by the induction of plastic changes during wakefulness. Our brain undergoes prominent plastic changes across the life span. A major change is the increase in connectivity until puberty and its decrease thereafter. Interestingly, the change in connectivity is paralleled by changes in sleep intensity, i.e., the amount of SWA, during childhood and adolescence. Thus, the question arises whether there exists a relationship between sleep and brain maturation.
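The dependence of sleep pressure on time awake can be sketched with the homeostatic component (Process S) of the two-process model, in its usual textbook form: a saturating exponential rise during waking and an exponential decline during sleep. The sketch below is only an illustration under stated assumptions: the function names and all three time constants are hypothetical placeholders, not values fitted in any study cited in this chapter. They are chosen merely to reproduce the qualitative pattern reported for prepubertal children versus adolescents (faster buildup, similar decline).

```python
import math

# Illustrative constants only (hypothetical, not fitted values from any cited study).
TAU_RISE_CHILD = 9.0        # hours; faster buildup assumed for prepubertal children
TAU_RISE_ADOLESCENT = 15.0  # hours; slower buildup assumed for adolescents
TAU_DECAY = 4.0             # hours; the same decline during sleep assumed for both


def pressure_after_wake(s_start, hours_awake, tau_rise, upper=1.0):
    """Process S after a waking bout: saturating exponential rise toward `upper`."""
    return upper - (upper - s_start) * math.exp(-hours_awake / tau_rise)


def pressure_after_sleep(s_start, hours_asleep, tau_decay, lower=0.0):
    """Process S after a sleep bout: exponential decline toward `lower`."""
    return lower + (s_start - lower) * math.exp(-hours_asleep / tau_decay)


# Starting from the same (assumed) morning level, compare both groups across the day.
for hours in (4, 8, 12, 16):
    child = pressure_after_wake(0.2, hours, TAU_RISE_CHILD)
    adolescent = pressure_after_wake(0.2, hours, TAU_RISE_ADOLESCENT)
    print(f"{hours:2d} h awake: child S = {child:.2f}, adolescent S = {adolescent:.2f}")
```

With a shorter rise time constant, the modeled child reaches any given level of sleep pressure after fewer hours awake, which is the sense in which young children may be said to live under higher sleep pressure at the end of the day.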
REFERENCES

Achermann, P., & Borbély, A. A. (2003). Mathematical models of sleep regulation. Frontiers in Bioscience, 8, S683–S693.
Achermann, P., Dijk, D. J., Brunner, D. P., & Borbély, A. A. (1993). A model of human sleep homeostasis based on EEG slow-wave activity: Quantitative comparison of data and simulations. Brain Research Bulletin, 31, 97–113.
Aeschbach, D., Cajochen, C., Landolt, H., & Borbély, A. A. (1996). Homeostatic sleep regulation in habitual short sleepers and long sleepers. Journal of General Physiology, 270, R41–R53.
Aeschbach, D., Postolache, T. T., Sher, L., Matthews, J. R., Jackson, M. A., & Wehr, T. A. (2001). Evidence from the waking electroencephalogram that short sleepers live under higher homeostatic sleep pressure than long sleepers. Neuroscience, 102, 493–502.
Aicardi, G., Argilli, E., Cappello, S., Santi, S., Riccio, M., Thoenen, H., et al. (2004). Induction of long-term potentiation and depression is reflected by corresponding changes in secretion of endogenous brain-derived neurotrophic factor. Proceedings of the National Academy of Sciences, USA, 101, 15788–15792.
Alfoldi, P., Tobler, I., & Borbély, A. A. (1990). Sleep regulation in rats during early development. Journal of General Physiology, 258, R634–R644.
Anders, T. F., & Roffwarg, H. P. (1973). The effects of selective interruption and deprivation of sleep in the human newborn. Developmental Psychobiology, 6, 77–89.
Aserinsky, E., & Kleitman, N. (1953, September 4). Regularly occurring periods of eye motility, and concomitant phenomena, during sleep. Science, 118, 273–274.
Aston-Jones, G. (2005). Brain structures and receptors involved in alertness. Sleep Medicine, 6(Suppl. 1), S3–S7.
Aston-Jones, G., & Bloom, F. E. (1981). Activity of norepinephrine-containing locus coeruleus neurons in behaving rats anticipates fluctuations in the sleep-waking cycle. Journal of Neuroscience, 1, 876–886.
Backhaus, J., Born, J., Hoeckesfeld, R., Fokuhl, S., Hohagen, F., & Junghanns, K. (2007). Midlife decline in declarative memory consolidation is correlated with a decline in slow wave sleep. Learning and Memory, 14, 336–341.
Banks, S., & Dinges, D. F. (2007). Behavioral and physiological consequences of sleep restriction. Journal of Clinical Sleep Medicine, 3, 519–528.
Banno, K., & Kryger, M. H. (2007). Sleep apnea: Clinical investigations in humans. Sleep Medicine, 8, 400–426.
Barco, A., Patterson, S., Alarcon, J. M., Gromova, P., Mata-Roig, M., Morozov, A., et al. (2005). Gene expression profiling of facilitated L-LTP in VP16-CREB mice reveals that BDNF is critical for the maintenance of LTP and its synaptic capture. Neuron, 48, 123–137.
Barger, L. K., Ayas, N. T., Cade, B. E., Cronin, J. W., Rosner, B., Speizer, F. E., et al. (2006). Impact of extended-duration shifts on medical errors, adverse events, and attentional failures. PLoS Medicine, 3, e487.
Basheer, R., Strecker, R. E., Thakkar, M. M., & McCarley, R. W. (2004). Adenosine and sleep-wake regulation. Progress in Neurobiology, 73, 379–396.
Benington, J. H., & Heller, H. C. (1995). Restoration of brain energy metabolism as the function of sleep. Progress in Neurobiology, 45, 347–360.
Berger, H. (1929). On the electroencephalogram of man. Archives of Psychiatrie, 87, 527–570.
Bes, F., Schulz, H., Navelet, Y., & Salzarulo, P. (1991). The distribution of slow-wave sleep across the night: A comparison for infants, children, and adults. Sleep, 14, 5–12.
Blake, H., & Gerard, R. W. (1937). Brain potentials during sleep. American Journal of Physiology, 119, 692–703.
Blanco-Centurion, C., Xu, M., Murillo-Rodriguez, E., Gerashchenko, D., Shiromani, A. M., Salin-Pascual, R. J., et al. (2006). Adenosine and sleep homeostasis in the Basal forebrain. Journal of Neuroscience, 26, 8092–8100. Bliss, T. V., & Collingridge, G. L. (1993, January 7). A synaptic model of memory: Long-term potentiation in the hippocampus. Nature, 361, 31–39. Bliss, T. V., & Lomo, T. (1973). Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. Journal of Physiology, 232, 331–356. Blumberg, M. S., & Lucas, D. E. (1996). A developmental and component analysis of active sleep. Developmental Psychobiology, 29, 1–22.
Bonnet, M. H., & Arand, D. L. (2003). Clinical effects of sleep fragmentation versus sleep deprivation. Sleep Medicine Reviews, 7, 297–310.
De Gennaro, L., & Ferrara, M. (2003). Sleep spindles: An overview. Sleep Medicine Reviews, 7, 423–440.
Borbély, A. A. (1982). A two process model of sleep regulation. Human Neurobiology, 1, 195–204.
Dement, W., & Kleitman, N. (1957). Cyclic variations in EEG during sleep and their relation to eye movements, body motility, and dreaming. Electroencephalography and Clinical Neurophysiology, 9, 673–690.
Borbély, A. A., & Achermann, P. (2000). Homeostasis of human sleep and models of sleep regulation. In M. H. Kryger, T. Roth, & W. C. Dement (Eds.), Principles and practice of sleep medicine (pp. 377–390). Philadelphia: Saunders. Borbély, A. A., & Tononi, G. (1998). The quest for the essence of sleep. Daedalus, 127, 167–196. Braun, A. R., Balkin, T. J., Wesenten, N. J., Carson, R. E., Varga, M., Baldwin, P., et al. (1997). Regional cerebral blood flow throughout the sleep-wake cycle. An H2(15)O PET study. Brain, 120(Pt. 7), 1173–1197. Cajochen, C., Munch, M., Knoblauch, V., Blatter, K., & Wirz-Justice, A. (2006). Age-related changes in the circadian and homeostatic regulation of human sleep. Chronobiology International, 23, 461–474. Campbell, I. G., Darchia, N., Khaw, W. Y., Higgins, L. M., & Feinberg, I. (2005). Sleep EEG evidence of sex differences in adolescent brain maturation. Sleep, 28, 637–643. Carrier, J., Land, S., Buysse, D. J., Kupfer, D. J., & Monk, T. H. (2001). The effects of age and gender on sleep EEG power spectral density in the middle years of life (ages 20–60 years old). Psychophysiology, 38, 232–242. Carskadon, M. A., & Acebo, C. (2002). Regulation of sleepiness in adolescents: Update, insights, and speculation. Sleep, 25, 606–614. Carskadon, M. A., Acebo, C., & Jenni, O. G. (2004). Regulation of adolescent sleep: Implications for behavior. Annals of the New York Academy of Sciences, 1021, 276–291. Carskadon, M. A., Acebo, C., Richardson, G. S., Tate, B. A., & Seifer, R. (1997). An approach to studying circadian rhythms of adolescent humans. Journal of Biological Rhythms, 12, 278–289. Carskadon, M. A., Harvey, K., Duke, P., Anders, T. F., Litt, I. F., & Dement, W. C. (1980). Pubertal changes in daytime sleepiness. Sleep, 2, 453–460. Chase, M. H., & Morales, F. R. (1990). The atonia and myoclonia of active (REM) sleep. Annual Review of Psychology, 41, 557–584. Chugani, H. T. (1998). 
A critical period of brain development: Studies of cerebral glucose utilization with PET. Preventive Medicine, 27, 184–188. Cirelli, C., Gutierrez, C. M., & Tononi, G. (2004). Extensive and divergent effects of sleep and wakefulness on brain gene expression. Neuron, 41, 35–43. Cirelli, C., Pompeiano, M., & Tononi, G. (1996, November 15). Neuronal gene expression in the waking state: A role for the locus coeruleus. Science, 274, 1211–1215. Cirelli, C., & Tononi, G. (2000). Differential expression of plasticity-related genes in waking and sleep and their regulation by the noradrenergic system. Journal of Neuroscience, 20, 9187–9194.
Dinges, D. F. (2006). The state of sleep deprivation: From functional biology to functional consequences. Sleep Medicine Reviews, 10, 303–305. Esser, S. K., Hill, S., & Tononi, G. (2007). Sleep homeostasis and cortical synchronization: Pt. I. Modeling the effects of synaptic strength on sleep slow waves. Sleep, 30, 1617–1630. Feinberg, I. (1982). Schizophrenia: Caused by a fault in programmed synaptic elimination during adolescence? Journal of Psychiatric Research, 17, 319–334. Feinberg, I., Higgins, L. M., Khaw, W. Y., & Campbell, I. G. (2006). The adolescent decline of NREM delta, an indicator of brain maturation, is linked to age and sex but not to pubertal stage. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 291, R1724–R1729. Feinberg, I., Koresko, R.L. & Heller, N. (1967) EEG sleep patterns as a function of normal and pathological aging in man. J Psychiatr Res, 5, 107–144. Frank, M. G., & Heller, H. C. (1997). Development of REM and slow wave sleep in the rat. Journal of General Physiology, 272, R1792–R1799. Frank, M. G., Issa, N. P., & Stryker, M. P. (2001). Sleep enhances plasticity in the developing visual cortex. Neuron, 30, 275–287. Frank, M. G., Morrissette, R., & Heller, H. C. (1998). Effects of sleep deprivation in neonatal rats. Journal of General Physiology, 275, R148–R157. Franken, P., Chollet, D., & Tafti, M. (2001). The homeostatic regulation of sleep need is under genetic control. Journal of Neuroscience, 21, 2610–2621. Franken, P., Gip, P., Hagiwara, G., Ruby, N. F., & Heller, H. C. (2003). Changes in brain glycogen after sleep deprivation vary with genotype. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 285, R413–R419. Franken, P., Gip, P., Hagiwara, G., Ruby, N. F., & Heller, H. C. (2006). Glycogen content in the cerebral cortex increases with sleep loss in C57BL/6J mice. Neuroscience Letters, 402, 176–179. Franken, P., Tobler, I., & Borbély, A. A. (1991). 
Sleep homeostasis in the rat: Simulation of the time course of EEG slow-wave activity. Neuroscience Letters, 130, 141–144.
Cirelli, C., & Tononi, G. (2004). Locus ceruleus control of state-dependent gene expression. Journal of Neuroscience, 24, 5410–5419.
Fuller, P. M., Gooley, J. J., & Saper, C. B. (2006). Neurobiology of the sleep-wake cycle: Sleep architecture, circadian regulation, and regulatory feedback. Journal of Biological Rhythms, 21, 482–493.
Cole, C. S., & Richards, K. C. (2005). Sleep and cognition in people with Alzheimer’s disease. Issues in Mental Health Nursing, 26, 687–698.
Gail Williams, P., Sears, L. L., & Allard, A. (2004). Sleep problems in children with autism. Journal of Sleep Research, 13, 265–268.
Crowley, S. J., Acebo, C., & Carskadon, M. A. (2007). Sleep, circadian rhythms, and delayed phase in adolescence. Sleep Medicine, 8, 602–612.
Gais, S., & Born, J. (2004). Declarative memory consolidation: Mechanisms acting during human sleep. Learning and Memory, 11, 679–685.
Czarnecki, A., Birtoli, B., & Ulrich, D. (2007). Cellular mechanisms of burst firing-mediated long-term depression in rat neocortical pyramidal cells. Journal of Physiology, 578, 471–479.
Gais, S., Molle, M., Helms, K., & Born, J. (2002). Learning-dependent increases in sleep spindle density. Journal of Neuroscience, 22, 6830–6834.
Daan, S., Domien, G. M. B., & Borbély, A. A. (1984). Timing of human sleep: Recovery process gated by a circadian pacemaker. American Journal of Physiology, 246, R161–R178.
Gao, P. P., Yue, Y., Cerretti, D. P., Dreyfus, C., & Zhou, R. (1999). Ephrin-dependent growth and pruning of hippocampal axons. Proceedings of the National Academy of Sciences, USA, 96, 4073–4077.
Dauvilliers, Y., Arnulf, I., & Mignot, E. (2007). Narcolepsy with cataplexy. Lancet, 369, 499–511.
Gip, P., Hagiwara, G., Ruby, N. F., & Heller, H. C. (2002). Sleep deprivation decreases glycogen in the cerebellum but not in the cortex of young rats. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 283, R54–R59.
De Felipe, J., Marco, P., Fairen, A., & Jones, E. G. (1997). Inhibitory synaptogenesis in mouse somatosensory cortex. Cerebral Cortex, 7, 619–634.
Dierker, L. J., Rosen, M. G., Pillay, S., & Sorokin, Y. (1982). Correlation between gestational-age and fetal activity periods. Biology of the Neonate, 42, 66–72.
Glantz, L. A., Gilmore, J. H., Hamer, R. M., Lieberman, J. A., & Jarskog, L. F. (2007). Synaptophysin and postsynaptic density protein 95 in the human prefrontal cortex from mid-gestation into early adulthood. Neuroscience, 149, 582–591. Gould, E. L., Loesch, D. Z., Martin, M. J., Hagerman, R. J., Armstrong, S. M., & Huggins, R. M. (2000). Melatonin profiles and sleep characteristics in boys with fragile X syndrome: A preliminary study. American Journal of Medical Genetics, 95, 307–315. Hairston, I. S., Peyron, C., Denning, D. P., Ruby, N. F., Flores, J., Sapolsky, R. M., et al. (2004). Sleep deprivation effects on growth factor expression in neonatal rats: A potential role for BDNF in the mediation of delta power. Journal of Neurophysiology, 91, 1586–1595. Halbower, A. C., & Mahone, E. M. (2006). Neuropsychological morbidity linked to childhood sleep-disordered breathing. Sleep Medicine Reviews, 10, 97–107. Hedden, T., & Gabrieli, J. D. (2004). Insights into the ageing mind: A view from cognitive neuroscience. Nature Reviews: Neuroscience, 5, 87–96. Hobson, J. A., Pace-Schott, E. F., & Stickgold, R. (2000). Dreaming and the brain: Toward a cognitive neuroscience of conscious states. Behavioral and Brain Sciences, 23, 793–842; discussion 904–1121. Hof, P. R., & Morrison, J. H. (2004). The aging brain: Morphomolecular senescence of cortical circuits. Trends in Neurosciences, 27, 607–613. Hofman, M. A., & Swaab, D. F. (2006). Living by the clock: The circadian pacemaker in older people. Ageing Research Reviews, 5, 33–51. Horne, J. A. (1980). Sleep and body restitution. Experientia, 36, 11–13.
Jenni, O. G., Deboer, T., & Achermann, P. (2006). Development of the 24-h rest-activity pattern in human infants. Infant Behavior and Development, 29, 143–152. Jha, S. K., Jones, B. E., Coleman, T., Steinmetz, N., Law, C. T., Griffin, G., et al. (2005). Sleep-dependent plasticity requires cortical activity. Journal of Neuroscience, 25, 9266–9274. Johnson, E. O., Roth, T., Schultz, L., & Breslau, N. (2006). Epidemiology of DSM-IV insomnia in adolescence: Lifetime prevalence, chronicity, and an emergent gender difference. Pediatrics, 117, e247–e256. Jones, B. E. (2003). Arousal systems. Frontiers in Bioscience, 8, s438–s451. Jones, B. E. (2005). From waking to sleeping: Neuronal and chemical substrates. Trends in Pharmacological Sciences, 26, 578–586. Jouvet, M. (1962). [Research on the neural structures and responsible mechanisms in different phases of physiological sleep.]. Archives Italiennes de Biologie, 100, 125–206. Jouvet, M. (1965). Paradoxical sleep: A study of its nature and mechanisms. Progress in Brain Research, 18, 20–62. Jouvet, M. (1994). Paradoxical sleep mechanisms. Sleep, 17, S77–S83. Jouvet, M. (1998). Paradoxical sleep as a programming system. Journal of Sleep Research, 7(Suppl, 1), 1–5. Jouvet-Mounier, D., Astic, L., & Lacote, D. (1970). Ontogenesis of the states of sleep in rat, cat, and guinea pig during the first postnatal month. Developmental Psychobiology, 2, 216–239.
Hua, J. Y., & Smith, S. J. (2004). Neural activity and the dynamics of central nervous system development. Journal of Neuroscience, 7, 327–332.
Kales, A., Soldatos, C. R., & Kales, J. D. (1987). Sleep disorders: Insomnia, sleepwalking, night terrors, nightmares, and enuresis. Annals of Internal Medicine, 106, 582–592.
Huang, Z. L., Urade, Y., & Hayaishi, O. (2007). Prostaglandins and adenosine in the regulation of sleep and wakefulness. Current Opinion in Pharmacology, 7, 33–38.
Kamel, N. S., & Gammack, J. K. (2006). Insomnia in the elderly: Cause, approach, and treatment. American Journal of Medicine, 119, 463–469.
Huber, R., Deboer, T., & Tobler, I. (2000). Effects of sleep deprivation on sleep and sleep EEG in three mouse strains: Empirical data and simulations. Brain Research, 857, 8–19. Huber, R., Ghilardi, M. F., Massimini, M., Ferrarelli, F., Riedner, B. A., Peterson, M. J., et al. (2006). Arm immobilization causes cortical plastic changes and locally decreases sleep slow wave activity. Journal of Neuroscience, 9, 1169–1176. Huber, R., Ghilardi, M. F., Massimini, M., & Tononi, G. (2004, July 1). Local sleep and learning. Nature, 430, 78–81. Huber, R., Tononi, G., & Cirelli, C. (2007). Exploratory behavior, cortical BDNF expression, and sleep homeostasis. Sleep, 30, 129–139. Huttenlocher, P. R., & Dabholkar, A. S. (1997). Regional differences in synaptogenesis in human cerebral cortex. Journal of Comparative Neurology, 387, 167–178. Iber, C., Ancoli-Israel, S., Chesson, A. & Quan, S.F. (2007) The AASM manual for the scoring of sleep and associated events: rules, terminology and technical specifications. In Medicine, A.A.o.S. (ed), Westchester, Illinois. Iglowstein, I., Jenni, O. G., Molinari, L., & Largo, R. H. (2003). Sleep duration from infancy to adolescence: Reference values and generational trends. Pediatrics, 111, 302–307. Jenni, O. G., Achermann, P., & Carskadon, M. A. (2005). Homeostatic sleep regulation in adolescents. Sleep, 28, 1446–1454. Jenni, O. G., Borbély, A. A., & Achermann, P. (2004). Development of the nocturnal sleep electroencephalogram in human infants. American Journal of Physiology: Regulatory, Integrative and Comparative Physiology, 286, R528–R538.
Karni, A., Tanne, D., Rubenstein, B. S., Askenasy, J. J., & Sagi, D. (1994, July 29). Dependence on REM sleep of overnight improvement of a perceptual skill. Science, 265, 679–682. Kennaway, D. J., Stamp, G. E., & Goble, F. C. (1992). Development of melatonin production in infants and the impact of prematurity. Journal of Clinical Endocrinology and Metabolism, 75, 367–369. Kintraia, P. I., Zarnadze, M. G., Kintraia, N. P., & Kashakashvili, I. G. (2005). Development of daily rhythmicity in heart rate and locomotor activity in the human fetus. Journal of Circadian Rhythms, 3, 5. Klackenberg, G. (1982). Sleep behavior studied longitudinally: Data from 4–16 years on duration, night-awakening and bed-sharing. Acta Paediatrica Scandinavica, 71, 501–506. Kleitmann, N., & Engelmann, T. G. (1953). Sleep characteristics of infants. Journal of Applied Physiology, 6, 269–282. Kripke, D. F., Garfinkel, L., Wingard, D. L., Klauber, M. R., & Marler, M. R. (2002). Mortality associated with sleep duration and insomnia. Archives of General Psychiatry, 59, 131–136. Kripke, D. F., Simons, R. N., Garfinkel, L., & Hammond, E. C. (1979). Short and long sleep and sleeping pills. Is increased mortality associated? Archives of General Psychiatry, 36, 103–116. Krueger, J. M., Obal, F. J., Fang, J., Kubota, T., & Taishi, P. (2001). The role of cytokines in physiological sleep regulation. Annals of the New York Academy of Sciences, 933, 211–221. Landolt, H. P., Dijk, D. J., Achermann, P., & Borbély, A. A. (1996). Effect of age on the sleep EEG: Slow-wave activity and spindle frequency activity in young and middle-aged men. Brain Research, 738, 205–212.
Jenni, O. G., & Carskadon, M. A. (2004). Spectral analysis of the sleep electroencephalogram during adolescence. Sleep, 27, 774–783.
Largo, R.H., Molinari, L., von Siebenthal, K. & Wolfensberger, U. (1996) Does a profound change in toilet-training affect development of bowel and bladder control? Dev Med Child Neurol, 38, 1106–1116.
Jenni, O.G. & Carskadon, M.A. (2007) Sleep behavior and sleep regulation from infancy through adolescence: Normative aspects. Sleep Medicine Clinics, 2, 321–329.
Lindsley, D. B., Bowden, J. W., & Magoun, H. W. (1949). Effect upon the EEG of acute injury to the brainstem activating system. EEG Clinical Neurophysiology, 1, 475–486.
c24.indd 478
8/18/09 5:51:46 PM
References Linkowski, P., Kerkhofs, M., Hauspie, R., Susanne, C., & Mendlewicz, J. (1989). EEG sleep patterns in man: A twin study. Electroencephalography and Clinical Neurophysiology, 73, 279–284.
Partinen, M., Kaprio, J., Koskenvuo, M., Putkonen, P., & Langinvainio, H. (1983). Genetic and environmental determination of human sleep. Sleep, 6, 179–185.
Llinas, R. R., & Steriade, M. (2006). Bursting of thalamic neurons and states of vigilance. Journal of Neurophysiology, 95, 3297–3308.
Paus, T. (2005). Mapping brain maturation and cognitive development during adolescence. Trends in Cognitive Sciences, 9, 60–68.
Mahowald, M. W., & Rosen, G. M. (1990). Parasomnias in children. Pediatrician, 17, 21–31.
Peigneux, P., Laureys, S., Delbeuck, X., & Maquet, P. (2001). Sleeping brain, learning brain: The role of sleep for memory systems. NeuroReport, 12, A111–A124.
Mahowald, M. W., Schenck, C. H., & Bornemann, M. A. (2007). Pathophysiologic mechanisms in REM sleep behavior disorder. Current Neurology and Neuroscience Reports, 7, 167–172. Massimini, M., Huber, R., Ferrarelli, F., Hill, S., & Tononi, G. (2004). The sleep slow oscillation as a traveling wave. Journal of Neuroscience, 24, 6862–6870. McCormick, D. A., & Bal, T. (1997). Sleep and arousal: Thalamocortical mechanisms. Annual Review of Neuroscience, 20, 185–215. McCormick, D. A., & Pape, H. C. (1990). Properties of a hyperpolarization-activated cation current and its role in rhythmic oscillation in thalamic relay neurones. Journal of Physiology, 431, 291–318. McGinty, D., Gong, H., Suntsova, N., Alam, M. N., Methippara, M., Guzman-Marin, R., et al. (2004). Sleep-promoting functions of the hypothalamic median preoptic nucleus: Inhibition of arousal systems. Archives Italiennes de Biologie, 142, 501–509.
Peigneux, P., Laureys, S., Fuchs, S., Collette, F., Perrin, F., Reggers, J., et al. (2004). Are spatial memories strengthened in the human hippocampus during slow wave sleep? Neuron, 44, 535–545. Porkka-Heiskanen, T., Alanko, L., Kalinchuk, A., & Stenberg, D. (2002). Adenosine and sleep. Sleep Medicine Reviews, 6, 321–332. Porkka-Heiskanen, T., Strecker, R. E., Thakkar, M., Bjorkum, A. A., Greene, R. W., & McCarley, R. W. (1997, May 23). Adenosine: A mediator of the sleep-inducing effects of prolonged wakefulness. Science, 276, 1265–1268. Prull, M. W., Gabrieli, J. D. E., & Bunge, S. A. (2000). Age-related changes in memory: A cognitive neuroscience perspective. In F. I. M. Craik & T. A. Salthouse (Eds.), The handbook of aging and cognition (2nd ed., pp. 91–153). Mahawah, NJ: Lawrence Earlbaum.
McGinty, D., & Szymusiak, R. (2003). Hypothalamic regulation of sleep and arousal. Frontiers in Bioscience, 8, s1074–s1083.
Radun, I., & Summala, H. (2004). Sleep-related fatal vehicle accidents: Characteristics of decisions made by multidisciplinary investigation teams. Sleep, 27, 224–227.
Mednick, S., Nakayama, K., & Stickgold, R. (2003). Sleep-dependent learning: A nap is as good as a night. Journal of Neuroscience, 6, 697–698.
Rechtschaffen, A., Hauri, P., & Zeitlin, M. (1966). Auditory awakening thresholds in REM and NREM sleep stages. Perceptual and Motor Skills, 22, 927–942.
Meier-Koll, A., Bussmann, B., Schmidt, C., & Neuschwander, D. (1999). Walking through a maze alters the architecture of sleep. Perceptual and Motor Skills, 88, 1141–1159.
Reppert, S. M., & Schwartz, W. J. (1984). The suprachiasmatic nuclei of the fetal-rat: Characterization of a functional circadian clock using C14-labeled deoxyglucose. Journal of Neuroscience, 4, 1677–1682.
Mirmiran, M., Scholtens, J., van de Poll, N. E., Uylings, H. B., van der Gugten, J., & Boer, G. J. (1983). Effects of experimental suppression of active (REM) sleep during early development upon adult brain and behavior in the rat. Brain Research, 283, 277–286.
Riedner, B. A., Vyazovskiy, V. V., Huber, R., Massimini, M., Esser, S. K., Murphy, M., et al. (2007). Sleep homeostasis and cortical synchronization: Pt. III. A high-density EEG study of sleep slow waves in humans. Sleep, 30, 1643–1657.
Mishima, K., Tozawa, T., Satoh, K., Matsumoto, Y., Hishikawa, Y., & Okawa, M. (1999). Melatonin secretion rhythm disorders in patients with senile dementia of Alzheimer ’s type with disturbed sleep-waking. Biological Psychiatry, 45, 417–421.
Rivkees, S. A., & Hao, H. P. (2000). Developing circadian rhythmicity. Seminars in Perinatology, 24, 232–242.
Mistlberger, R. E. (2005). Circadian regulation of sleep in mammals: Role of the suprachiasmatic nucleus. Brain Research: Brain Research Reviews, 49, 429–454. Mistlberger, R. E., Bergmann, B. M., Waldenar, W., & Rechtschaffen, A. (1983). Recovery sleep following sleep deprivation in intact and suprachiasmatic nuclei-lesioned rats. Sleep, 6, 217–233. Miyamoto, H., Katagiri, H., & Hensch, T. (2003). Experience-dependent slow-wave sleep development. Nature Neuroscience, 6, 553–554. Montplaisir, J. (2004). Abnormal motor behavior during sleep. Sleep Medicine, 5(Suppl. 1), S31–S34. Moruzzi, G., & Magoun, H. W. (1949). Brain stem reticular formation and activation of the EEG. Electroencephalography and Clinical Neurophysiology, 1, 455–473. Mulder, E. J., Visser, G. H., Bekedam, D. J., & Prechtl, H. F. (1987). Emergence of behavioural states in fetuses of type-1-diabetic women. Early Human Development, 15, 231–251. Nakamura, H., Kobayashi, S., Ohashi, Y., & Ando, S. (1999). Age-changes of brain synapses and synaptic plasticity in response to an enriched environment. Journal of Neuroscience Research, 56, 307–315.
c24.indd 479
479
Roenneberg, T., Kuehnle, T., Pramstaller, P. P., Ricken, J., Havel, M., Guth, A., et al. (2004). A marker for the end of adolescence. Current Biology, 14, R1038–R1039. Roffwarg, H. P., Muzio, J. N., & Dement, W. C. (1966, April 29). Ontogenetic development of the human sleep-dream cycle. Science, 152, 604–619. Rosanova, M., & Ulrich, D. (2005). Pattern-specific associative long-term potentiation induced by a sleep spindle-related spike train. Journal of Neuroscience, 25, 9398–9405. Roth, T., Roehrs, T., & Pies, R. (2007). Insomnia: Pathophysiology and implications for treatment. Sleep Medicine Reviews, 11, 71–79. Sadeh, A., Gruber, R., & Raviv, A. (2002). Sleep, neurobehavioral functioning, and behavior problems in school-age children. Child Development, 73, 405–417. Sadeh, A., Raviv, A., & Gruber, R. (2000). Sleep patterns and sleep disruptions in school-age children. Developmental Psychology, 36, 291–301. Saper, C. B., Lu, J., Chou, T. C., & Gooley, J. (2005). The hypothalamic integrator for circadian rhythms. Trends in Neurosciences, 28, 152–157. Saper, C. B., Scammell, T. E., & Lu, J. (2005, October 27). Hypothalamic regulation of sleep and circadian rhythms. Nature, 437, 1257–1263.
O’Brien, L. M., Mervis, C. B., Holbrook, C. R., Bruner, J. L., Smith, N. H., McNally, N., et al. (2004). Neurobehavioral correlates of sleep-disordered breathing in children. Journal of Sleep Research, 13, 165–172.
Schmidt, C., Peigneux, P., Muto, V., Schenkel, M., Knoblauch, V., Munch, M., et al. (2006). Encoding difficulty promotes postlearning changes in sleep spindle activity during napping. Journal of Neuroscience, 26, 8976–8982.
Parmelee, A. H. (1961). Sleep patterns in infancy: A study of one infant from birth to 8 months of age. Acta Paediatrica, 50, 160.
Shaffery, J. P., Lopez, J., Bissette, G., & Roffwarg, H. P. (2006). Rapid eye movement sleep deprivation in post-critical period, adolescent rats
8/18/09 5:51:47 PM
480
Sleep and Waking Across the Life Span
alters the balance between inhibitory and excitatory mechanisms in visual cortex. Neuroscience Letters, 393, 131–135. Shaffery, J. P., Oksenberg, A., Marks, G. A., Speciale, S. G., Mihailoff, G., & Roffwarg, H. P. (1998). REM sleep deprivation in monocularly occluded kittens reduces the size of cells in LGN monocular segment. Sleep, 21, 837–845. Shaffery, J. P., Sinton, C. M., Bissette, G., Roffwarg, H. P., & Marks, G. A. (2002). Rapid eye movement sleep deprivation modifies expression of long-term potentiation in visual cortex of immature rats. Neuroscience, 110, 431–443. Shaw, P. J., Cirelli, C., Greenspan, R. J., & Tononi, G. (2000, March 10). Correlates of sleep and waking in Drosophila melanogaster. Science, 287, 1834–1837. Shaw, P.J., Greenstein, D., Lerch, J., Clasen, L., Lenroot, R., Gogtay, N., et al. (2006, March 30). Intellectual ability and cortical development in children and adolescents. Nature, 440, 676–679. Sherin, J. E., Shiromani, P. J., McCarley, R. W., & Saper, C. B. (1996, January 12). Activation of ventrolateral preoptic neurons during sleep. Science, 271, 216–219. Shibagaki, M., Kiyono, S., & Watanabe, K. (1982). Spindle evolution in normal and mentally retarded children: A review. Sleep, 5, 47–57. Siegel, J. M. (2005, October 27). Clues to the functions of mammalian sleep. Nature, 437, 1264–1271. Sirota, A., Csicsvari, J., Buhl, D., & Buzsaki, G. (2003). Communication between neocortex and hippocampus during sleep in rodents. Proceedings of the National Academy of Sciences, USA, 100, 2065–2069. Skaggs, W. E., & McNaughton, B. L. (1996, March 29). Replay of neuronal firing sequences in rat hippocampus during sleep following spatial experience. Science, 271, 1870–1873. Spiegel, K., Leproult, R., & Van Cauter, E. (1999). Impact of sleep debt on metabolic and endocrine function. Lancet, 354, 1435–1439. Steriade, M. (1999). Coherent oscillations and short-term plasticity in corticothalamic networks. Trends in Neurosciences, 22, 337–345. 
Steriade, M., McCormick, D. A., & Sejnowski, T. J. (1993, October, 29). Thalamocortical oscillations in the sleeping and aroused brain. Science, 262, 679–685.
states, breathing events, peripheral chemoresponsiveness and arousal propensity in healthy 3 month old infants. European Respiratory Journal, 9, 932–938. Thorpy, M. J., Korman, E., Spielman, A. J., & Glovinsky, P. B. (1988). Delayed sleep phase syndrome in adolescents. Journal of Adolescent Health Care, 9, 22–27. Tobler, I. I. (2000). Phylogeny of sleep regulation. In M. H. Kryger, T. Roth, & W. C. Dement (Eds.), Principles and practice of sleep medicine (pp. 72–81). Philadelphia: Saunders. Tobler, I. I., Borbély, A. A., & Groos, G. (1983). The effect of sleep deprivation on sleep in rats with suprachiasmatic lesions. Neuroscience Letters, 42, 49–54. Tobler, I. I., Franken, P., Trachsel, L., & Borbély, A. A. (1992). Models of sleep regulation in mammals. Journal of Sleep Research, 1, 125–127. Tononi, G., & Cirelli, C. (2003). Sleep and synaptic homeostasis: A hypothesis. Brain Research Bulletin, 62, 143–150. Tononi, G., & Cirelli C. (2006). Sleep function and synaptic homeostasis. Sleep Medicine Reviews, 10, 49–62. Trachsel, L., Edgar, D. M., Seidel, W. F., Heller, H. C., & Dement, W. C. (1992). Sleep homeostasis in suprachiasmatic nuclei-lesioned rats: Effects of sleep deprivation and triazolam administration. Brain Research, 589, 253–261. Trachtenberg, J. T., Chen, B. E., Knott, G. W., Feng, G., Sanes, J. R., Welker, E., et al. (2002, December 19). Long-term in vivo imaging of experience-dependent synaptic plasticity in adult cortex. Nature, 420, 788–794. Turrigiano, G. G., & Nelson, S. B. (2000). Hebb and homeostasis in neuronal plasticity. Current Opinion in Neurobiology, 10, 358–364. Turrigiano, G. G., & Nelson, S. B. (2004). Homeostatic plasticity in the developing nervous system. Nature Reviews: Neuroscience, 5, 97–107. Urschitz, M. S., Guenther, A., Eggebrecht, E., Wolff, J., UrschitzDuprat, P. M., Schlaud, M., et al. (2003). Snoring, intermittent hypoxia and academic performance in primary school children. 
American Journal of Respiratory and Critical Care Medicine, 168, 464–468.
Steriade, M., & Timofeev, I. (2003). Neuronal plasticity in thalamocortical networks during sleep and waking oscillations. Neuron, 37, 563–576.
Van Cauter, E., Leproult, R., & Plat, L. (2000). Age-related changes in slow wave sleep and REM sleep and relationship with growth hormone and cortisol levels in healthy men. Journal of the American Medical Association, 284, 861–868.
Stickgold, R., James, L., & Hobson, J. A. (2000). Visual discrimination learning requires sleep after training. Nature Neuroscience, 3, 1237–1238.
Vertes, R. P., & Eastman, K. E. (2000). The case against memory consolidation in REM sleep. Behavioral and Brain Sciences, 23, 867–876; discussion 904–1121.
Suntsova, N., Szymusiak, R., Alam, M. N., Guzman-Marin, R., & McGinty, D. (2002). Sleep-waking discharge patterns of median preoptic nucleus neurons in rats. Journal of Physiology, 543, 665–677.
Vgontzas, A. N., & Kales, A. (1999). Sleep and its disorders. Annual Review of Medicine, 50, 387–400.
Swaab, D. F., Hofman, M. A., & Honnebier, M. B. O. M. (1990). Development of vasopressin neurons in the human suprachiasmatic nucleus in relation to birth. Developmental Brain Research, 52, 289–293. Szymusiak, R., Alam, N., Steininger, T. L., & McGinty, D. (1998). Sleepwaking discharge patterns of ventrolateral preoptic/anterior hypothalamic neurons in rats. Brain Research, 803, 178–188. Szymusiak, R., Steininger, T., Alam, N., & McGinty, D. (2001). Preoptic area sleep-regulating mechanisms. Archives Italiennes de Biologie, 139, 77–92. Tabba, M. K., & Johnson, J. C. (2006). Obstructive sleep apnea: A practical review. Missouri Medicine, 103, 509–513. Tasali, E., Leproult, R., Ehrmann, D. A., & Van Cauter, E. (2008). Slowwave sleep and the risk of type 2 diabetes in humans. Proceedings of the National Academy of Sciences, USA, 105, 1044–1049. Thomas, D. A., Poole, K., McArdle, E. K., Goodenough, P. C., Thompson, J., Beardsmore, C. S., et al. (1996). The effect of sleep deprivation on sleep
c24.indd 480
Visser, G. H. A., Poelmannweesjes, G., Cohen, T. M. N., & Bekedam, D. J. (1987). Fetal behavior at 30 to 32 weeks of gestation. Pediatric Research, 22, 655–658. von Economo, C. (1930). Sleep as a problem of localization. Journal of Nervous and Mental Diseases, 71. Vyazovskiy, V. V., Cirelli, C., Pfister-Genskow, M., Faraguna, U., & Tononi, G. (2008). Molecular and electrophysiological evidence for net synaptic potentiation in wake and depression in sleep. Journal of Neuroscience, 11, 200–208. Vyazovskiy, V. V., Riedner, B. A., Cirelli, C., & Tononi, G. (2007). Sleep homeostasis and cortical synchronization: Pt. II. A local field potential study of sleep slow waves in the rat. Sleep, 30, 1631–1642. Walker, M. P., & Stickgold, R. (2004). Sleep-dependent learning and memory consolidation. Neuron, 44, 121–133. Webb, W. B., & Agnew, H. W., Jr. (1971, December 24). Stage 4 sleep: Influence of time course variables. Science, 174, 1354–1356. Webster, M. J., Weickert, C. S., Herman, M. M., & Kleinman, J. E. (2002). BDNF mRNA expression during postnatal development, maturation and
8/18/09 5:51:47 PM
References aging of the human prefrontal cortex. Brain Research: Developmental Brain Research, 139, 139–150.
Wyatt, J. K. (2004). Delayed sleep phase syndrome: Pathophysiology and treatment options. Sleep, 27, 1195–1203.
Weitzman, E. D., Czeisler, C. A., Coleman, R. M., Spielman, A. J., Zimmerman, J. C., Dement, W., et al. (1981). Delayed sleep phase syndrome: A chronobiological disorder with sleep-onset insomnia. Archives of General Psychiatry, 38, 737–746.
Youngstedt, S. D., & Kripke, D. F. (2004). Long sleep and mortality: Rationale for sleep restriction. Sleep Medicine Reviews, 8, 159–174.
Weyerer, S., & Dilling, H. (1991). Prevalence and treatment of insomnia in the community: Results from the upper Bavarian field study. Sleep, 14, 392–398. Williams, H. L., Hammack, J. T., Daly, R. L., Dement, W. C., & Lubin, A. (1964). Responses to auditory stimulation, sleep loss and the eeg stages of sleep. Electroencephalography and Clinical Neurophysiology, 16, 269–279. Wilson, M. A., & McNaughton, B. L. (1994, July 29). Reactivation of hippocampal ensemble memories during sleep. Science, 265, 676–679. Wolfson, A. R., & Carskadon, M. A. (1998). Sleep schedules and daytime functioning in adolescents. Child Development, 69, 875–887.
c24.indd 481
481
Zee, P. C., & Manthena, P. (2007). The brain’s master circadian clock: Implications and opportunities for therapy of sleep disorders. Sleep Medicine Reviews, 11, 59–70. Zeitzer, J. M., Morales-Villagran, A., Maidment, N. T., Behnke, E. J., Ackerson, L. C., Lopez-Rodriguez, F., et al. (2006). Extracellular adenosine in the human brain during sleep and sleep deprivation: An in vivo microdialysis study. Sleep, 29, 455–461. Zemdegs, I. Z., McMillen, I. C., Walker, D. W., Thorburn, G. D., & Nowak, R. (1988). Diurnal rhythms in plasma melatonin concentrations in the fetal sheep and pregnant ewe during late gestation. Endocrinology, 123, 284–289. Zuo, Y., Lin, A., Chang, P., & Gan, W. B. (2005). Development of longterm dendritic spine stability in diverse regions of cerebral cortex. Neuron, 46, 181–189.
8/18/09 5:51:47 PM
Chapter 25
Consciousness
CHADD M. FUNK, MARY COLVIN PUTNAM, AND MICHAEL S. GAZZANIGA
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
While the topic of consciousness has puzzled thinkers for millennia, the scientific endeavor to understand consciousness in terms of brain function is relatively new. Empirical investigation of the mechanisms underlying or related to conscious experience is flourishing, and the results are providing astounding insight into the very essence of human behavior. But even as neuroscientists and modern philosophers move toward an understanding of the complex neural interactions that are both necessary and sufficient for conscious experience, we remain keenly aware that solving what Chalmers (1995) referred to as the “hard problem” of consciousness, that is, achieving an understanding of how an individual’s specific patterns of neuronal firing create rich, textured, and unique conscious experience, is but a distant goal. Presently, the foci of scientific investigation are the more tractable, but nonetheless daunting, “soft problems,” which address how brain activity related to various cognitive functions, such as attention, perception, and action, gives rise to conscious experience.
In this chapter, we first evaluate the uses of the term consciousness and address the neural foundations of conscious states. We then consider theoretical accounts of the mechanisms by which distributed modules are integrated to generate unified conscious experience and assess the present understanding of the neural underpinnings of separate functions central to conscious experience, each organized in discrete modules that process highly specific information and produce distinct conscious content. Next, we address a peculiar observation from parallel investigations of split-brain patients, namely, that the isolated left hemisphere reports no disruption to its conscious experience following callosal transection. Theories that posit long-distance, reentrant connectivity as the basis of all conscious experience appear unable to explain the striking fact that severing interhemispheric fibers produces little subjectively noticeable effect on the scope or unity of conscious experience. Accordingly, we promote a different perspective on the neural basis of conscious experience, maintaining that phenomenal conscious experience is associated with activity in specific modules rather than activity in a distributed system connecting these modules and suggesting that an interpretive process in the left hemisphere ensures that conscious experience is unified. We conclude by attempting to reconcile these opposing perspectives.
DEFINING CONSCIOUSNESS
The term consciousness is typically used in one of two ways. One meaning refers to the general state of being conscious, as opposed to being unconscious. The second refers to being conscious of specific content. Though intuitive and useful, this distinction can be misleading in two ways when used as a framework to guide investigation of the neuroanatomical basis of conscious experience. First, it may veil an important characteristic of general conscious states, specifically, that the state of being conscious inherently provides a certain amount of information to an organism. Second, the content of consciousness may vary along a continuum of complexity. This may not be immediately apparent because it is inescapable human nature to associate conscious content with the fullest realization of the content spectrum, the healthy human end, in which vivid subjective representations stir rich and varied cognitive associations. Nonetheless, the content spectrum also dwindles down to basic conscious representations, stripped of cognitive associations, if for no other reason than their being processed in more primitive neural apparatus. One may argue that refining the definition of consciousness in order to guard against neglect of these characteristics is a mere technicality, unlikely to affect theoretical or empirical inquiries into the neural basis of conscious experience. However, overlooking these simple facts may have real consequences because it may lead one to oversimplify the role of neural structures that provide the foundation for conscious experience. In turn, potential inconsistencies within theoretical explanations are masked. We return to this possibility in subsequent sections.
We favor Damasio’s (1999) proposed distinction between core consciousness and extended consciousness, which better accounts for the properties of general conscious states and the gradient of possible content. Core consciousness provides an organism with a sense of “here and now.” An organism with core consciousness is awake and aware at a very basic level. Furthermore, it is aware of specific content at the simple end of the content continuum, that is, of representations of objects as they exist in a given moment. Notably absent from this sort of content are explicit associations with past encounters or plans for future ways of interacting with the represented object. For instance, core consciousness may include a crude perception of a piece of chocolate, but it would not evoke memories of a favorite chocolate shop in Santa Barbara, nor would it conjure plans to pick up that chocolate and give it to a lover to regain their favor. Edelman (1989) calls this cognitively isolated moment the “remembered present” (incidentally, in many ways, his conceptualization of “primary” and “higher-order consciousness” is in the same spirit as Damasio’s proposal). Importantly, core consciousness is necessary for extended consciousness, the lavish form of consciousness that allows us to perceive, to form associations with an object, to explicitly recall past events, to think about and plan for the future, and to perform countless other mental operations. A critical attribute of extended consciousness is self-consciousness, the understanding that “we,” agents who perceive and act, are indeed conscious. As these examples suggest, extended consciousness encompasses the rich and complex end of the content continuum; it includes the subjective flavor associated with our many special capabilities.
Whereas core consciousness is defined as a set of neural interactions that provides the foundation for all conscious experience and thus only supports basic awareness of specific content, extended consciousness purely enriches and expands the scope of conscious experience.
FOUNDATIONS OF CONSCIOUSNESS
We now address the critical neural nodes that comprise the neural basis of consciousness, beginning with core consciousness. Consistent with the previous theoretical depiction, core consciousness depends upon brain areas that are necessary for any form of conscious experience (Damasio, 1999). Studies of patients who have sustained neurological damage and are subsequently left unconscious, or completely lacking core consciousness, have provided the most direct means of identifying the brain regions necessary for core consciousness. This research suggests that even core consciousness exists along a continuum. The loss of core consciousness can be absolute, resulting in coma, or it can be slightly less severe, ranging from vegetative states to minimally conscious states. The lines between these states are often vague at best, and it is as yet unclear whether these states vary with respect to providing a platform to support extended consciousness. Nonetheless, the patterns of neurological damage distinguishing these states have revealed much about the brain regions involved in enabling core consciousness, and thus these states are briefly described here.
The most profound of the unconscious states is coma. Comatose patients cannot be aroused and are not aware of either self or environment. They do not make purposeful movements in response to stimulation and may retain only very basic motor reflexes (Giacino et al., 2002; Laureys, Perrin, & Brédart, 2007). The comatose state arises from a significant disruption of the ascending arousal system located throughout the pons and midbrain, or of this system’s two primary targets, the thalamus and hypothalamus. It may also arise from widespread dysfunction within both cerebral hemispheres (e.g., following a metabolic or toxic condition impacting the entire brain; Saper, 2000). Thus, in the comatose state, cortical neurons are unable to receive ascending signals, and incoming sensory stimuli are not represented at the levels of core or extended consciousness.
The vegetative state (VS) is differentiated from coma by a greater level of arousal, specifically the presence of sleep and wake cycles. Vegetative patients exhibit periods of wakefulness, accompanied by eye opening and spontaneous eye movements. As in comatose patients, basic reflexes may be intact. However, vegetative patients do not exhibit purposeful responses to sensory stimuli that would indicate an awareness of either self or environment (Laureys et al., 2007; Schiff & Plum, 2000). Thus, this condition demonstrates a dissociation between arousal and awareness. While arousal is necessary to support awareness, it is not sufficient for a conscious state to emerge.
Finally, there is the minimally conscious state (MCS). Patients in this state demonstrate some level of awareness of either self or environment but are unable to communicate consistently. According to the most recently defined criteria for MCS, the patient must demonstrate at least one of the following behaviors, either more than once or on a sustained basis: (a) following simple commands, (b) gestural or verbal yes/no responses (regardless of accuracy), (c) intelligible verbalization, or (d) purposeful behavior not due to reflexive activity (Laureys et al., 2007). In other words, the MCS patient is able to either inconsistently communicate with others or inconsistently respond appropriately to environmental stimuli.
The border between the VS and the MCS is anything but precise. Such ambiguity may easily lead to misdiagnosis, which in these cases often carries significant legal and ethical weight. Traditionally, such classifications relied purely on behavioral responses and, occasionally, on the patterns of neuronal firing noted on electroencephalogram (EEG). More recently, neuroimaging technology has provided a unique opportunity to look at the function of brains in the various unconscious or minimally conscious states. The results have been both illuminating and controversial. Multiple studies have shown that the amount of neural processing that can occur in the VS is far greater than expected (Coleman et al., 2007; Owen et al., 2006; see Owen & Coleman, 2007, for a review). It is important to emphasize that evidence of neural activity does not indicate that all patients in a VS are conscious; instead, it exposes limitations in current diagnostic protocols. As a clearer understanding of the neural areas and interactions necessary and sufficient for conscious experience emerges, it could immediately impact and dramatically improve diagnosis of the various global disorders of consciousness.
As previously discussed, coma can result from disruption along the ascending arousal system and/or widespread cortical dysfunction. Coma and VS may also result from small lesions to particular areas of the brain, including the brain stem reticular formation and the intralaminar nuclei of the thalamus (Baars, 1995; Bogen, 1995), while MCS may arise from lesions to these areas or the anterior cingulate cortex (Damasio, 1999; Laureys et al., 2007).
The brain stem, which includes the medulla, pons, and midbrain, contains various nuclei responsible for basic autonomic functions that are necessary for an organism’s survival and is involved in relaying an enormous amount of information about the state of the organism to higher brain areas (Parvizi & Damasio, 2001). These efferent pathways include serotonergic, noradrenergic, dopaminergic, cholinergic, and glutamatergic projections. As observed in the case of coma, total disruption of these projections to the thalamus or hypothalamus results in loss of arousal and, therefore, of core consciousness. Similar states, ranging from coma to MCS, can arise from bilateral damage to the intralaminar nuclei (ILN) of the thalamus (Bogen, 1995), which receive dense glutamatergic projections from the ascending reticular formation. Schiff and colleagues (2007) reported that stimulation of the ILN in a minimally conscious patient dramatically improved his level of awareness and enabled him to communicate with family members after 6 years of silence. Though one must exercise caution when drawing conclusions from a single patient, this report nonetheless
provides highly suggestive evidence of the important role that the ILN play in supporting core consciousness. The cellular composition of the ILN is consistent with this conclusion. In fact, the ILN are distinguishable from other thalamic nuclei based on their cellular composition. The ILN are composed mainly of matrix cells, which project diffusely throughout the cortex, rather than to specific cortical targets (Jones, 1998a, 1998b). These diffuse projections are believed to orchestrate long-distance interactions between various cortical areas, which are necessary for extended consciousness (Jones, 2002a, 2002b). In contrast, other thalamic nuclei are composed mainly of core cells, which project in a highly specific pattern to distinct cortical areas (Jones, 1998a, 1998b). These cells presumably relay specific content and may provide the nonconscious foundation for modality-specific perception (Jones, 2002a, 2002b). Though the ILN project in a fairly diffuse manner, they send their densest projections to the ACC (Van der Werf, Witter, & Groenewegen, 2002). Dysfunction of this connection, as evidenced by altered functional connectivity between the ILN and ACC, has been noted in a patient in a VS (Laureys et al., 2000). Furthermore, though selective damage that encompasses the entirety of the ACC is rare, when it occurs, it results in a condition usually referred to as akinetic mutism (Damasio, 1999; Devinsky, Morrell, & Vogt, 1995). Akinetic mutism is a type of minimally conscious state, characterized by an inability to generate or internally guide action as well as abolishment or severe impairment of thought, speech, and emotion (Laureys et al., 2007). Though awake and occasionally able to track a visual stimulus with their eyes, these patients demonstrate few signs of core consciousness (Damasio, 1999). Taken together, these studies indicate that the ACC is a critical node in the neural circuitry of core consciousness. 
However, relative to the brain stem and the ILN, the ACC interacts to a far greater extent with sensory, motor, cognitive, and emotional cortical areas. Thus, the ACC is uniquely poised at the interface between areas that support core consciousness and those that contribute to the complex content of extended consciousness. In the next section, we describe the neural processes involved in supporting extended consciousness in greater depth.
EXTENDED CONSCIOUSNESS: INTEGRATION ACROSS CONTENT MODULES
Though necessary for core consciousness, neural activity in the brain stem, ILN, and possibly also some types of activity in the ACC, is not modulated in a highly specific
fashion by the content of conscious experience (Koch, 2004). Thus, core consciousness encompasses the basic end of the content continuum; it is unlikely (but perhaps impossible to empirically demonstrate, see Nagel, 1974) that the conscious representations emerging solely from these basic processes are comparable to the conscious experience with which we are familiar. Rather, areas throughout the rest of the thalamus and cortex are responsible for the integration of an enormous amount of information, related to perception, cognition, and action, that eventually leads to the content of extended consciousness. According to most current theories of consciousness (Baars, 1988; Crick & Koch, 2003; Dehaene, Kerszberg, & Changeux, 1998; Dehaene & Naccache, 2001; Tononi & Edelman, 1998), in any given moment, there is a privileged, dynamic subset of this cortical activity, likely summated over innumerable simultaneous neural interactions yet excluding countless others, that represents the current content of the conscious experience of an individual. Groups of neurons gain, briefly maintain, and finally surrender this transient “cerebral celebrity” (Dennett, 1993) to the next subset of neurons whose content then becomes conscious. These theories can be characterized as an attempt to explain how a group of neurons distinguishes itself from other groups of neurons in order to consciously “broadcast” its content (Baars, 1988), generally through some form of competition. If correct, these theories could potentially identify the leading coalition of neurons in a given instant and thereby determine the particular content of one’s conscious experience. However, an actual understanding of how these neurons (and the mechanisms they employ to out-compete their rivals) generate specific qualia, or the phenomenal constituents of conscious experience, is currently beyond comprehension. 
This is the unrelenting “hard problem.” We contend that many of these theories overemphasize the importance of long-distance cortical connectivity in sustaining phenomenal conscious experience. It may be a necessary condition, in concert with activity in certain modules, for some kinds of phenomenal conscious content but it is not necessary for all conscious content. This is subtly predicted by the possibility that core consciousness, which does not depend on the distributed cortical activity postulated by the global workspace theory to underlie conscious experience (Baars, 1988; Dehaene & Naccache, 2001), may generate some basic form of phenomenal conscious experience. But before more thoroughly addressing these conflicting accounts, we review a breadth of studies in order to identify cortical areas that contribute specific content to extended consciousness, to understand their patterns of neuronal activity, and to characterize how these areas interact.
The core assumption involved is that the brain is organized in a highly modular fashion, with unique modules processing highly specific information (Fodor, 1983; Gazzaniga, 1989). When neuronal activity within an individual module is disrupted or eliminated, the ability to process that category of information is lost and the content of consciousness is altered, such that disorders of extended consciousness emerge (Cooney & Gazzaniga, 2003). For example, among the cortical areas involved in processing visual stimuli, there are functionally specific areas that process specific classes of stimuli. One such area is the fusiform face area (Kanwisher, 2001; Kanwisher, McDermott, & Chun, 1997); selective damage to this region can lead to impaired recognition of known faces, while other aspects of visual perception are preserved. Thus, these cortical modules are not necessary for core or extended consciousness. They contribute only one element to extended consciousness, an indication that the content of extended consciousness must be the product of integration across multiple brain areas that operate in parallel. Consequently, understanding extended consciousness broadly requires investigation of both the individual modules and the mechanisms by which they interact. The diversity of potential conscious content indicates that the tedious nonconscious/conscious divide must be inspected for a great variety of neuronal processing. For the purpose of this chapter, we choose to define modules as cortical areas with distinct functions and focus on those modules that are best understood and arguably most relevant to conscious experience. Specifically, we discuss the cortical areas involved in the diverse processes of attention, visual perception, emotion, memory, and motor function. 
Because it is not enough to examine each of these modules in isolation, at the end of the chapter we discuss how the products of these various content-laden processes are assimilated into our undeniably united conscious experience. We initially consider the concept of a “global workspace,” proposed by Dehaene and Naccache (2001), after Baars (1988). Briefly, the workspace consists of the top levels of hierarchical processors that are connected by long-distance, reciprocal projections. The various modular processors compete to “broadcast” their content into the workspace, and the content of the workspace in a given instant is tantamount to the content of consciousness. Importantly, the reciprocal connections serve to sustain the firing in the victorious processor and are thus considered to be necessary for conscious experience. While active in the workspace, content of one processor is available to other processors, enabling the great many possible operations that can be consciously performed. Though we believe that the workspace ultimately fails to explain certain observations gleaned from split-brain patients,
we nonetheless refer to it throughout the following sections and emphasize the apparent importance of connectivity between various modules and areas thought to be integral components of the workspace. We employ this unconventional approach for two reasons: First, the workspace theory is representative of the prevalent notion that extensive reciprocal connectivity between cortical modules is a necessary substrate of conscious experience and thus merits equitable review. Second, it may be possible to delineate specific kinds of conscious content that require a neural architecture similar to the workspace, which would ultimately enable beneficial integration of the two conflicting theories.
MODULES OF EXTENDED CONSCIOUSNESS
Attention and Consciousness
Attention, as we discuss it here, refers to the ability to orient to an external stimulus and to briefly sustain the representation of that stimulus in mind. This process involves two different mechanisms: stimulus-driven, bottom-up attention and selective, top-down attention (see Chapters 17 and 18). In both cases, neuronal activity is amplified within cortical modules processing especially pertinent content, thereby facilitating conscious awareness of that content (Dehaene & Naccache, 2001; Posner, 1994; Posner & Dehaene, 1994). It is easy to see how the terms attention and consciousness are often confused, as attention can be viewed as responsible for providing access to consciousness (Baars, 1997; Dehaene & Changeux, 2004). Yet consciousness is not strictly reliant on attention, nor does attention assure access into consciousness (Koch & Tsuchiya, 2007). In what follows, we discuss recent research into brain activity at the boundary between attention and consciousness, which has provided support for this double dissociation.
Manipulating Attention Prevents Stimuli from Reaching Consciousness
In many situations, attention is necessary for conscious experience of specific content, and manipulating attention can alter perception. The relationship between attention and the content of consciousness has been most rigorously explored in the sensory modality of vision. We address higher-order visual functions more thoroughly in the next section, focusing now on insights gained from the phenomena of the attentional blink and change blindness. In each of these situations, varying one’s level of attention leads to important omissions in the content of consciousness. The attentional blink paradigm probes the temporal limits of visual processing by presenting two visual targets
in rapid succession. Repeated studies have demonstrated that if the second target follows the first by less than 500 ms, subjects do not consciously report the presence of the second target (Broadbent & Broadbent, 1987; Raymond, Shapiro, & Arnell, 1992). Reporting the first target engages a limited-capacity processing stage, or bottleneck, impeding similar processing of the second target (Marois & Ivanoff, 2005). However, while the content of the second stimulus escapes conscious awareness, it is still processed on a nonconscious level, and may therefore impact subsequent behavior. For instance, this phenomenon was first demonstrated by a patient who could correctly identify two stimuli as identical or different despite only being consciously aware of one stimulus (Volpe, LeDoux, & Gazzaniga, 1979). Thus, by manipulating the extent of processing for two sequentially presented stimuli, the attentional blink paradigm provides a tactic for understanding the fine line separating nonconscious and conscious processing. Using the attentional blink paradigm, Marois, Yi, and Chun (2004) performed a clever experiment to compare patterns of cortical activity involved in processing the perceived stimulus and the masked stimulus. They took advantage of the specificity of the inferotemporal (IT) area, by using faces (processed in the fusiform face area, or FFA) and places (processed in the parahippocampal place area, or PPA). When the first target was a face and the second target was a picture of a scene, they found that both IT regions (i.e., the FFA and PPA) were active, but that specific frontal lobe regions were only active in concert with the activity in the PPA (related to processing of the scene) when subjects consciously reported detecting the scene. Importantly, while PPA activity was observed regardless of whether the scene was consciously detected, PPA activity was modulated by conscious awareness, such that activity was further increased by detection. 
Thus, it appears that a certain level of cortical activity involved in processing a stimulus, in concert with frontal activity, may be important for conscious experience. Electroencephalography (EEG) studies investigating the temporal dynamics of cortical activity in the attentional blink paradigm have suggested that the modulation of activity in lower-level visual areas associated with conscious awareness occurs after the initial perception of the stimulus. When Sergent, Baillet, and Dehaene (2005) compared EEG response patterns to seen and unseen targets in an attentional blink paradigm, they found that early processing (up to 170 ms) of the second target did not differ between seen and unseen trials. However, processing of seen versus unseen targets had different EEG profiles from 170 ms on. For seen targets only, late activations lasting 200 to 300 ms were observed in the dorsolateral prefrontal cortex (DLPFC), anterior cingulate cortex (ACC), and inferior
parietal lobe. Seen targets also evoked a late P3b waveform, indicative of top-down processing, likely initiated and sustained by frontal activity. Furthermore, as the peak of the P3b waveform resolved, the second target became more likely to be reported, suggesting that the top-down activity underlying the P3b waveform may compete with bottom-up activity of the second stimulus, preventing the coherence necessary for further processing. Altogether, this research suggests that early visual processing is not sufficient for visual consciousness and that coherence with frontal and parietal areas, which likely provide top-down feedback modulating activity in visual areas, may be necessary for consciousness (Sergent et al., 2005). Using the related paradigm of change blindness, Beck, Rees, Frith, and Lavie (2001) replicated the findings described previously. Participants were presented with two identical or slightly different pictures in rapid sequence and were asked to judge whether the pictures were the same or different while performing a distracter task. Cortical activity in the fusiform area varied depending on whether a change occurred, regardless of whether the subject actually detected the change. When subjects did detect a change, cortical activity in the DLPFC, parietal lobe, and fusiform gyrus was greater relative to when they did not detect the change. Taken together, these results suggest that activity in cortical areas involved in lower-level visual processing (i.e., the fusiform area) is independent of conscious awareness, and that cortical networks involving frontal and parietal areas must interact with visual areas to facilitate the conscious representation of a stimulus (Rees, Kreiman, & Koch, 2002). The different levels of cortical activity in the inferotemporal area and the subsequent interactions with frontal/parietal areas reflect the roles of bottom-up and top-down attentional mechanisms in promoting conscious representation of a stimulus. 
In both the attentional blink and change blindness paradigms, bottom-up mechanisms are involved in processing the masked stimuli and the changed picture, respectively, but whether this information enters conscious awareness is determined by top-down mechanisms. There is a complex relationship between these two mechanisms such that bottom-up processing engages top-down processing and top-down processing modulates bottom-up processing. A key question, then, relates to the neural processes at the intersection of bottom-up and top-down attentional systems. To explore this question, Crottaz-Herbette and Menon (2006) used an oddball paradigm, in which one stimulus occurs 80% of the time and the oddball stimulus occurs 20% of the time. Subjects come to expect the usual stimulus, so the oddball stimulus automatically captures the attention of the participant. Cortical activity in modality-specific
regions (primary auditory or primary visual cortex) and ACC increased after presentation of an oddball stimulus. The investigators then assessed the effective connectivity between the ACC and the modality-specific regions. They found that visual oddballs induced increased effective connectivity between the ACC and striate cortex, while auditory oddballs induced increased effective connectivity between the ACC and Heschl’s gyrus. These results indicated that the ACC plays an important role in directing top-down attentional mechanisms in order to enhance stimulus processing. Using EEG, Crottaz-Herbette and Menon (2006) further explored the temporal dynamics underlying top-down enhancement of stimulus processing. Dipole modeling revealed that the ACC was the source of the N2b wave, which occurs between 200 and 300 ms poststimulus and is thought to be associated with controlled orientation to salient environmental stimuli. About 50 ms before the N2b wave appears, a negative deflection was recorded above the primary sensory cortex after presentation of oddball stimuli, in keeping with bottom-up enhancement of cortical activity. This bottom-up activity appears to activate the ACC, which results in the N2b wave. A second waveform, the P3a, then arises from various frontal areas and may be indicative of the reallocation of attention and the various cognitive resources that accompany attention. Thus, these results illustrate an interaction between bottom-up and top-down attention mechanisms and further indicate that attention often facilitates access to consciousness (Dehaene & Changeux, 2004).
Separation of Consciousness and Attention
Having cited evidence demonstrating how attentional mechanisms contribute to conscious awareness, we now consider evidence supporting the other half of the double dissociation between attention and consciousness, specifically evidence for consciousness without attention. 
Such evidence is commonly generated by studies that have employed divided attention paradigms, during which participants perform simultaneous tasks. In these studies, the primary task is designed to consume all available attentional resources, while a second task is used to assess perception of a peripheral stimulus. In general, subjects cannot successfully perform the primary task without an associated performance cost on the second task. Similarly, when attention to the primary task wavers, or subjects attempt to perform both tasks simultaneously, a performance cost is seen for the primary task. Using a dual-task paradigm, Li, VanRullen, Koch, and Perona (2002) demonstrated that particular kinds of stimuli may reach conscious awareness even when attentional mechanisms are not directed toward their processing.
Subjects performed a demanding primary task, requiring that they indicate whether five letters, presented in various directions and in nine possible locations around a fixation point, were identical. When their performance on this task under dual-task conditions was equivalent to their performance on the same task when presented alone, subjects were unable to simultaneously perform a secondary task requiring the discrimination of letters or colored shapes. Yet when the secondary task involved determining whether an animal or vehicle was present in more visually complex pictures of a natural scene, subjects were able to accurately perform the task. These surprising results suggest that particular kinds of stimuli may be perceived without attention. In other words, for particular kinds of stimuli, their “gist” may be extracted and consciously represented without direct attentional resources (Koch, 2004). As Koch (2004) has pointed out, gist is a powerful concept that could explain how stimuli beyond the focus of attention are consciously perceived. This idea is similar to Hochstein and Ahissar’s (2002) reverse hierarchy theory, which maintains that conscious visual perception does not necessarily follow the traditional, bottom-up visual processing hierarchy. “Vision at a glance” quickly provides us with a semantic representation of objects void of details (gist), while “vision under scrutiny” provides detailed perception by employing attentional resources and increasing activity in the various early visual areas. Given the research discussed earlier, one might expect that top-down projections from frontal and parietal areas to high-level visual modules facilitate “vision at a glance” (Koch, 2004), while bottom-up cortical activity in primary sensory areas may be amplified by top-down feedback and thereby support the more detailed “vision under scrutiny” (Hochstein & Ahissar, 2002). 
This theory of perception emphasizes the important, but easy to overlook, fact that conscious perception is not a direct representation of the physical world. Instead, it is a reconstruction that depends on the integrity of the various processing modules, including bottom-up modules processing the broad spectrum of perceptual content, and top-down modules supporting extraction of the gist of external stimuli. This general description of neural processing is aligned with the spectrum of conscious content we described in a previous section. On one end, stimuli receiving enhanced processing through allocation of attentional modules (i.e., “vision under scrutiny”) are consciously constructed with great accuracy and precision. On the opposite end, entire aspects of consciousness can be missing due to damage to specific modules or attention modules that serve to enhance activity in other modules. The hypothesis of specialized modules aided by attentional amplification neatly accounts for absent content
and accurate content, the two extremes of the content continuum. But between these two extremes exists activity in intact perceptual modules that, at a given instant, is not enhanced by attentional modules. Even if the concept of gist is able to explain how we perceive the content in unattended regions of space or entire unattended modules, the mechanism underlying the seamless, dynamic integration of unattended content with attended content remains a mystery. Presumably, this integration is responsible for creating the wonderful illusion that we perceive the vivid panorama before us as an equal whole rather than a highly specific area surrounded by only crude conscious representations. Thus, investigation of the unique role that attention often, but not always, plays in facilitating perception provides important clues about the nature of the neural areas underlying consciousness. For instance, we have repeatedly emphasized the important role that assembling information from various modules appears to play in generating conscious experience. Empirical investigation of attention suggests that the parietal lobe is a critical component of modules responsible for attentional amplification of information in other modules. Furthermore, there is evidence that suggests frontal areas, such as the DLPFC and ACC, also may need to properly orchestrate their activity with parietal and more posterior temporal and occipital areas in order for conscious experience to emerge. These observations lend credibility to the global workspace theory (Dehaene & Naccache, 2001). Yet, frontal and parietal areas that function to amplify activity elsewhere in the brain require a substrate on which to act. Accordingly, we proceed by considering various types of highly specialized cortical processors for functions such as vision, emotion, memory, and action.
Visual Awareness
Arguably the best characterized class of modules is the one responsible for processing visual stimuli. 
Not surprisingly, then, the search for the neural correlates of consciousness has been most intense in the field of vision (Koch, 2004). Yet, despite all we know about vision, relating neural activity to conscious visual content poses a major problem to investigators because the latter is not directly observable. Though this impasse may be a central reason why neuroscientists long avoided the problem of consciousness (Searle, 2000), it is not an insurmountable obstacle. Self-report provides a window into the content of consciousness, and participants (human and nonhuman alike) can also be trained to perform behavioral responses that inform the investigators of the current conscious content. In this way, investigators can associate objective measures of
neural activity with the reported subjective experience, though there is no absolute way of verifying the veracity of the report. Much insight has been drawn from the subjective reports and neuropsychological testing of patients with damage to visual areas. Deficits arising after such damage provide excellent evidence of the modular organization of the brain and also reveal critical neural regions for visual perception.
Blindsight
In the cortex, visual processing begins in the primary visual cortex (V1). Damage to V1 may result in blindsight, a condition characterized by a profound loss of subjective visual experience, with preserved abilities to act on nonconscious visual information (Weiskrantz, Warrington, Sanders, & Marshall, 1974). Humphrey and Weiskrantz (1967) first described this phenomenon in a rhesus monkey with damage to V1 that nonetheless retained certain unexpected visual abilities, such as the ability to accurately reach for a piece of food that should be in the blind field. Weiskrantz and colleagues (1974) then studied patient D. B., who had part of his occipital cortex surgically removed to treat an arteriovenous malformation (AVM). As expected, he lost the ability to perceive stimuli presented in one half of his visual field (hemianopsia). However, on investigation, he demonstrated an intact ability to reach for an object presented in his blind field. He also demonstrated some intact visual discrimination functions, including the ability to distinguish between vertical and horizontal lines, as well as Xs and Os. The neural basis of these preserved visual abilities is debated. Some hypothesize that they arise from projections that bypass primary visual cortex, specifically projections from the superior colliculi and the pulvinar nuclei of the thalamus to extrastriate areas (Weiskrantz, 1996). However, other research indicates that these abilities depend on remnants of the geniculostriate projections (Fendrich, Wessinger, & Gazzaniga, 1992, 2001). 
By this view, the circuitry necessary for visual conscious experience could be disrupted despite somewhat intact output from V1 that facilitates certain other functions. It is possible to induce the experience of blindsight in healthy individuals using metacontrast masking. Lau and Passingham (2006) presented subjects with a diamond or a square, followed by a metacontrast mask either 33 or 100 ms after the stimulus. Subjects were forced to choose whether they had seen the diamond or square, then immediately were asked if they had actually seen it or if they had guessed. The performance at the two different time points was not significantly different; however, subjects reported seeing the stimulus rather than guessing significantly more often when the mask followed the stimulus by 100 ms.
The increased level of subjective awareness of the stimulus with the later mask onset was associated with increased activity in the left mid-dorsolateral prefrontal cortex (BA 46). Whereas self-reports or behavioral responses usually used to assess the content of consciousness are associated with performance (i.e., the response used on a task is the response used to assess conscious contents), this study is unique in that the self-report is directly about the conscious experience rather than about the stimulus itself, eliminating potential performance-related confounds. These results indicate that performance, specifically in forced-judgment tasks, may not be the best indicator of subjective experience. Furthermore, they indicate that this DLPFC region may be associated with subjective visual experience.
Ventral Visual Stream Disorders
Higher-order visual processing has been broadly divided into two functionally and anatomically dissociable streams that emerge from V1. The ventral stream extends to the IT, while the dorsal stream leads to the posterior parietal cortex (PPC) (Ungerleider & Mishkin, 1982). Neurons in the ventral stream are tuned for specific stimulus features that allow for object identification, such as color, shape, and texture (Desimone & Gross, 1979); thus, the ventral visual stream is commonly referred to as the “what” pathway (Ungerleider & Mishkin, 1982). In contrast, neurons in the dorsal stream are sensitive to stimulus features that allow for object localization, such as speed and direction of stimulus motion (Andersen, Snyder, Bradley, & Xing, 1997; Newsome, Britten, Salzman, & Movshon, 1990); thus, the dorsal visual pathway is commonly referred to as the “where” pathway (Ungerleider & Mishkin, 1982). Damage to either stream eliminates certain visual functions while leaving others unscathed. 
Damage to the ventral stream may result in agnosia, which is broadly defined as “impairment of object recognition not caused by sensory deficits or generalized intellectual loss” (Farah, 1992, p. 162). Traditionally, a distinction has been made between associative agnosias, stemming from loss of access to semantic knowledge associated with an object, and apperceptive agnosias, which are due to higher-level perceptual disturbances (Goldberg, 1990). Visual agnosias can be highly specific and dissociable, reflecting the specific tunings of neurons in different occipital and IT areas. For instance, patients with damage to occipital or inferior temporal areas may suffer from prosopagnosia, or an inability to recognize familiar or unfamiliar faces (Damasio, 1985; Warrington & James, 1967). Patients may also exhibit color agnosia, which may include the inability to appreciate differences between colors and/or the inability to associate colors with objects in the presence of intact color vision (Gloning, Gloning, & Hoff, 1968; Lennie, 2001).
Dorsal Visual Stream Disorders
In contrast, damage to the dorsal stream typically results in deficits involving the integration of attention, vision, motor information, and spatial information. The two hemispheres play different roles in these capacities; damage to the left dorsal stream areas typically results in greater disruption of visuomotor functions, while damage to the right dorsal stream areas typically results in greater disruption of visuospatial functions (Colvin, Handy, & Gazzaniga, 2003). This dissociation reflects differences between the two hemispheres in terms of their respective contributions to conscious experience, which will be discussed further later in the chapter. Damage to the left inferior parietal region may result in apraxia, defined as a deficit in performing learned movements, often impacting both hands (Goodale, 1990; Kimura, 1982). There are two common forms of apraxia. Ideomotor apraxia, characterized by an inability to execute movements in response to a verbal command (Weintraub, 2000), has been associated with damage to the supramarginal gyrus (Leiguarda & Marsden, 2000). Conceptual apraxia, characterized by an inability to sequence a series of actions necessary to execute a complex, goal-directed action involving the use of tools, has been associated with damage to the left parieto-occipital or temporoparietal regions. Importantly, while apraxic patients are unable to execute these complex movements in response to a verbal command, they are able to spontaneously perform them (Weintraub, 2000). This suggests that the left parietal region is critical for the conscious awareness and execution of motor programs, but is not responsible for representing the motor program itself. Optic ataxia, defined as an inability to make accurate movements to objects with the contralesional hand, is a related condition, and may result from left or right superior parietal damage (Milner & Goodale, 1995).
Yet reflecting the differential roles of the two hemispheres in attention, optic ataxia patients with left parietal damage make errors with their right hand throughout the entire visual field. In contrast, optic ataxia patients with right parietal damage only make errors with their left hands in the left visual field (Perenin & Vighetto, 1988), suggesting that the left parietal lobe plays a unique role in attending to motor acts throughout the entire visual field. From the perspective of understanding extended consciousness, what is important about this condition is that the actual perception of objects is unaffected in optic ataxia; it is the patient’s ability to interact with those objects through conscious visuomotor acts that is solely impacted. The right superior parietal lobe plays a complementary role in directing attention to space throughout the entire
visual field. Following unilateral damage to the right inferior parietal lobe, patients may experience unilateral neglect, or selective inattention to stimuli presented in the contralateral (i.e., left) hemifield (Driver & Vuilleumier, 2001). Patients with unilateral neglect may often fail to eat from the left side of a plate, or fail to complete dressing on the left side of their body. Clinically, they also demonstrate the phenomenon of extinction. When presented with a stimulus in the field contralateral to the lesion, the patient readily perceives it. However, if stimuli are simultaneously presented in the contralateral and ipsilateral visual fields, the patient is likely to only report the ipsilateral stimulus. Using the extinction paradigm, Vuilleumier, Armony, Driver, and Dolan (2001) compared patterns of cortical activity associated with perceived and extinguished faces in a patient with unilateral neglect. As might be expected based on the literature reviewed in a previous section, perceived and extinguished faces induced activity in the primary visual cortex and IT but only perceived faces were associated with intact parietal lobe activity (Vuilleumier et al., 2001). Once again, this study suggests that modular processing within the basic visual areas may rely on input from frontoparietal networks in order to generate conscious awareness. Finally, simultaneous agnosia is characterized as an inability to perceive more than one object or point in space at a time, and is seen following bilateral damage to the parietal lobes (Luria, 1959). Clinically, this condition comes to attention when the patient is unable to perceive either of two simultaneously presented stimuli (simultanagnosia). These patients often have difficulty directing their gaze and shifting from a fixation point (Rizzo & Robin, 1990). 
For these patients, the world is perceived as a series of fragmented still frames; both time and space are impacted, such that the patient no longer perceives a continuous flow of visual information. However, as is the case in the other types of agnosia discussed previously, the basic elements of perception are undisturbed, such that the representations emerging from intact brain areas are integrated into extended consciousness. Thus, patients with specific patterns of neurological damage have revealed much about the modular organization of visual processing, and have shown that the elimination of these modules may result in the loss of particular types of content in extended consciousness but does not eliminate extended consciousness itself.
Bistable Perception
In this section, we return to studies of visual processing in healthy brains, focusing on how simultaneously presented stimuli are integrated into the content of consciousness. Bistable perception makes use of stimuli that do not physically change throughout experimental trials but nonetheless
result in alternating unique percepts. The underlying assumption is that there should not be any difference in activity in areas that unconsciously process the stable physical stimulus (or stimuli) but there must be changes at higher levels that induce or correlate with the alternating percepts. In other words, an area responding purely to the physical stimulus should not demonstrate any changes in activity during different perceptions, while areas necessary for creating the percept should vary in activity as perception changes over time. By far the most widely used inducer of bistable perception is binocular rivalry. During binocular rivalry, corresponding areas of each eye are shown separate stimuli of equal saliency. Rather than perceiving both stimuli simultaneously or perceiving a composite of the stimuli, subjects report only seeing one stimulus at a time. Over time, the percept alternates from one stimulus to the other. This phenomenon has been investigated using single cell recording in monkeys and using fMRI in humans. Nikos Logothetis and colleagues (Logothetis & Sheinberg, 1996; Logothetis, Leopold, & Sheinberg, 1996; Leopold & Logothetis, 1996; Logothetis, 1998; Sheinberg & Logothetis, 1997) have used single-unit recordings to search for neurons in various visual areas whose activity is modulated by perception rather than physical stimulus. They found only a small fraction in early visual areas such as V1 and V2 (18%), but an increasing proportion in V4 (38%), MT (43%), and the inferior temporal cortex (IT; 90%). The variance of these firing patterns is consistent with the increase in complexity of preferred stimuli ascending through these visual areas. These areas constitute the previously discussed ventral pathway for visual processing (Ungerleider & Mishkin, 1982), with the IT being the final purely visual area in this object-recognition pathway (Logothetis & Sheinberg, 1996).
The IT also projects to and receives input from the prefrontal cortex, indicating that it could be a transition zone, processing and classifying high-level visual stimuli and passing condensed but information-rich input to executive frontal areas. Koch (2004) calls this the “executive summary,” which he speculates may be one of the reasons that consciousness evolved. Human brains are bombarded with a vast amount of sensory input, and to survive, they must figure out which information is relevant and devote their cognitive resources to processing that particular information. An unanswered question central to this research and the theoretical speculation that it inspires is this: To what extent is the activity of the IT a neural correlate of consciousness? Is it responsible for the actual percept, or is it responsible for the experience of recognizing and semantically labeling that percept? Evidence of covert object recognition in visual aphasias suggests that recognition may
still occur after damage to the IT (Farah, 1992). Awareness may be lost due to an inability of that module to interact with frontal areas, or to “broadcast” its content in the global workspace (Dehaene & Naccache, 2001). As clinical evidence and imaging studies identify specific modules necessary for awareness of certain visual features, it is necessary to understand the interactions of these modules and other cortical areas. We have already encountered evidence that implicates frontal and parietal activity in awareness of the content in various processors, though the specific dynamics of these interactions demand future investigation. A commendable attempt to better characterize these dynamics is found in a study that assessed the visual abilities of patients with focal lesions to the dorsolateral prefrontal cortex (Barceló, Suwazono, & Knight, 2000). Patients and controls had to detect inverted triangles in sequences that contained the target, as well as upright triangles and novel stimuli (pictures of fish or flowers). One stimulus at a time was presented either to the contralesional or ipsilesional visual field. Dorsolateral prefrontal patients demonstrated a deficit in detecting the target when it appeared in the contralesional field. Furthermore, EEG data revealed reduced neural activity in the ipsilesional extrastriate cortex of dlPFC patients at early (125 ms) and late (200 to 650 ms) time points. Neural activity at these time points is thought to indicate a tonic maintenance of an “attention template” and phasic reentrant feedback from the frontal lobes, respectively. The observed deficits in both of these systems after dlPFC damage further implicate this frontal area in visual awareness. Koch (2004) has made a strong case that visual awareness is the most tractable aspect of conscious experience to empirically approach, as evidenced by the impressive progress in identifying potential neural correlates of visual consciousness described previously.
This is surely a valid point, but it does not follow (nor does Koch imply) that investigation of the neural correlates of other aspects of conscious experience is entirely futile. On the contrary, as the topic of consciousness has returned to scientific discussion, investigators have begun exploring with great vigor the curious split between conscious and nonconscious across a breadth of neural functions.
Emotion and Consciousness
Some of the earliest work on emotional processing in the brain identified critical roles for neural structures comprising the limbic system (MacLean, 1949; Papez, 1937). This interconnected network of subcortical nuclei and cortical areas includes the amygdala, thalamus, hypothalamus, cingulate cortex, insula, and orbitofrontal cortex. Particular
regions within this system are involved in nonconscious emotional processing. For example, it has long been recognized that the amygdala plays a critical role in fear conditioning, binding the physical sensations of fear mediated by the autonomic nervous system and relayed through the hypothalamus, to the perception of a particular environmental cue. As such, the amygdala and hypothalamus, together with neighboring medial temporal structures (such as the hippocampus), are involved in implicit learning of appropriate fight-or-flight responses to dangerous cues, and these structures have become a focus of research into understanding the neurobiological mechanisms underlying anxiety disorders involving a fear-conditioned response, such as posttraumatic stress disorder and social phobia (Etkin & Wager, 2007; Rauch, Shin, & Phelps, 2006). At this basic level, the emotional binding process is nonconscious. Multiple neuroimaging studies have demonstrated that the amygdala is active when fearful faces or fear-relevant stimuli are presented, regardless of whether those stimuli are consciously perceived (Jiang & He, 2006; Morris, Öhman, & Dolan, 1998; Vuilleumier et al., 2001). Similarly, in unilateral neglect patients, the amygdala responds to fearful faces that are extinguished from conscious awareness by the simultaneous presentation of a second stimulus (Vuilleumier et al., 2002). This is not surprising given our subjective sense that the physical sensations associated with emotions often precede our conscious understanding of their origin. It is the activity within these basic modules that is then broadcast via connections with higher-order cortical areas, resulting in the emergence of an emotional experience. Conscious emotional processing involves two related phenomena: the awareness of one’s own feelings and the awareness of another’s feelings. One can be conscious of his or her feelings, and one can also be conscious of the feelings of another.
The extent to which these two functions can be dissociated is yet unclear. This is likely in part because while one’s accuracy in discriminating and perceiving the emotions of others is quantifiable, it is more difficult to capture the validity of one’s own emotional experience. However, it also likely stems from the fact that the conscious experience of one’s own emotions is a more automatic process than empathizing with another (de Vignemont & Singer, 2006). One possibility is that the top-down processes that engage the limbic system drive cognitive processes involved with empathy (Carr, Iacoboni, Dubeau, Mazziotta, & Lenzi, 2003), while bottom-up processes drive the cognitive processes involved in one’s own conscious experience. This would allow for dissociation between the two types of conscious emotional experience. For example, one could be acutely attuned to one’s personal emotional experience, but have particular difficulty
empathizing with another’s experience. Such dissociation could exist even if the two types of conscious emotional processing share neural substrates. Indeed, both the awareness of one’s own emotional experiences and the emotions of others depend critically on two cortical areas at the interface between conscious and nonconscious processing, the insula and the ACC. Neuroimaging studies have demonstrated that both regions are active when one is experiencing pain or empathizing with someone else who is experiencing pain (Singer et al., 2004, 2006). These regions likely make unique contributions to the conscious experience of emotions. The insula has been repeatedly linked to the experience and perception of disgust (Adolphs, Tranel, & Damasio, 2003; Calder et al., 2000), and may play a unique role in processing emotional responses elicited by physical contact between individuals (Olausson et al., 2002). Similarly, activity in the ACC is modulated by the extent to which one is aware of his or her emotions (Lane et al., 1998), and also relates to the experience of basic interoceptive stimuli (Liotti, Brannan, Egan, Shade, Madden, Abplanalp, et al., 2001). This function may reflect a broader role in mediating conscious awareness across perceptual and cognitive domains; as we discussed earlier, the ACC is also critically involved in visual awareness. There is considerable convergent evidence that modules within the right hemisphere play a preferential role in emotional processing. Some of the earliest observations of behavioral changes following unilateral damage to the right hemisphere noted reduced emotional expression and inappropriate affect (e.g., Mills, 1912a, 1912b). Subsequent studies have revealed that right hemisphere damage impacts nearly all aspects of emotional expression and interpretation.
For example, following right hemisphere damage, individuals may have difficulty conveying affective information through the tone of their voice (i.e., prosody) and may also have difficulty interpreting the tone of another’s voice to better understand their emotional experience (Blonder, Bowers, & Heilman, 1991). More recent studies have pointed to a specialized role of the right hemisphere in nonconscious processing of facial expressions (Morris et al., 1998), suggesting that the specialized role of the right hemisphere in emotional processing extends to both nonconscious and conscious processing. The etiology of the right hemisphere’s specialization for emotional processing has been widely debated, but one leading hypothesis is that the differential role of the two hemispheres in emotional processing stems from an asymmetry in the autonomic nervous system. Specifically, the left hemisphere has been associated with parasympathetic activity, while the right hemisphere has been associated with sympathetic activity. Based on this proposal, the left hemisphere is involved in approach or “group-oriented”
emotions, while the right hemisphere is involved in withdrawal or “individual-oriented” emotions (e.g., Craig, 2005). This model neatly accounts for the fact that right hemisphere damage, from a variety of neurological conditions, may result in secondary mania (for review, see Cummings, 1997), which can be conceptualized as a lack of withdrawal behavior. Similarly, abnormal patterns of right hemisphere activity have been observed in patients with bipolar affective disorder who are currently in the manic state (Caligiuri et al., 2004), and visuospatial weaknesses are often observed in patients with chronic bipolar affective disorder (Osuji & Cullum, 2005). However, the model would also predict that social deficits in relating to others would be associated with left hemisphere dysfunction. The preponderance of evidence accumulated to date suggests that these abilities are also related to right hemisphere function. Degeneration of the frontotemporal regions of the right hemisphere may be associated with acquired sociopathy, or diminished regard for the impact of one’s behaviors on others (Mendez, Chen, Shapira, & Miller, 2005), perhaps due to downstream disruption of ACC functioning (Rudebeck, Buckley, Walton, & Rushworth, 2006). Similarly, impaired social communication skills, including empathy for others’ experiences, are the hallmark of several developmental disorders associated with right hemisphere dysfunction. Patients with certain autism-spectrum disorders, specifically nonverbal learning disability (NLD) and Asperger’s syndrome, often exhibit excellent verbal skills, yet struggle to understand the complexities of everyday language, including the implications of prosody and pragmatics, and exhibit concomitant cognitive weaknesses suggestive of right hemisphere dysfunction (Rourke et al., 2002).
It is yet unclear whether the autonomic nervous system asymmetry model can account for these findings, but regardless of etiology, it is clear that the right hemisphere has a specialized role in emotional processing. Altogether, current research strongly suggests that the limbic system is comprised of the neural modules involved in emotional processing, and that some of those modules within the right hemisphere play a specialized role in emotional processing. At the most fundamental level, the hypothalamus and amygdala are involved in nonconscious processing of affective stimuli, particularly those related to the basic emotion of fear. The interaction of these modules with higher-order cortical areas, particularly the insula and the ACC, gives rise to the conscious experience of feelings, likely combining input from widespread cortical modules involved in other cognitive functions. Ultimately, this system enables not only an understanding of one’s own emotional experiences, but also an understanding of the emotional experiences of others.
Memory and Consciousness
Recall that in defining the components of consciousness, core consciousness refers to an awareness of a stimulus that is limited to the present moment, while extended consciousness refers to the integration of past experiences to elaborate on this representation (Damasio, 1999). As such, extended consciousness is inextricably linked to memory, the latter of which inserts novel subjective qualia into the present experience and thereby allows us to forge a temporal link between the present moment, the past, and an imagined future (Tulving, 2002). One may have some limited conscious awareness of this binding process by explicitly recollecting a past experience. However, the fusion between one’s current experience and one’s past may also occur on an implicit level. These parallel processes reflect the structure of networks involved in the consolidation of new memories. A distinction is often drawn between long-term memories that are procedural and those that are declarative (Tulving, 1983). Procedural memory encompasses learned perceptual, motor, and cognitive skills and is largely nonconscious. For example, learning to ride a bicycle requires rote practice until the integration of the sensory and motor components is seamless. Once learned, this skill can be called on automatically, so that you can ride almost any bicycle, in almost any situation, with relative ease and with relatively little explicit thought to the process involved. If you imagine your own experience of learning to ride a bicycle, and your most recent experience of riding a bicycle, it quickly becomes clear that procedural learning, and the retrieval of procedural memories, is largely unavailable to conscious awareness. Procedural learning is critically dependent on basal ganglia functioning, particularly that of the striatum.
Patients with neurodegenerative diseases that disrupt striatal functioning, such as Parkinson’s or Huntington’s disease, have difficulty learning new skilled procedures (Squire & Zola, 1996). Yet, representations of procedural skills are likely diffusely distributed throughout the brain, involving particular regions of frontal cortex, the basal ganglia, and the cerebellum (e.g., Ullman, 2001), thus the sudden loss of a wide variety of well-learned skills is relatively rare. While the procedural memory system operates automatically and implicitly, the experiences spanning acquisition and application of these skills are stored in a declarative memory system that has two components—semantic and episodic memory. Semantic memory refers to a particular class of learned information that is impersonal and not necessarily tied to a specific moment in the past. Thus, semantic memory can be conceptualized as a collection of facts. Recollection of these facts is associated with noetic conscious experience, or the feeling of knowing (Tulving, 1985).
The second component of declarative memory, episodic memory, is thought to have evolved from semantic memory because it relates to a special type of fact: memory of the actual experience of events from the first-person point of view (Squire & Zola, 1996). For instance, recalling the capital city of one’s home state, an objective fact, requires semantic memory. However, the memory of a particular visit to that city, a specific and personal past experience, requires episodic memory. Both semantic and episodic knowledge require explicit learning and retrieval, and the consolidation of this information into permanent representations is critically dependent on medial temporal structures. This is demonstrated most clearly by anterograde amnesia, a disorder characterized by an inability to form new declarative memories that results from bilateral damage to medial temporal structures, including the hippocampus (Scoville & Milner, 1957). Episodic memory is critically linked to extended consciousness. In the words of Endel Tulving, “The essence of episodic memory lies in the conjunction of three concepts—self, autonoetic awareness, and subjectively sensed time” (2002, p. 5). Each of these three concepts plays a central role in extended consciousness, an indication of why episodic memory itself is critical for the fullest realization of extended consciousness. First, episodic memory provides the basis for a much more elaborate form of self, which Damasio (1999) has called the “autobiographical self.” This construction of the self is not merely a distinction between the neural representation of the physical self and representations of external objects. Instead, it is a self with explicit access to its past.
Second, just as recalling a fact has a specific subjective flavor, remembering an experience is associated with a unique feeling, which Tulving refers to as “autonoetic awareness.” Investigation of this experience provides another angle from which to investigate the neural correlates of consciousness. Finally, episodic memory liberates humans from the present moment, enabling mental time-travel forward and backward. In addition to producing anterograde amnesia, bilateral damage to the medial temporal lobe structures can disrupt autonoetic conscious experience and one’s ability to think about the future. Tulving (1985) describes patient K. C., who sustained damage to the hippocampus and parahippocampal gyrus, and who is unable to conceptualize what he may be doing in the near future, describing his state of mind when asked to do so as “blank.” A similar pattern of deficit has been noted in other case studies (Kitchener, Hodges, & McCarthy, 1998; Klein & Loftus, 2002), and a study of five patients with retrograde episodic amnesia caused by bilateral hippocampus damage revealed a diminished ability to imagine future events (Hassabis, Kumaran, Vann, & Maguire, 2007). Additionally, age-related deterioration of
episodic memory has recently been demonstrated to correlate with a decreased ability to mentally simulate rich, detailed future events (Addis, Wong, & Schacter, 2008). Thus, at a general level, extended consciousness without episodic memory resembles in many ways primitive core consciousness. In both cases, the organism is trapped in the present moment. However, this version of impaired extended consciousness still provides a greater breadth of conscious experience and conscious cognitive function. For instance, anterograde amnesia patients exhibit remarkable preservation of functions in other cognitive domains, including attention, perception, and motor abilities (Tulving, 2002). Similarly, aspects of one’s core self, including values, beliefs, and morals, may be preserved, even in the absence of the knowledge base that may have guided their formation (Corkin, 2002). This is further indication of the modular nature of extended consciousness. Because consciousness emerges from concerted activity across distributed modules, even damage that eliminates the subjective past and future is unable to decimate the entirety of extended consciousness. Instead, it dramatically reduces the repertoire of modules available to be integrated into conscious experience. Importantly, associating activity in episodic memory modules with autonoetic conscious experience provides another approach for investigating the neural basis of extended consciousness. Based on the location of K. C.’s damage, the medial temporal lobe is presumably involved. Additionally, early positron emission tomography (PET) studies found that frontal lobe activity was associated with autonoetic consciousness (Tulving et al., 1994), leading to a theory that tied episodic memory to frontal lobe function (Wheeler, Stuss, & Tulving, 1997). 
Subsequent EEG evidence supported the distinction between noetic and autonoetic conscious experience because each was associated with a distinct EEG pattern that indicated orchestrated activity in temporal, parietal, and frontal regions (Düzel, Yonelinas, Mangun, Heinze, & Tulving, 1997). More recently, Addis and colleagues (2008) performed a study in which they used an event-related design to distinguish between the correlates of constructing an episodic memory or imagined future event and the correlates of elaborating (remembering or imagining specific details) that event. Construction of events, both past and future, was associated with activity in the left hippocampus, bilateral inferior temporal and fusiform cortices, and superior and middle occipital gyrus. Construction of future events also engaged frontal and prefrontal regions associated with prospective thinking and semantic generation. Elaboration of past or imagined events commonly involved engagement of a network associated with autobiographical memory retrieval (Maguire, 2001), including the parahippocampal
cortex, retrosplenial cortex, posterior cingulate, and precuneus. Additional activity was seen in the left medial PFC. These results, as well as other similar findings (see Schacter, Addis, & Buckner, 2007, for a review), are remarkable for at least two reasons. First, they have greatly enhanced our understanding of memory and are the basis of new theories about how the past contributes to construction of mental images of the future (Hassabis & Maguire, 2007; Schacter et al., 2007). Second, though it has rarely been stated, these results also inform our understanding of how the brain generates conscious experience. Indeed, modules that have classically been associated with memory, such as medial temporal lobe structures, consistently interact with frontal and parietal areas, as well as posterior modules involved in visual perception when visual imagery is involved, in order to produce autonoetic conscious experience. Remembering and imagining specific future events involve similar, distributed neural correlates that generate distinct conscious content. Furthermore, we begin to see how integration, the topic of the final section of this chapter, may occur. In this case, modules that initially processed an experience (e.g., visual modules) were recruited during elaboration of episodic memory of that experience. An elegant study by Daselaar and colleagues (2008) expanded on this observation. They found that vividness of recall, which they call “reliving,” was associated with increased activity in extrastriate visual cortex. They also demonstrated that emotional intensity ratings correlated with frontopolar activity during elaboration of the memory (other areas, such as the amygdala, temporoparietal regions, and inferior frontal gyrus were modulated by emotional intensity during the retrieval process only). 
Thus, while all memories invoked an autonoetic experience, each experience was associated with unique perceptual and emotional flavors that could be traced to activity in discrete modules. This distributed activity was integrated to form subjectively rich conscious content.
Conscious Awareness of Action
There are some patients who possess functioning modules responsible for attention, perception, and memory but are nonetheless nearly diagnosed as being in a vegetative state. It seems unthinkable that a patient possessing much of the expanse of extended consciousness could be mistaken for someone without even core consciousness, but this is an unavoidable consequence of trying to diagnose an intrinsically first-person experience from a third-person point of view. As we have repeatedly conceded, scientists are utterly reliant on self-report to assess consciousness in other individuals. If all motor function, including that required for speech, is lost, it is very difficult to ascribe
consciousness to another individual. Thus, patients suffering from locked-in syndrome are essentially conscious beings trapped in a paralyzed body. Their only means of communication is often a preserved ability to execute vertical or lateral eye movements (Laureys et al., 2007). This simple, subtle means of communication reveals a fully functional mind. In fact, one locked-in patient wrote a stunning memoir by using left eyeblinks to select letters from a recited stream of letters, crafting his masterpiece literally letter-by-letter (Bauby, 1997). It is important to realize that locked-in syndrome further contributes to the modular hypothesis of extended consciousness. Abolishing motor modules does not eliminate extended consciousness altogether; it merely limits its scope. Yet, there is something peculiar about addressing this set of modules. It has nothing to do with actual neural mechanisms; indeed, the principles that govern these modules, their interactions, and their role in generating conscious experience are presumably no different from those of the various other modules discussed thus far. Rather, it is a philosophical point. Understanding the neural basis of the conscious experience of intending and executing an act inevitably mounts an attack on free will, one of the most fundamental and deeply cherished human doctrines. Objective reality clashes with intuition, cruelly leveling the edifice of free will. However, we view this conclusion in a far more optimistic light. Just as understanding the neural basis of consciousness will not make our conscious experience any less rich and wondrous, understanding that the brain determines behavior while simultaneously producing the useful illusion of free will does not detract from our day-to-day feeling of controlling our thoughts and actions (see Aharoni, Funk, Sinnott-Armstrong, & Gazzaniga, 2008; Gazzaniga, 2000, 2005). 
Despite the weighty issues surrounding this subject, we shall continue to adhere to empirical data and consider the various modules involved in intention and motor awareness. Because we are purely interested in examining the neural bases of conscious experience and not the intricacies of the various neural causes of action initiation and execution, we chose here to focus on the two most tractable types of conscious content associated with this class of modules: conscious intention and motor awareness. The former is associated with voluntary actions and subjectively precedes awareness of action initiation. The latter is not specific to a class of actions and is a more general experience that is associated with movement, regardless of cause.
Conscious Intentions
The work of the late Benjamin Libet was responsible for igniting years of intense research on conscious intentions (see Libet, 2004, for a summary of his work). In their
classic experiment, Libet, Gleason, Wright, and Pearl (1983) recorded event-related potentials (ERPs) while subjects voluntarily lifted their right hand at a time of their choosing. Subjects were instructed to observe the location of a dot rotating around a screen in front of them and to remember the location of the dot when they first felt the onset of the conscious decision to move their hand, or the onset of the intention. EEG recordings revealed a distinct wave of activity that preceded each hand movement by around 500 ms, called the “readiness potential.” Libet found that the onset of the readiness potential also preceded the conscious experience of intention by about 300 ms. The brain appeared to be preparing for the action before the subject had any subjective notion of intending to move his or her hand. This observation prompted the speculation that such activity might be the neural precursor or perhaps correlate of conscious intention. Though some have questioned Libet’s methods and conclusions (Kilner, Vargas, Duval, Blakemore, & Sirigu, 2004; Mele, 2006), the spirit of his work indisputably pervades present investigations of the neural correlates of conscious intention. While the readiness potential (RP), a rise in negative potential (i.e., change in electrical activity) preceding voluntary initiation of a movement, was initially associated with the conscious experience of intending and perhaps motor initiation itself, more recent research has revealed that the readiness potential is also present when an individual watches another person perform an act (Kilner et al., 2004). Thus, this activity may reflect more general motor processes engaged by planning, acting, or observing action. So while the RP is, in principle, quite interesting, it is necessary to go beyond the poor spatial resolution of EEG and more closely scrutinize the neural events underlying intention formation.
Toward this end, Lau, Rogers, Haggard, and Passingham (2004) found that attending to the experience of intention is marked by increased activity in the presupplementary motor area (pre-SMA) relative to activity observed when one attends to the actual action. This increased activity was accompanied by increased activity in the dorsal prefrontal cortex (dPFC; BA 46) and the intraparietal sulcus. Furthermore, connectivity between the pre-SMA and dPFC was enhanced in the attention to intention condition. The authors suggest that “attention to intention may be one mechanism by which effective conscious control of actions becomes possible” (p. 1210). Thus, these results are crucial for two reasons: Pre-SMA activity is associated with awareness of intentions and connectivity between the pre-SMA and dPFC may represent a pathway by which cognitive control systems could influence intentions and thus behavior. Additional studies have provided further evidence that the pre-SMA is a key node for intention formation (Hoshi & Tanji, 2004; Nachev, Rees, Parton, Kennard, & Husain,
2005). However, a study of patients with damage to the angular gyrus of the parietal lobe (Sirigu et al., 2004) suggests that these frontal areas alone may not be sufficient for proper conscious experience of intentions. Sirigu and colleagues repeated Libet’s experiments with three groups of subjects: patients with lesions to the angular gyrus, patients with cerebellar damage, and healthy controls. Parietal patients reported the experience of intention to be nearly at the initiation of action (20 ms after action initiation), whereas the other two groups properly judged the intention as occurring about 200 ms before action initiation. None of the groups had any difficulty initiating voluntary movements, and all three groups properly judged the time of action initiation on separate trials. Importantly, EEG data indicated a lack of an RP when the parietal patients judged the time of the onset of intention and a greatly reduced RP when they judged the time of action onset. Though the RP is not uniquely associated with the conscious experience of intention, the reduction of the RP in patients with parietal damage is nonetheless significant because it is likely indicative of a loss of reentrant feedback from the parietal lobe to the frontal lobe (Sirigu et al., 2004). Because this disruption to the normal connectivity is associated with a deficit in conscious intention formation, these results emphasize the importance of proper connectivity, particularly between specific modules in the frontal lobe and parietal lobe, for normal conscious experience. In summary, study of conscious intention indicates that attending to the experience of intention requires proper connectivity between the dPFC and the pre-SMA, while the experience of intention relies on proper connectivity between the frontal motor areas and the angular gyrus of the parietal lobe. 
Motor Awareness
In the experiments described in the preceding section, judgments of the onset of action rely on conscious awareness of the movement. Yet, awareness of limb movement is a difficult problem to experimentally approach due to the unchanging nature of bodily awareness. The constant input received by areas that represent limbs is more difficult to manipulate than the content of visual or emotional experience. Nonetheless, Tsakiris, Hesse, Boy, Haggard, and Fink (2007) devised a clever solution to this problem. They found that simultaneously stroking a visible rubber hand and the subject’s hand, which was hidden from view, led to the subjective experience of ownership over that limb. Expressed ownership of the limb was associated with activity in the right posterior insula and right frontal operculum. Thus, by artificially engaging the area of the brain that represents bodily awareness, they were able to lead individuals to believe that a rubber arm belonged to them.
Additional evidence for the role of the posterior insula in bodily awareness is found in the patient literature. Damage to the posterior insula results in an inability to detect one’s paralysis. A study of two groups of patients, one with anosognosia for hemiplegia and one with hemiplegia without anosognosia, found that the group with anosognosia (or denial of their deficit) had differential damage to the posterior insula (Karnath, Baier, & Nägele, 2005). Thus, patients with hemiplegia/hemiparesis who nonetheless have intact posterior insula are aware of their paralysis, while patients with damage to this area are oblivious. These results suggest that the posterior insula is critical for bodily awareness and consequently, for proper awareness of action execution (Farrer et al., 2008).
UNITY OF CONSCIOUSNESS
Until now, we have focused extensively on the various modules that process different classes of conscious content. We have repeatedly encountered empirical evidence of the importance of reciprocal connectivity between these modules and frontal and parietal areas for normal conscious experience. Such evidence supports theories, like the global workspace, that posit long-range connectivity as a mechanism for the sharing of information across processors. Indeed, the integration of activity across interconnected modules throughout the cortex is a hallmark of most leading theories of the neural basis of consciousness (Crick & Koch, 2003; Dehaene & Naccache, 2001; Tononi & Edelman, 1998). But despite the empirical momentum afforded to these theories, their requirement of extensive, long-distance reciprocal connectivity as a basis for normal conscious experience seems fundamentally incongruent with the undisturbed conscious unity experienced by the left hemisphere of the split-brain.
The Split-Brain and the Interpreter
Over the past 5 decades, investigation of patients who have had their corpus callosum transected in order to control the spread of epileptic seizures has led to great insight into the functional specialization of the hemispheres of the human cerebral cortex (see Gazzaniga, 2000, for a review). Disconnection of the hemispheres leads to specific deficits in each hemisphere, revealing a substantial amount of specialization in each hemisphere. Yet, despite these deficiencies, each hemisphere retains a surprising amount of functionality. Furthermore, though it is difficult to directly characterize the experience of the mute right hemisphere, there is considerable evidence that each hemisphere generates a unique conscious experience (Gazzaniga, LeDoux, &
Wilson, 1977). Even more striking, the left brain does not miss the right brain and the conscious experience that it generates (Gazzaniga & Miller, 2009). Conscious unity is somehow preserved in the left hemisphere of the split-brain patient, an observation that dramatically undermines explanations of the neural basis of consciousness that rely upon extensive reentrant connectivity in a global network that spans both hemispheres. A second observation derived from split-brain research that we maintain is of singular importance to coherent, integrated conscious experience is the remarkable tendency of the left hemisphere to tirelessly generate, and wholeheartedly believe, hypotheses that explain the actions initiated by the disconnected right hemisphere. This interpretive process was first discovered while testing split-brain patient P.S. on a simultaneous concept task (Gazzaniga & LeDoux, 1978). In this task, each hemisphere is simultaneously shown a different picture and is asked to select a related picture out of a lineup of eight pictures. The left hemisphere was shown a chicken claw, and the right hemisphere was shown a snow scene. Next, P.S. was asked to choose a related picture with both hands. His right hand (controlled by his left hemisphere) selected a picture of a chicken, and his left hand selected a shovel. The hemispheres could not exchange visual information but could, however, see the picture selected by the other hemisphere. Because language is almost always localized in the left hemisphere (Gazzaniga et al., 1977; Lenneberg, 1967), asking a patient to describe something verbally assesses the knowledge of only the left hemisphere in callosectomy patients. Interestingly, when P.S. was asked why he chose the shovel, rather than replying “I don’t know. I had a surgery and now transfer of information between my left and right hemispheres is hindered, so my left hand does things for reasons I cannot describe,” he said “Easy.
The chicken goes with the claw, and you need the shovel to clean out the chicken house.” This shocking response led one of us (MSG) to propose the existence of an interpretative process, located in the left hemisphere, which is responsible for constantly generating explanations that make sense of our interactions in the physical and social environment. In support of this theory, further investigation revealed that the left hemisphere automatically interprets even more complicated actions initiated by the right hemisphere. For instance, when a command such as “laugh” is tachistoscopically presented to the right hemisphere, the patient begins laughing. When asked why, the patient responds, “You guys come up and test us every month. What a way to make a living!” Or, if the command is “walk,” the patient typically stands up and begins to leave the testing van. When asked where he or she is going, the interpreter responds, “I’m going into the house to get a Coke” (Gazzaniga, 1983).
The existence of the interpreter explains previous experimental results obtained by Schachter and Singer (1962). They injected subjects with epinephrine and subsequently allowed them to interact with either a euphoric or an angry confederate. Epinephrine activates the sympathetic nervous system, causing an increased heart rate, hand tremors, and facial flushing. Some participants were told that they had been injected with epinephrine and were informed about its effects. When asked why they were aroused and displayed the symptoms they did during their interactions with the confederate, they attributed the response to the drug. However, other subjects were not informed of the drug’s properties, and this group attributed the autonomic arousal to the interaction with the confederate. They felt the interaction resulted in an experience of elation or anger, and that this was the source of the autonomic arousal. This finding illustrates the human tendency to generate causal explanations for events, which the foregoing split-brain research indicates is a function of the left hemisphere. Accordingly, we assert that the brain continuously constructs our conscious experience bit by bit via activity in parallel, specialized modules, and it quickly interprets this experience after it occurs (Roser & Gazzaniga, 2004). This theory of conscious experience, derived from split-brain research, explains other extraordinary deficits and bizarre patterns of deficit denial arising after various types of cortical damage (Cooney & Gazzaniga, 2003; Gazzaniga, 2000). For instance, patients suffering from anosognosia for hemiplegia strangely deny that their limbs are paralyzed or, in even more extreme forms, deny ownership of the limb and attribute it to another agent (Karnath et al., 2005).
Cereda, Ghika, Maeder, and Bogousslavsky (2002) report a 75-year-old woman who “woke up suddenly in the night with a sensation of being touched by a stranger hand and alarmed by a foreign body in her bed, not recognizing her own left upper limb” (p. 1953). Though unusual and by most accounts difficult to comprehend, we posit that such responses are a natural consequence of the modular brain and the ever-active interpreter. The interpreter analyzes activity in modules that are accustomed to receiving input about the state of the body. When the input to these modules is unusual or absent, the module conveys the problem and, for instance, the subjective experience of being paralyzed emerges. However, when these bodily awareness modules are damaged, the conscious representation of the limb simply disappears. Meanwhile, the interpreter does the best it can with whatever information is present. Without input from the apparatus that usually indicates that “my hand is doing what I want it to” or even “I can’t feel or control my hand,” the interpreter concludes the hand must belong to someone else. Another telling example is that of Capgras’ syndrome. Patients are fully able to perceive that the person in front of
them looks identical to a loved one, but they are adamant that it is really an imposter. The present understanding of these delusions is that the emotional response normally elicited by a familiar individual (an example of nonconscious emotional processing because one does not feel this every time one sees one’s significant other) becomes disconnected from the representation of the person (Ramachandran, 1998). Normally, a simultaneous visual and emotional response leads to positive identification. When the emotional response is absent, the interpreter is faced with paradoxical information, so it hypothesizes that the person must be an imposter or alien. Perhaps one of the most unusual syndromes is reduplicative paramnesia, a deficit that involves believing multiple copies of a place exist (Gazzaniga, 2000). One patient mistakenly believed that she was in her home in Freeport, Maine, while she was examined at New York Hospital. She was otherwise quite intelligent and fully understood that her opinion was at odds with that of her doctors. But she held steadfast to her strange belief. Attempts to convince her that she could not possibly be at her house were quickly processed and refuted by the interpreter. For instance, when the doctors pointed out elevators in the hallway, she quickly responded, “Doctor, do you know how much it cost me to have those put in?” (Gazzaniga, 2000). The activity of the interpreter may also explain other, more subtle observations. Recall that patients with damage to the angular gyrus have difficulty making temporal judgments of the onset of intention (Sirigu et al., 2004). They judge the onset of intention at the same time that they judge the onset of action (on separate trials). It is possible that they know intentions are meant to precede voluntary actions and when forced to identify the onset of intention, they report it as soon as they note movement because they realize it should have occurred by that time. 
If this is the case, conscious intention may be absent but the interpreter may infer the occurrence of intention in order to account for the action. The concept of an interpreter compellingly explains how the feeling of mental unity can arise from the modular brain. Conscious experience emerges from activity in discrete modules, leading to specific subjective content. The interpreter makes sense of this content by making causal inferences immediately after the experiences occur, constantly interpreting and seamlessly linking the serial conscious moments into a united whole. Insults to the nervous system corrupt the input to modules specialized for processing and representing certain information. When this occurs, the conscious content constructed by these processors conveys this malfunction and the deficit is experienced as such. However, if the processor itself is
corrupted, the bit of conscious experience it usually contributed disappears without any higher cortical remnant to note its absence. The interpreter, not at all disconcerted by such destruction, continues interpreting the remaining conscious experiences as they occur. Thus, as one of us has previously written: “The ‘interpreter’ is a specialized system that makes sense of all the information bombarding the brain, interpreting our responses—cognitive or emotional—to what we encounter in our environment, asking how one thing relates to another, making hypotheses, bringing order out of chaos, creating a running narrative of our actions, emotions, thoughts, and dreams. The interpreter is the glue that keeps our story unified and creates our sense of being into a coherent, rational agent. It is the insertion of the interpreter into an otherwise functioning brain that enables such a rich experience” (Gazzaniga, 2008). The interpreter fills a substantial, and often ignored, void in other accounts of conscious unity. Other theories focus on a separate aspect of conscious integration, the so-called “binding problem,” or the problem of how various features and objects that are processed in distributed brain regions end up correctly integrated in a percept (see Engel, Fries, König, Brecht, & Singer, 1999, for a review). Though the binding problem is most often discussed in regard to visual perception, some have suggested that the underlying mechanism of synchronized activity across modular processors must be more robustly applicable to other aspects of conscious experience (Engel et al., 1999; W. Singer, 2001). For example, precisely synchronized activity may explain how emotional and visual content is integrated into autonoetic experience (Daselaar et al., 2008).
Certainly, understanding the set of neural mechanisms that operate in tandem to generate simultaneous, properly bound qualia and thus a united conscious landscape is essential to a complete characterization of the neural basis of conscious experience. But such mechanisms are purely constructive and thus require a complementary, interpretive mechanism for truly coherent, united conscious experience to emerge (Roser & Gazzaniga, 2004).
Integrating Empirical Evidence for Frontal and Parietal Connectivity
Evidence from a spectrum of patient populations, most notably split-brain patients, indicates that discrete aspects of conscious experience emerge from localized modules rather than a global network. Nonetheless, much of this chapter highlighted contrary evidence that associated long-distance connectivity between modules with conscious experience. Rather than reject or discredit this evidence, we instead embrace it and speculate that it can be
incorporated into the proposed framework when cast under a different light. Philosopher Ned Block (1995) argued for two different types of consciousness, each supported by unique neural correlates. He maintains that phenomenal consciousness is the “what it’s like” to be in a conscious state, and access consciousness is the information processing component of conscious experience. Importantly, one hallmark of access consciousness is that it is reportable, which requires maintenance of the content to be reported and likely involves working memory and attention (Block, 1995, 2005; Sergent et al., 2005; Courtney, Petit, Haxby, & Ungerleider, 1998). Though this perspective is quite useful in the present context, we do not see the need to define two separate types of consciousness. Instead, we view the processes that make up access consciousness as a subset of modules that are associated with unique phenomenal content. The global workspace is fit to describe the function of processes that are associated with access consciousness, such as working memory, episodic memory, attention, and even explicit emotion (Block, 2005). These processes require coordination by a central executive, or areas involved with cognitive control (Dehaene & Naccache, 2001), and likely involve mechanisms of competition between modules in a way that construction of conscious experience more generally does not. Tasks aimed at elucidating the mechanisms of pure conscious construction may be confounded by the fact that assessment of conscious content requires self-report, which in turn engages attention and working memory systems. Some studies that have associated frontoparietal activity with conscious visual perception have acknowledged that this activity could be responsible for the ability to report the experience rather than construction of the conscious percept itself (Lau & Passingham, 2006; Sergent et al., 2005). 
This view of consciousness explains a key characteristic of conscious experience that was alluded to in the earlier discussion of attention. While cognitive processes like attention and working memory are known to be limited (Marois & Ivanoff, 2005), subjective conscious experience operates under different constraints. To demonstrate this, consider the act of reading this page. It is impossible to attend to all of the words simultaneously, and we can only maintain a limited number of words in working memory if we close our eyes. Nonetheless, there is a phenomenal representation of the rest of the page, the book, and its surroundings. Rather than thinking of the neural basis of this experience as an enormous, interacting coalition, we maintain it is constructed piecewise by parallel activity. Though many qualia may be produced in parallel, conscious experience feels serial in nature because it is constrained by reportability, a process that relies on limited processes
like attention and working memory. Furthermore, limited resources produce competition, so the foundational notion of competition within the workspace need not be abandoned when considering the facets of conscious experience that pertain to Block’s access consciousness (1995). By this view, the workspace does not underlie conscious experience in a general sense. Consequently, splitting the workspace in half by severing the corpus callosum is not predicted to have a drastic effect on conscious experience. One hemisphere of the split-brain patient cannot report stimuli presented to the opposite hemisphere, but this is of little concern because the ignorant hemisphere does not realize that the other side of the world exists. Unaffected connections between frontal and parietal areas with more posterior regions, a hemiworkspace, facilitate the existing ability to generate content related to access consciousness, such as attending to and reporting stimuli presented to that hemisphere. This is consistent with Tononi’s (2004) claim that the internal architecture of each hemisphere is sufficient to generate private conscious experience based on its ability to process and integrate a broad range of information. Hence, through the activity of both localized modules and distributed modules that depend on the hemiworkspace, conscious experience of one side of the world and of the remaining capabilities of that hemisphere emerges. The disappearance of a whole other side of the world and a more complete set of capabilities, a seemingly epic event, passes quietly. And in the left hemisphere, the resilient interpreter continues to narrate.
SUMMARY
Analysis of a considerably diverse spectrum of conscious experience, from the basic representations facilitated by core consciousness to emotional episodic memories and stirring visual panoramas, has led to the conclusion that conscious experience is an emergent property that is associated with activity in specific modules. Localized activity in these modules is both necessary and sufficient for a range of phenomenal conscious experience to emerge, though some conscious content, such as visual images sustained by working memory or the experience of reporting conscious content, inherently requires long-distance connectivity between modules. These latter connections and the mechanisms by which they operate contribute serial conscious content associated with specific functions, which is assimilated with the rest of the conscious content that is constructed in parallel by other local modules. Meanwhile, the interpreter busily explains this experience immediately after it occurs. The overall product is an endlessly rich and amazingly coherent conscious experience.
REFERENCES
Addis, D., Wong, A., & Schacter, D. (2008). Age-related changes in the episodic simulation of future events. Psychological Science, 19, 33–41.
Adolphs, R., Tranel, D., & Damasio, A. (2003). Dissociable neural systems for recognizing emotions. Brain and Cognition, 52, 61–69.
Aharoni, E., Funk, C., Sinnott-Armstrong, W., & Gazzaniga, M. (2008). Can neurological evidence help courts assess criminal responsibility? Lessons from law and neuroscience. Annals of the New York Academy of Sciences, 1124, 145–160.
Andersen, R., Snyder, L., Bradley, D., & Xing, J. (1997). Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annual Review of Neuroscience, 20, 303–330.
Baars, B. J. (1988). A cognitive theory of consciousness. New York: Cambridge University Press.
Baars, B. J. (1995). Tutorial commentary: Surprisingly small subcortical structures are needed for the state of waking consciousness, while cortical projection areas seem to provide perceptual contents of consciousness. Consciousness and Cognition, 4, 159–162.
Baars, B. J. (1997). Some essential differences between consciousness and attention, perception, and working memory. Consciousness and Cognition, 6(2–3), 363–371.
Barceló, F., Suwazono, S., & Knight, R. (2000). Prefrontal modulation of visual processing in humans. Nature Neuroscience, 3, 399–403.
Bauby, J.-D. (1997). The diving bell and the butterfly. New York: Knopf/Random House.
Beck, D., Rees, G., Frith, C., & Lavie, N. (2001). Neural correlates of change detection and change blindness. Nature Neuroscience, 4, 645–650.
Block, N. (1995). On a confusion about a function of consciousness. Behavioral and Brain Sciences, 18, 227–247.
Block, N. (2005). Two neural correlates of consciousness. Trends in Cognitive Sciences, 9, 46–52.
Blonder, L., Bowers, D., & Heilman, K. (1991). The role of the right hemisphere in emotional communication. Brain, 114, 1115–1127.
Bogen, J. (1995). On the neurophysiology of consciousness: Pt. I. An overview. Consciousness and Cognition, 4, 52–62.
Broadbent, D., & Broadbent, M. (1987). From detection to identification: Response to multiple targets in rapid serial visual presentation. Perception and Psychophysics, 42, 105–113.
Calder, A. J., Keane, J., Manes, F., Antoun, N., & Young, A. W. (2000). Impaired recognition and experience of disgust following brain injury. Nature Neuroscience, 3, 1077–1078.
Caligiuri, M., Brown, G., Meloy, M., Eyler, L., Kindermann, S., Eberson, S., et al. (2004). A functional magnetic resonance imaging study of cortical asymmetry in bipolar disorder. Bipolar Disorders, 6, 183–196.
Carr, L., Iacoboni, M., Dubeau, M., Mazziotta, J., & Lenzi, G. (2003). Neural mechanisms of empathy in humans: A relay from neural systems for imitation to limbic areas. Proceedings of the National Academy of Sciences, USA, 100, 5497–5502.
Cereda, C., Ghika, J., Maeder, P., & Bogousslavsky, J. (2002). Strokes restricted to the insular cortex. Neurology, 59, 1950–1955.
Chalmers, D. (1995). Facing up to the problem of consciousness. Journal of Consciousness Studies, 2, 200–219.
Coleman, M., Rodd, J., Davis, M., Johnsrude, I., Menon, D., Pickard, J., et al. (2007). Do vegetative patients retain aspects of language comprehension? Evidence from fMRI. Brain, 130, 2494–2507.
Colvin, M., Handy, T., & Gazzaniga, M. S. (2003). Hemispheric asymmetries in the parietal lobes. Advances in Neurology, 93, 321–334.
Cooney, J., & Gazzaniga, M. S. (2003). Neurological disorders and the structure of human consciousness. Trends in Cognitive Sciences, 7, 161–165.
8/17/09 2:24:15 PM
References 501 Corkin, S. (2002). What’s new with the amnesic patient, H. M.? Nature Reviews: Neuroscience, 3, 153–160. Courtney, S., Petit, L., Haxby, J., & Ungerleider, L. (1998). The role of prefrontal cortex in working memory: Examining the contents of consciousness. Philosophical Transactions of the Royal Society of London, B, 353, 1819–1828. Craig, A. (2005). Forebrain emotional asymmetry: A neuroanatomical basis? Trends in Cognitive Sciences, 9, 566–571. Crick, F., & Koch, C. (2003). A framework for consciousness. Journal of Neuroscience, 6, 119–126.
Fodor, J. A. (1983). Modularity of mind: An essay on faculty psychology. Cambridge, MA: MIT Press. Gazzaniga, M. S. (1983). Right hemisphere language following brain bisection. A 20-year perspective. American Psychologist, 38, 525–537. Gazzaniga, M. S. (1989). Organization of the human brain. Science, 245, 947–952. Gazzaniga, M. S. (2000). Cerebral specialization and interhemispheric communication: Does the corpus callosum enable the human condition? Brain, 123(Pt. 7), 1293–1326.
Crottaz-Herbette, S., & Menon, V. (2006). Where and when the anterior cingulate cortex modulates attentional response: Combined fMRI and ERP evidence. Journal of Cognitive Neuroscience, 18, 766–780.
Gazzaniga, M. S. (2005). The ethical brain. New York: Dana Press.
Cummings, J. (1997). Neuropsychiatric manifestations of right hemisphere lesions. Brain and Language, 57, 22–37.
Gazzaniga, M. S. (2008). Human: The science behind what makes us unique. New York: HarperCollins.
Damasio, A. (1985). Prosopagnosia. Trends in Neurosciences, 8, 132–135.
Gazzaniga, M. S., & LeDoux, J. E. (1978). The integrated mind. New York: Plenum Press.
Damasio, A. (1999). The feeling of what happens: Body and emotion in the making of consciousness (1st ed.). New York: Harcourt Brace. Daselaar, S., Rice, H., Greenberg, D., Cabeza, R., LaBar, K., & Rubin, D. (2008). The spatiotemporal dynamics of autobiographical memory: Neural correlates of recall, emotional intensity, and reliving. Cerebral Cortex, 18, 217–229. Dehaene, S., & Changeux, J.-P. (2004). Neural mechanisms for access to consciousness. In M. Gazzaniga (Ed.), The Cognitive Neurosciences III. Cambridge, MA: MIT Press. Dehaene, S., Kerszberg, M., & Changeux, J. P. (1998). A neuronal model of a global workspace in effortful cognitive tasks. Proceedings of the National Academy of Sciences, USA, 95,14529–14534. Dehaene, S., & Naccache, L. (2001). Towards a cognitive neuroscience of consciousness: Basic evidence and a workspace framework. Cognition, 79(1–2), 1–37. Dennett, D. (1993). The message is: There is no medium. Philosophy and Phenomenological Research, 53, 889–931. Desimone, R., & Gross, C. (1979). Visual areas in the temporal cortex of the macaque. Brain Research, 178(2–3), 363–380. de Vignemont, F., & Singer, T. (2006). The empathic brain: How, when and why? Trends in Cognitive Sciences, 10, 435–441. Devinsky, O., Morrell, M., & Vogt, B. (1995). Contributions of anterior cingulate cortex to behaviour. Brain, 118, 279–306. Driver, J., & Vuilleumier, P. (2001). Perceptual awareness and its loss in unilateral neglect and extinction. Cognition, 79(1–2), 39–88. Düzel, E., Yonelinas, A., Mangun, G., Heinze, H., & Tulving, E. (1997). Event-related brain potential correlates of two states of conscious awareness in memory. Proceedings of the National Academy of Sciences, USA, 94, 5973–5978. Edelman, G. M. (1989). The remembered present: A biological theory of consciousness. New York: Basic Books. Engel, A., Fries, P., König, P., Brecht, M., & Singer, W. (1999). Temporal binding, binocular rivalry, and consciousness. Consciousness and Cognition, 8, 128–151. 
Etkin, A., & Wager, T. (2007). Functional neuroimaging of anxiety: A meta-analysis of emotional processing in PTSD, social anxiety disorder, and specific phobia. American Journal of Psychiatry, 164, 1476–1488.
c25.indd 501
Fendrich, R., Wessinger, C., & Gazzaniga, M. (2001). Speculations on the neural basis of islands of blindsight. Progress in Brain Research, 134, 353–366.
Gazzaniga, M. S., LeDoux, J., & Wilson, D. (1977). Language, praxis, and the right hemisphere: Clues to some mechanisms of consciousness. Neurology, 27, 1144–1147. Gazzaniga, M. S., & Miller, M. B. (2009). The left hemisphere does not miss the right hemisphere. In S. Laureys & G. Tononi (Eds). The Neurology of Consciousness (pp. 261–70). London: Elsevier. Giacino, J., Ashwal, S., Childs, N., Cranford, R., Jennett, B., Katz, D., et al. (2002). The minimally conscious state: Definition and diagnostic criteria. Neurology, 58, 349–353. Gloning, I., Gloning, K., & Hoff, H. (1968). Neuropsychological symptoms and syndromes in lesions of the occipital lobe and adjacent areas. Paris: Gauthier-Villars. Goldberg, E. (1990). Higher cortical functions in humans: The gradiental approach. In E. Goldberg (Ed.), Contemporary neuropsychology and the legacy of Luria (pp. 229–276). Hillsdale, NJ: Erlbaum. Goodale, M. A. (1990). Brain asymmetries in the control of reaching. In M. A. Goodale (Ed.) Vision and action: The control of grasping (pp.14–32). Norwood NJ: Ablex. Hassabis, D., Kumaran, D., Vann, S., & Maguire, E. (2007). Patients with hippocampal amnesia cannot imagine new experiences. Proceedings of the National Academy of Sciences, USA, 104, 1726–1731. Hassabis, D., & Maguire, E. (2007). Deconstructing episodic memory with construction. Trends in Cognitive Sciences, 11, 299–306. Hochstein, S., & Ahissar, M. (2002). View from the top: Hierarchies and reverse hierarchies in the visual system. Neuron, 36, 791–804. Hoshi, E., & Tanji, J. (2004). Differential roles of neuronal activity in the supplementary and presupplementary motor areas: From information retrieval to motor planning and execution. Journal of Neurophysiology, 92, 3482–3499. Humphrey, N., & Weiskrantz, L. (1967). Vision in monkeys after removal of the striate cortex. Nature, 215, 595–597. Jiang, Y., & He, S. (2006). Cortical responses to invisible faces: Dissociating subsystems for facial-information processing. 
Current Biology, 16, 2023–2029. Jones, E. (1998a). A new view of specific and nonspecific thalamocortical connections. Advances in Neurology, 77, 49–71; discussion 72–43.
Farah, M. (1992). Agnosia. Current Opinion in Neurobiology, 2, 162–164.
Jones, E. (1998b). Viewpoint: The core and matrix of thalamic organization. Neuroscience, 85, 331–345.
Farrer, C., Frey, S., Van Horn, J., Tunik, E., Turk, D., Inati, S., et al. (2007). The angular gyrus computes action awareness representations. Cerebral Cortex, 18, 254–261.
Jones, E. (2002a). Thalamic circuitry and thalamocortical synchrony. Philosophical Transactions of the Royal Society of London, B, 357, 1659–1673.
Fendrich, R., Wessinger, C., & Gazzaniga, M. (1992). Residual vision in a scotoma: Implications for blindsight. Science, 258, 1489–1491.
Jones, E. (2002b). Thalamic organization and function after cajal. Progress in Brain Research, 136, 333–357.
8/17/09 2:24:15 PM
502
Consciousness
Kanwisher, N. (2001). Neural events and perceptual awareness. Cognition, 79(1–2), 89–113.
Logothetis, N. (1998). Single units and conscious vision. Philosophical Transactions of the Royal Society of London, B, 353, 1801–1818.
Kanwisher, N., McDermott, J., & Chun, M. (1997). The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience, 17, 4302–4311.
Logothetis, N., Leopold, D., & Sheinberg, D. (1996). What is rivalling during binocular rivalry? Nature, 380, 621–624.
Karnath, H., Baier, B., & Nägele, T. (2005). Awareness of the functioning of one’s own limbs mediated by the insular cortex? Journal of Neuroscience, 25, 7134–7138.
Logothetis, N., & Sheinberg, D. (1996). Visual object recognition. Annual Review of Neuroscience, 19, 577–621. Luria, A. (1959). Disorders of “simultaneous perception” in a case of bilateral occipito-parietal brain injury. Brain, 82, 437–449.
Kilner, J., Vargas, C., Duval, S., Blakemore, S., & Sirigu, A. (2004). Motor activation prior to observation of a predicted movement. Journal of Neuroscience, 7, 1299–1301.
MacLean, P. (1949). Psychosomatic disease and the ‘visceral brain’: Recent developments bearing on the Papez theory of emotion. Psychosomatic Medicine, 11, 338–353.
Kimura, D. (1982). Left-hemisphere control of oral and brachial movements and their relation to communication. Philosophical Transactions of the Royal Society of London, B, 298, 135–149.
Maguire, E. (2001). Neuroimaging studies of autobiographical event memory. Philosophical Transactions of the Royal Society of London, B, 356, 1441–1451.
Kitchener, E., Hodges, J., & McCarthy, R. (1998). Acquisition of post-morbid vocabulary and semantic facts in the absence of episodic memory. Brain, 121, 1313–1327.
Marois, R., & Ivanoff, J. (2005). Capacity limits of information processing in the brain. Trends in Cognitive Sciences, 9, 296–305.
Klein, S., & Loftus, J. (2002). Memory and temporal experience: The effects of episodic memory loss on an amnesic patient’s ability to remember the past and imagine the future. Social Cognition, 20, 353–379. Koch, C. (2004). The quest for consciousness: A neurobiological approach. Denver, CO: Roberts and Co. Koch, C., & Tsuchiya, N. (2007). Attention and consciousness: Two distinct brain processes. Trends in Cognitive Sciences, 11, 16–22. Lane, R., Reiman, E., Axelrod, B., Yun, L., Holmes, A., & Schwartz, G. (1998). Neural correlates of levels of emotional awareness: Evidence of an interaction between emotion and attention in the anterior cingulate cortex. Journal of Cognitive Neuroscience, 10, 525–535.
Marois, R., Yi, D., & Chun, M. (2004). The neural fate of consciously perceived and missed events in the attentional blink. Neuron, 41, 465–472. Mele, A. R. (2006). Free will and luck. Oxford, New York: Oxford University Press. Mendez, M., Chen, A., Shapira, J., & Miller, B. (2005). Acquired sociopathy and frontotemporal dementia. Dementia and Geriatric Cognitive Disorders, 20(2–3), 99–104. Mills, C. (1912a). The cerebral mechanisms of emotional expression. Transactions of the College of Physicians of Philadelphia, 34, 381–390.
Lau, H., & Passingham, R. (2006). Relative blindsight in normal observers and the neural correlate of visual consciousness. Proceedings of the National Academy of Sciences, USA, 103, 18763–18768.
Mills, C. (1912b). The cortical representation of emotion, with a discussion of some points in the general nervous mechanism of expression in its relation to organic nervous mental disease. Proceedings of the American Medico-Psychological Association, 19, 297–300.
Lau, H., Rogers, R., Haggard, P., & Passingham, R. (2004). Attention to intention. Science, 303, 1208–1210.
Milner, A. D., & Goodale, M. A. (1995). The visual brain in action. Oxford: Oxford University Press.
Laureys, S., Faymonville, M., Luxen, A., Lamy, M., Franck, G., & Maquet, P. (2000). Restoration of thalamocortical connectivity after recovery from persistent vegetative state. Lancet, 355, 1790–1791.
Morris, J., Ohman, A., & Dolan, R. (1998). Conscious and unconscious emotional learning in the human amygdala. Nature, 393, 467–470.
Laureys, S., Perrin, F., & Brédart, S. (2007). Self-consciousness in noncommunicative patients. Consciousness and Cognition, 16, 722–741; discussion 742–725. Leiguarda, R., & Marsden, C. (2000). Limb apraxias: Higher-order disorders of sensorimotor integration. Brain, 123, 860–879. Lenneberg, E. (1967). Biological foundations of language. London: Wiley. Lennie, P. (2001). Color coding in the cortex. In K. Gegenfurtner & L. Sharpe (Eds.), Color vision: From genes to perception (pp. 235–248). New York: Cambridge University Press.
Nachev, P., Rees, G., Parton, A., Kennard, C., & Husain, M. (2005). Volition and conflict in human medial frontal cortex. Current Biology, 15, 122–128. Nagel, T. (1974). What is it like to be a bat? Philosophical Review, 83, 435–450. Newsome, W., Britten, K., Salzman, C., & Movshon, J. (1990). Neuronal mechanisms of motion perception. Cold Spring Harbor Symposia on Quantitative Biology, 55, 697–705. Olausson, H., Lamarre, Y., Backlund, H., Morin, C., Wallin, B., Starck, G., et al. (2002). Unmyelinated tactile afferents signal touch and project to insular cortex. Journal of Neuroscience, 5, 900–904.
Leopold, D., & Logothetis, N. (1996). Activity changes in early visual cortex reflect monkeys’ percepts during binocular rivalry. Nature, 379, 549–553.
Osuji, I., & Cullum, C. (2005). Cognition in bipolar disorder. Psychiatric Clinics of North America, 28, 427–441.
Li, F., VanRullen, R., Koch, C., & Perona, P. (2002). Rapid natural scene categorization in the near absence of attention. Proceedings of the National Academy of Sciences, USA, 99, 9596–9601.
Owen, A., & Coleman, M. (2007). Functional MRI in disorders of consciousness: Advantages and limitations. Current Opinion in Neurology, 20, 632–637.
Libet, B. (2004). Mind time: The temporal factor in consciousness. Cambridge, MA: Harvard University Press.
Owen, A., Coleman, M., Boly, M., Davis, M., Laureys, S., & Pickard, J. (2006). Detecting awareness in the vegetative state. Science, 313, 1402.
Libet, B., Gleason, C., Wright, E., & Pearl, D. (1983). Time of conscious intention to act in relation to onset of cerebral activity (readinesspotential). The unconscious initiation of a freely voluntary act. Brain, 106, 623–642. Liotti, M., Brannan, S., Egan, G., Shade, R., Madden, L., Abplanalp, B., et al. (2001). Brain responses associated with consciousness of breathlessness (air hunger). Proceedings of the National Academy of Sciences, USA, 98, 2035–2040.
c25.indd 502
Papez, J. (1937). A proposed mechanism of emotion. Archives of Neurology and Psychiatry, 38, 725–743. Parvizi, J., & Damasio, A. (2001). Consciousness and the brainstem. Cognition, 79(1–2), 135–160. Perenin, M., & Vighetto, A. (1988). Optic ataxia: A specific disruption in visuomotor mechanisms: Pt. I. Different aspects of the deficit in reaching for objects. Brain, 111, 643–674.
8/17/09 2:24:16 PM
References 503 Posner, M. (1994). Attention: The mechanisms of consciousness. Proceedings of the National Academy of Sciences, USA, 91, 7398–7403.
Singer, W. (2001). Consciousness and the binding problem. Annals of the New York Academy of Sciences, 929, 123–146.
Posner, M., & Dehaene, S. (1994). Attentional networks. Trends in Neurosciences, 17, 75–79.
Sirigu, A., Daprati, E., Ciancia, S., Giraux, P., Nighoghossian, N., Posada, A., et al. (2004). Altered awareness of voluntary action after damage to the parietal cortex. Journal of Neuroscience, 7, 80–84.
Ramachandran, V. (1998). Consciousness and body image: Lessons from phantom limbs, Capgras syndrome and pain asymbolia. Philosophical Transactions of the Royal Society of London, B, 353, 1851–1859. Rauch, S., Shin, L., & Phelps, E. (2006). Neurocircuitry models of posttraumatic stress disorder and extinction: Human neuroimaging research: Past, present, and future. Biological Psychiatry, 60, 376–382. Raymond, J., Shapiro, K., & Arnell, K. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18, 849–860. Rees, G., Kreiman, G., & Koch, C. (2002). Neural correlates of consciousness in humans. Nature Reviews: Neuroscience, 3, 261–270. Rizzo, M., & Robin, D. (1990). Simultagnosia: A defect of sustained attention yields insights on visual information processing. Neurology, 40, 447–455.
Tononi, G. (2004). An information integration theory of consciousness. BMC Neuroscience, 5, 42. Tononi, G., & Edelman, G. (1998). Consciousness and complexity. Science, 282, 1846–1851. Tsakiris, M., Hesse, M., Boy, C., Haggard, P., & Fink, G. (2007). Neural signatures of body ownership: A sensory network for bodily self-consciousness. Cerebral Cortex, 17, 2235–2244. Tulving, E. (1983). Elements of episodic memory. Oxford: Oxford University Press. Tulving, E. (1985). Memory and consciousness. Canadian Psychology, 26, 1–12.
Roser, M., & Gazzaniga, M. (2004). Automatic brains- interpretive minds. Current Directions in Psychological Science, 13, 56–59.
Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1–25.
Rourke, B., Ahmad, S., Collins, D., Hayman-Abello, B., Hayman-Abello, S., & Warriner, E. (2002). Child clinical/pediatric neuropsychology: Some recent advances. Annual Review of Psychology, 53, 309–339.
Tulving, E., Kapur, S., Markowitsch, H., Craik, F., Habib, R., & Houle, S. (1994). Neuroanatomical correlates of retrieval in episodic memory: Auditory sentence recognition. Proceedings of the National Academy of Sciences, USA, 91, 2012–2015.
Rudebeck, P., Buckley, M., Walton, M., & Rushworth, M. (2006). A role for the macaque anterior cingulate gyrus in social valuation. Science, 313, 1310–1312. Saper, C. (2000). Brain stem modulation of sensation, movement, and consciousness. In E. Kandel, J. Schwartz, & T. Jessell (Eds.), Principles of neural science (4th ed., pp. 889–909). New York: McGraw-Hill. Schacter, D., Addis, D., & Buckner, R. (2007). Remembering the past to imagine the future: The prospective brain. Nature Reviews: Neuroscience, 8, 657–661. Schacter, S., & Singer, J. (1962). Cognitive, social, and physiological determinants of emotional state. Psychological Review, 69, 379–399. Schiff, N., Giacino, J., Kalmar, K., Victor, J., Baker, K., Gerber, M., et al. (2007). Behavioural improvements with thalamic stimulation after severe traumatic brain injury. Nature, 448, 600–603.
Ullman, M. (2001). A neurocognitive perspective on language: The declarative/procedural model. Nature Reviews: Neuroscience, 2, 717–726. Ungerleider, L., & Mishkin, M. (1982). Two cortical visual systems. In D. Ingle, M. Goodale, & R. Mansfield (Eds.), Analysis of visual behavior (pp. 549–586). Cambridge, MA: MIT Press. Van der Werf, Y., Witter, M., & Groenewegen, H. (2002). The intralaminar and midline nuclei of the thalamus: Anatomical and functional evidence for participation in processes of arousal and awareness. Brain Research: Brain Research Reviews, 39(2–3), 107–140. Volpe, B., Ledoux, J., & Gazzaniga, M. (1979). Information processing of visual stimuli in an “extinguished” field. Nature, 282, 722–724.
Schiff, N., & Plum, F. (2000). The role of arousal and “gating” systems in the neurology of impaired consciousness. Journal of Clinical Neurophysiology, 17, 438–452.
Vuilleumier, P., Armony, J., Clarke, K., Husain, M., Driver, J., & Dolan, R. (2002). Neural response to emotional faces with and without awareness: Event-related fMRI in a parietal patient with visual extinction and spatial neglect. Neuropsychologia, 40, 2156–2166.
Scoville, W., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery, and Psychiatry, 20, 11–21.
Vuilleumier, P., Armony, J., Driver, J., & Dolan, R. (2001). Effects of attention and emotion on face processing in the human brain: An eventrelated fMRI study. Neuron, 30, 829–841.
Searle, J. (2000). Consciousness. Annual Review of Neuroscience, 23, 557–578.
Warrington, E., & James, M. (1967). An experimental investigation of facial recognition in patients with unilateral cerebral lesions. Cortex, 3, 317–326.
Sergent, C., Baillet, S., & Dehaene, S. (2005). Timing of the brain events underlying access to consciousness during the attentional blink. Journal of Neuroscience, 8, 1391–1400.
c25.indd 503
Squire, L., & Zola, S. (1996). Structure and function of declarative and nondeclarative memory systems. Proceedings of the National Academy of Sciences, USA, 93, 13515–13522.
Weintraub, S. (2000). Neuropsychological Assessment of Mental State. In M-M. Mesulam (Ed.) Principles of Behavioral and Cognitive Neurology (pp. 135–136). Oxford: Oxford University Press.
Sheinberg, D., & Logothetis, N. (1997). The role of temporal cortical areas in perceptual organization. Proceedings of the National Academy of Sciences, USA, 94, 3408–3413.
Weiskrantz, L. (1996). Blindsight revisited. Current Opinion in Neurobiology, 6, 215–220.
Singer, T., Seymour, B., O’Doherty, J., Kaube, H., Dolan, R. J., & Frith, C. D. (2004). Empathy for pain involves the affective but not sensory components of pain. Science, 303, 1157–62.
Weiskrantz, L., Warrington, E., Sanders, M., & Marshall, J. (1974). Visual capacity in the hemianopic field following a restricted occipital ablation. Brain, 97, 709–728.
Singer, T., Seymour, B., O’Doherty, J., Stephan, K., Dolan, R., & Frith, C. (2006). Empathic neural responses are modulated by the perceived fairness of others. Nature, 439, 466–469.
Wheeler, M., Stuss, D., & Tulving, E. (1997). Toward a theory of episodic memory: The frontal lobes and autonoetic consciousness. Psychological Bulletin, 121, 331–354.
8/17/09 2:24:16 PM
c25.indd 504
8/17/09 2:24:17 PM
Chapter 26
Neuronal Basis of Learning
JOSEPH E. STEINMETZ AND DERICK H. LINDQUIST
For well over 100 years, experimental psychologists and brain scientists have explored the neuronal basis of learning, that is, the relationship between changes in behavior or cognition and changes in activity in the nervous system that produce or are affected by the behavioral or cognitive change. The early work of Sherrington, Pavlov, Lashley, Hebb, and other giants of neuroscience on learning, memory, and behavioral plasticity provided the groundwork for the past 30 years of discoveries (e.g., Hebb, 1949; Lashley, 1930; Pavlov, 1927; Sherrington, 1906). The field has come a long way in describing and defining the neuronal correlates of learning. The research has been at virtually all levels of analysis including descriptions of nervous system structures and interacting systems involved in learning and behavior change (Thompson, 1976; Thompson & Spencer, 1966), analyses of the activity of individual neurons in areas known to be involved in encoding learning (Goldman-Rakic, 1995; Olds, Disterhoft, Segal, Kornblith, & Hirsh, 1972), description of events at neuronal receptors that promote short-term and long-term learning-related neuronal change (Morris, Anderson, Lynch, & Baudry, 1986; Wise, 2004), and explorations of cellular, molecular, or genetic processes that are related to behavioral plasticity (Abel & Lattal, 2001; Tsien, Huerta, & Tonegawa, 1996). A very productive approach to the study of the neuronal basis of learning and memory has been the development and use of simple behavioral and neural model systems, which have allowed detailed analysis of neuronal activity associated with learning. Examples of this approach include the vertebrate models developed by Thompson and associates (e.g., Patterson, Cegavske, & Thompson, 1973; Thompson, 1976), the invertebrate models developed by Kandel and colleagues (e.g., Carew & Sahley, 1986; Hawkins, Kandel, & Bailey, 2006), and brain slice approaches used by many researchers (e.g., Alger & Teyler, 1976; Schreurs, Oh, & Alkon, 1996).
In this chapter, we provide a summary of our current understanding of the neuronal basis of learning as revealed through the use of model systems. We feature two model systems that have been used extensively over the past 25 years: eyeblink classical conditioning and fear conditioning. Both are Pavlovian associative learning procedures that involve the pairing of discrete stimuli in a relatively short temporal window. Both procedures appear to be conserved across mammalian species, behaviorally and neurally, meaning that acquisition and performance of the learned response are similar across species as are the neural structures and systems involved in the acquisition and performance of the responses. The use of both procedures has produced a wealth of data about the neuronal basis of learning. We also summarize data gathered through the use of other associative learning procedures, including instrumental aversive and appetitive conditioning paradigms.
EYEBLINK CLASSICAL CONDITIONING
Behavioral Paradigm
In a typical eyeblink classical conditioning experiment, a neutral stimulus, such as a tone or light, is presented 150 ms to 1,500 ms before an aversive stimulus, which is usually a periorbital eye shock or an air puff delivered to the cornea of the eye. The neutral stimulus is called the conditional stimulus (CS). Initially the presentation of the CS produces no visible response or, at most, a slight orienting response that declines with repeated presentation. The second stimulus is called the unconditional stimulus (US) and it produces a vigorous eyeblink when presented. The reflexive US-elicited eyeblink is called the unconditioned response (UR). With several pairings of the CS and US, an eyeblink response to the CS can be observed and it is called the conditioned response (CR). With enough training, the eyeblink CR becomes well-timed such that the peak of the response occurs at the time of the US onset. If the interval of time between the CS and US (called the interstimulus interval, ISI) is changed, with additional training the response topography will change such that the peak of the response moves to the new time when US onset occurs.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
[Figure 26.1 panels: CS and US timing traces, labeled “Delay conditioning” and “Trace conditioning”]
Figure 26.1 Schematic drawing of the delay eyeblink conditioning and trace eyeblink conditioning behavioral procedures. Note: During delay conditioning, the CS is presented (indicated by upward deflection in the drawing) and is overlapped in time, either fully or partially, by presentation of the US. During trace conditioning, the CS is presented, a length of time elapses (the trace interval), and then the US is presented. The CS and the US do not overlap during trace conditioning, and the procedure is commonly considered to be more complex because a memory trace of the CS must be created for learning to occur.
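The timing relationships that distinguish the two procedures can be made concrete with a small sketch. The code below is illustrative only and is not drawn from any of the cited studies: the `Trial` layout, the function names, and the example onset and offset times are all hypothetical, and assigning the boundary case (CS offset exactly at US onset) to the delay procedure is an assumption made here for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """Stimulus timing for one paired trial, in ms from trial start (hypothetical layout)."""
    cs_onset: float
    cs_offset: float
    us_onset: float
    us_offset: float


def interstimulus_interval(trial: Trial) -> float:
    """ISI: time from CS onset to US onset."""
    return trial.us_onset - trial.cs_onset


def trace_interval(trial: Trial) -> float:
    """Stimulus-free gap between CS offset and US onset (0 when there is no gap)."""
    return max(0.0, trial.us_onset - trial.cs_offset)


def procedure(trial: Trial) -> str:
    """Return 'delay' if the CS is still on at US onset, 'trace' if it has already ended."""
    if trial.us_onset <= trial.cs_onset:
        raise ValueError("the CS must begin before the US in a paired trial")
    # Assumption: a CS that ends exactly at US onset is counted as 'delay'.
    return "delay" if trial.cs_offset >= trial.us_onset else "trace"


# Illustrative values only; they are not taken from any particular experiment.
delay_trial = Trial(cs_onset=0, cs_offset=350, us_onset=250, us_offset=350)
trace_trial = Trial(cs_onset=0, cs_offset=250, us_onset=750, us_offset=850)

for name, t in (("delay example", delay_trial), ("trace example", trace_trial)):
    print(name, procedure(t), interstimulus_interval(t), trace_interval(t))
```

In this sketch the ISI is measured from CS onset to US onset in both procedures, so a trace trial can have a long ISI even though the CS itself is brief; the trace interval is the part of the ISI during which no stimulus is present.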
Most eyeblink conditioning studies have involved either delay or trace conditioning procedures (see Figure 26.1). In the delay conditioning procedure, the CS overlaps with presentation of the US. In the trace conditioning procedure, the CS is presented and a gap of time is allowed before the US is presented. Because the two stimuli do not overlap in time, the procedure adds a “memory” component. Trace conditioning is often used as a more complex variation of the Pavlovian conditioning procedure. The great majority of eyeblink conditioning experiments have involved rabbits, animals that adapt to restraint easily and have proven to be ideal subjects for studies involving neural recording, stimulation, or pharmacology (see Romano & Patterson, 1987, for review). More recently, however, rats, mice, and humans have been increasingly used in eyeblink conditioning experiments, especially those designed to explore developmental, genetic, or clinical/cognitive questions (see Lavond & Steinmetz, 2003, for a comprehensive review of eyeblink classical conditioning).
Early Studies of the Neuronal Bases of Eyeblink Conditioning
For the first three-quarters of the past century, most of the efforts to understand the neural substrates of learning and memory concentrated on the involvement of cerebral cortex and associated higher structures in the brain. In part, this was because processes such as attention, perception, decision making, and learning and memory were thought to be higher functions than the vegetative and reflex functions typically associated with brain stem and lower structures. This is exemplified by the work of Lashley (1950), who spent many years systematically removing portions of the cerebral cortex in hopes of identifying the memory or “engram” for simple maze learning and memory. Given the historical view that higher brain functions were likely involved in learning and memory, it is not surprising that prior to 1980 most eyeblink conditioning experiments concentrated on the involvement of the forebrain. For example, Oakley and Steele-Russell published a series of studies that revealed that large lesions of cerebral neocortex did not abolish eyeblink CRs (Oakley & Russell, 1972, 1974, 1976, 1977). Mauk and Thompson (1987) used a decerebrate preparation to show that eyeblink conditioning was not critically dependent on neocortical function: when decerebrations followed acquisition training, eyeblink CRs were not affected. Although these studies showed that neocortex was not necessary for the expression of eyeblink conditioning, there are other studies that suggest that neocortex may be engaged during the learning of this task. In an extensive series of studies involving the cat, Woody and his colleagues studied the involvement of cerebral cortex in an eyeblink conditioning task that involved pairing an auditory CS with a glabellar tap US (e.g., Woody & Black-Cleworth, 1973; Woody & Engel, 1972). Using intracellular and extracellular recording methods, these researchers showed that learning-related neuronal spiking patterns could be observed in motor neocortical areas and that persistent, learning-related changes in neuronal excitability could be recorded (Woody, 1986). Although these data demonstrate that neurons in the cat motor neocortex can encode the conditioning process, extensive lesions of the rabbit motor cortex failed to affect conditioning (Ivkovich & Thompson, 1997). The studies that involved complete removal of neocortex argue strongly that neocortical neurons are not essential for the learning.
Many studies have been conducted over the years to assess the involvement of the hippocampus in eyeblink conditioning. The major historic reason for these studies is straightforward: the hippocampus and related limbic structures have been implicated in a wide range of learning, memory, and plasticity functions that range from anterograde human amnesia to spatial learning to basic associative learning (see Squire, 1992, for review). In the mid-1970s, Richard Thompson and his colleagues began to use the rabbit eyeblink conditioning preparation to study the involvement of hippocampal neurons in associative learning. They recorded the activity of single hippocampal pyramidal cells and reported that before behavioral CRs could be seen, the hippocampal units began to show a spiking pattern that was closely related to the topography of the behavioral response. That is, their summed spiking formed an amplitude-time course model of the CR (Berger, Alger, & Thompson, 1976; Berger, Rinaldi, Weisz, & Thompson, 1983; Berger & Thompson, 1978a). The pattern was not seen during unpaired presentations of the CS and US and disappeared with extinction. Other limbic system structures were also studied including the medial septum, subiculum, and retrosplenial cortex. Berger and Thompson (1978b) showed that medial septal neurons were vigorously activated by presentation of the CS and/or US but that this activity declined with overtraining. Because medial septal neurons provide the lion's share of cholinergic input to the hippocampus, many studies have used pharmacological manipulation, medial septal lesions, or stimulation to examine the role of the cholinergic input in eyeblink conditioning. Solomon and associates (Solomon, Solomon, Van der Schaaf, & Perry, 1983) systemically administered the anticholinergic drug scopolamine and showed that it disrupted eyeblink conditioning and associated learning-related activity. While the recording and pharmacological studies provided strong support that neurons in the hippocampus were recruited during eyeblink conditioning, other data argue that, at least for delay conditioning, the hippocampus is not required for eyeblink conditioning. Schmaltz and Theios (1972) showed that lesions of the hippocampus had little or no effect on delay eyeblink conditioning. What then is the role of the hippocampus in conditioning? It has been suggested that neurons in the hippocampus only transiently encode the learning (Sears & Steinmetz, 1990) and that hippocampal activation may be related to aspects of training such as shifts in the CS, encoding of training stimuli or context, or perhaps awareness of contingency (e.g., Clark & Squire, 1998; Miller & Steinmetz, 1997; Schmajuk & DiCarlo, 1992; Solomon & Moore, 1975).
A long line of research has demonstrated that the hippocampus is involved in trace conditioning procedures, where CS offset occurs before US onset, therefore requiring the subject to form a memory trace of the CS. Lesions of the hippocampus have been shown to significantly impair or abolish trace conditioned responding without affecting delay conditioning (Moyer, Deyo, & Disterhoft, 1990; Port, Romano, Steinmetz, Mikhail, & Patterson, 1986; Solomon, Van der Schaaf, Thompson, & Weisz, 1986). Trace conditioning is known to significantly increase hippocampal activity (Disterhoft & McEchron, 2000) and also to produce cellular-level changes in pyramidal neurons, such as a long-term reduction in calcium-dependent after-hyperpolarization potentials (Coulter et al., 1989; Disterhoft, Golden, Read, Coulter, & Alkon, 1988). Together these data suggest that the hippocampus may be essentially engaged during trace conditioning to somehow encode the trace period. If this is the case, what has yet to be resolved is how hippocampal
activity influences brain stem and cerebellar areas known to be critical for both delay and trace learning (see the discussion that follows). Some recent experiments suggest that hippocampal connections to regions of the prefrontal cortex may be involved in the processing of this important information. The medial prefrontal cortex has been implicated in executive function and other higher cognitive processes including learning and memory (e.g., Asaad, Rainer, & Miller, 1998; Browning, Easton, Buckley, & Gaffan, 2005; Goldman-Rakic, 1995). Although a number of studies have shown little or no effect of prefrontal lesions on simple delay learning (e.g., McLaughlin, Skaggs, Churchwell, & Powell, 2002; Powell, Churchwell, & Burriss, 2005), other studies have shown that prefrontal cortex lesions impair trace conditioned responding (Kronforst-Collins & Disterhoft, 1998; McLaughlin et al., 2002; Powell et al., 2005; Weible, McEchron, & Disterhoft, 2000). Powell and his colleagues used trace conditioning procedures to train rabbits to a conditioning criterion (Simon, Knuckley, Churchwell, & Powell, 2005). They then lesioned the prelimbic area of the medial prefrontal cortex immediately, 24 hours, 1 week, 2 weeks, or 1 month after training. They saw temporary deficits when training was resumed after the lesions. Lesions given before training had no effect, thus indicating a possible posttraining lesion effect on a retrieval process. The results did not support a role for the medial prefrontal cortex in the acquisition process or as a storage site for memory.
The authors speculated that perhaps hippocampal input to extrastriatal structures is necessary for the persistence of the memory trace for the CS during the trace period (hypothesized to take place through a brain circuit involving the subiculum, prefrontal cortex, and neostriatum) and also that the hippocampus to medial prefrontal cortex connections may critically inform essential eyeblink conditioning circuitry located in noncortical structures during the trace procedure.

Later Studies of the Neuronal Basis of Eyeblink Conditioning: The Cerebellum and Brain Stem

The data collected in the 1970s clearly indicated that an area of the brain below the level of the forebrain was important for the acquisition and performance of the classically conditioned eyeblink response. These data suggested that neurons in the forebrain were engaged during the conditioning process but were not necessary for conditioning to be expressed. These findings raised two possibilities concerning the nature of the neuronal system involved in this simple form of associative learning. First, it was possible that there were one or more populations of neurons in lower brain areas that were essential for acquisition and
performance of eyeblink conditioning. Alternatively, it was possible that parallel conditioning pathways existed in higher and lower brain systems for eyeblink conditioning and only one of these was necessary for conditioning to occur. It was known in the early 1970s that a portion of the essential circuitry for eyeblink conditioning resided in lower areas of the brain. A number of experiments showed that motoneurons involved in generating the CR and UR were contained in the abducens and accessory abducens nuclei as well as the facial nucleus (Cegavske, Thompson, Patterson, & Gormezano, 1976). Neuronal recordings taken from these nuclei revealed activations when a CR or a UR was executed (Cegavske, Patterson, & Thompson, 1979), and lesions of these nuclei abolished the portions of the CR and UR that were driven by the nuclei that were removed (Disterhoft, Quinn, Weiss, & Shipley, 1985; Steinmetz, Lavond, Ivkovich, Logan, & Thompson, 1992). Armed with strong evidence suggesting that a region or regions of the lower brain was part of the necessary system involved in classical eyeblink conditioning, Richard Thompson and his colleagues began, around 1980, a systematic study of potential brain stem areas that might be involved in conditioning. A variety of techniques were used including brain lesion, multiple- and single-unit recording, and pharmacological activation and inactivation methods. These studies and 25 years of subsequent work have established critical roles for the cerebellum and discrete regions of the brain stem in eyeblink conditioning. The earliest lesion studies involved large aspiration lesions of the cerebellum that encompassed both cerebellar cortex and deep cerebellar nuclei. These large lesions were found to abolish previously established CRs and also prevent the formation of new CRs if training was given after the lesion (Lincoln, McCormick, & Thompson, 1982; McCormick et al., 1981).
The lesion effect was complete and affected CRs only on the eye ipsilateral to the lesion. This was an important finding. Since CRs could be established on the contralateral side, nonspecific lesion effects, such as loss of motivation for learning, could be ruled out. Also, URs were largely not affected by the lesions, thus indicating that the observed loss of CRs was not due to a generalized performance deficit. Subsequent lesion studies more precisely defined the location of the populations of neurons involved in conditioning. Small electrolytic lesions confined to the dorsolateral region of the interpositus nucleus produced a complete abolition of CRs (McCormick & Thompson, 1984a; Steinmetz et al., 1992). Kainic acid lesions that destroyed as little as a cubic millimeter of the dorsolateral interpositus nucleus were found effective in abolishing CRs (Lavond, Hembree, & Thompson, 1985). And the effect was demonstrated to be permanent: CRs do not reemerge with as
much as 12 months of daily postlesion, paired, CS-US training (Steinmetz, Logue, & Steinmetz, 1992). Studies using the GABA agonist muscimol to temporarily inactivate the dorsolateral interpositus nucleus have perhaps produced the most compelling case for the necessary involvement of this brain area in eyeblink conditioning. Krupa, Thompson, and Thompson (1993) infused muscimol into this region of the interpositus nucleus during several days of acquisition training. As expected, no CRs were seen during those initial training days. The most important result came during the next several days of training with saline infusion: Rabbits showed acquisition of CRs at a rate that was identical to animals trained in the initial phase with saline injections. It was as if no training trials were delivered during the muscimol infusion stage (i.e., in learning terms, no savings of training were seen as would be expected if critical plasticity processes occurred during the muscimol infusion). Other infusion studies have demonstrated the critical nature of interpositus nucleus involvement. For example, infusions of the NMDA antagonist AP5 into the interpositus nucleus severely retarded conditioning (G. Chen & Steinmetz, 2000a) as did infusions of the general purpose protein kinase inhibitor, H7 (G. Chen & Steinmetz, 2000b) or anisomycin, a general protein synthesis inhibitor (Bracha et al., 1998). In another study, Gomi and associates (1999) showed that interpositus infusions of the transcription inhibitor actinomycin D blocked learning of the CR and that training increased the expression of KKIAMRE, a cdc2-related kinase. Finally, there are data available that show that structural changes occur in the interpositus nucleus as a result of eyeblink classical conditioning.
Kleim and colleagues (2002) showed that rats given eyeblink conditioning had significantly more excitatory synapses per neuron in the interpositus nucleus relative to rats given either no training or explicitly unpaired presentations of the CS and the US. This study demonstrated that CR development was related to synaptogenesis in the interpositus nucleus—a solid demonstration of a neuronal structural change related to learning and memory formation. Together, these data provided very strong evidence that the interpositus nucleus of the cerebellum was essential for acquisition and performance of the classically conditioned eyeblink response and that parallel activation of higher brain systems was not necessary for basic delay eyeblink conditioning. It appears that the learning-related increase in activity in the interpositus nucleus engages neurons in the red nucleus, which in turn activates motor neurons responsible for the CR (Chapman, Steinmetz, Sears, & Thompson, 1990; Haley, Thompson, & Madden, 1988). Information about the CR is projected to higher brain areas, however, via ascending input that arises from the interpositus nucleus and possibly the red nucleus (e.g., Sears, Logue, & Steinmetz,
1996). In addition, descending influences from higher brain areas are likely at many different points along the essential conditioning pathways including precerebellar nuclei, the cerebellum, the red nucleus, and the motor nuclei. Currently, the involvement of higher brain areas in conditioning is being studied in a number of laboratories (e.g., Simon et al., 2005; Weiss, Weible, Galvez, & Disterhoft, 2006). Given the large size of cerebellar cortex and the enormous number of neurons contained within it, it has been assumed over the years that cerebellar cortical regions also play an essential role in eyeblink conditioning. Results from cortical lesion experiments, however, have been mixed and not always supportive of an essential role for cerebellar cortex. For example, eyeblink conditioning (albeit at a reduced level of CR production) has been observed in pcd mice that have a complete loss of cerebellar Purkinje cells (L. Chen, Bao, Lockard, Kim, & Thompson, 1996). Also, Nolan and Freeman (2005) intraventricularly infused the immunotoxin OX7-Saporin, which selectively destroys Purkinje cells in the cerebellar cortex, after rats had acquired the conditioned eyeblink response. They showed that reacquisition of the CR was impaired but that the rats could show learned inhibitory responses when conditioned inhibition training was given after infusion. These data suggest differential roles for the cerebellar cortex in excitatory and inhibitory learning.
One region of the cerebellar cortex that has received much attention is Larsell’s Lobule HVI, which is known to project to the critical region of the interpositus nucleus and receive inputs from precerebellar areas thought to be involved in CS and US processing. Results of lobule HVI lesions have been somewhat variable: Some research groups have reported complete abolition of CRs after removal (Yeo & Hardiman, 1992; Yeo, Hardiman, & Glickstein, 1985). Others have reported little or no effect of the lesion (Woodruff-Pak, Lavond, Logan, Steinmetz, & Thompson, 1993) while others have reported that the lesions affect CR amplitude and CR acquisition rate (Lavond & Steinmetz, 1989). From these data, it has been difficult to establish a precise role for lobule HVI in eyeblink conditioning, although some data have suggested it may be involved in memory consolidation processes (Cooke, Attwell, & Yeo, 2004). It is possible that other cerebellar cortical regions play a more critical role in conditioning. In a series of papers, Mauk and colleagues have shown that lesions of the anterior lobe of the cerebellum consistently affect the timing and sometimes the execution of CRs (Mauk & Buonomano, 2004; Perrett, Ruiz, & Mauk, 1993). Additional evidence for the critical involvement of the cerebellum in eyeblink conditioning has come from neural recording experiments (see Figure 26.2). Multiple-unit and
Figure 26.2 Neural recording has been used to establish an important role for the cerebellum in eyeblink classical conditioning. Note: A: The performance of a CR recorded on a single trial in a rabbit (top trace) along with concomitantly recorded neural activity on that trial from the anterior lobe of the cerebellar cortex (bottom trace). The tone CS period is marked by the bar at the bottom. The insert shows a complex spike marked by the downward arrow, indicating that the action potentials shown are likely from Purkinje cells. Note the increase in Purkinje cell firing on this single trial when the tone CS was presented. B: Summarized data across a series of conditioning trials. The top panel shows individual behavioral responses recorded on several trials from a well-trained rabbit. The middle panel shows peristimulus time histograms created by summing the activity of the Purkinje cell shown in (A) across the training session. The bottom panel shows standardized scores of the unit activity. Note the increase in activity just after CS onset. From “Purkinje Cell Activity in the Cerebellar Anterior Lobe during Rabbit Eyeblink Conditioning,” by J. T. Green and J. E. Steinmetz, 2005, Learning and Memory, 12, 260–269. Adapted with permission.
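The peristimulus time histogram and standardized scores described in the Figure 26.2 caption can be illustrated with a short sketch. The Python below is illustrative only: the spike times, the 50-ms bin width, the 200-ms pre-CS baseline, and the function names psth and standard_scores are all assumptions made for this example, not values or procedures taken from Green and Steinmetz (2005). It sums spikes across trials into bins aligned to CS onset, then expresses each bin as a standard score relative to the pre-CS baseline bins.

```python
from statistics import mean, stdev

def psth(trials, t_start, t_stop, bin_ms):
    """Sum spikes across trials into fixed-width bins; return rates in spikes/sec.

    trials: list of trials, each a list of spike times in ms relative to CS onset.
    """
    n_bins = int((t_stop - t_start) // bin_ms)
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t_start <= t < t_stop:
                idx = int((t - t_start) // bin_ms)
                if idx < n_bins:  # guards windows not evenly divisible by bin_ms
                    counts[idx] += 1
    # Convert summed counts to a mean firing rate per bin (spikes/sec).
    scale = 1000.0 / (bin_ms * len(trials))
    return [c * scale for c in counts]

def standard_scores(rates, n_baseline_bins):
    """Express each bin as (rate - baseline mean) / baseline SD."""
    baseline = rates[:n_baseline_bins]
    mu = mean(baseline)
    sd = stdev(baseline)
    if sd == 0:
        raise ValueError("baseline has zero variance; standard scores undefined")
    return [(r - mu) / sd for r in rates]

# Hypothetical spike times (ms, CS onset at 0) for two trials of one unit.
example_trials = [
    [-180, -60, -10, 10, 30, 60, 80, 120, 310],
    [-130, -70, 5, 25, 45, 75, 140, 220],
]

rates = psth(example_trials, t_start=-200, t_stop=400, bin_ms=50)
z = standard_scores(rates, n_baseline_bins=4)  # first 4 bins precede CS onset
peak_bin = max(range(len(z)), key=lambda i: z[i])
print("rates (spikes/sec):", rates)
print("peak standard score in bin", peak_bin, "=", z[peak_bin])
```

With these made-up data the largest standard score falls in the first bin after CS onset, which is the kind of post-CS increase the caption points to. A real analysis would use many more trials and a longer baseline.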
single-unit recordings taken from the dorsolateral interpositus nucleus have revealed populations of neurons that discharge in a CR-related pattern (Berthier & Moore, 1990; Gould & Steinmetz, 1996; McCormick & Thompson, 1984b; Tracy, Britton, & Steinmetz, 2001). Some neurons discharge to the presentation of the CS or US, and after training neurons can be found that discharge in a pattern that forms an amplitude-time course model of the behavioral response. Importantly, the onset of the learning-related interpositus firing precedes the behavioral response onset by 30 to 60 msec, indicating that these cells are candidates for neurons that drive the behavioral response via the brain-stem motor nuclei. Single-unit recordings taken from lobule HVI Purkinje cells have revealed a variety of response patterns including CS-activated, US-activated, and CR-related spiking patterns (Berthier & Moore, 1986; Gould & Steinmetz, 1996; Katz & Steinmetz, 1997). Surprisingly, these recording studies have shown that about two-thirds of the Purkinje cells in lobule HVI respond with increases in spiking (excitation) during the CS-US interval while about one-third show decreases in spiking (inhibition). Because all Purkinje cells inhibit deep nuclear cells on which they synapse, relative increases in populations of excitatory Purkinje cells would have the net effect of inhibiting deep nuclear activity (the opposite of what one would expect if increases in interpositus nucleus activity are necessary to drive motor neurons responsible for CR expression). Indeed, cerebellar cortical long-term depression (LTD; Ekerot & Kano, 1985; Hemart, Daniel, Jaillard, & Crepel, 1994; Ito, 1989) has been hypothesized to be an important cellular mechanism involved in eyeblink conditioning (Thompson, 1986). LTD results in a decrease in Purkinje cell spiking.
Recent recordings from the anterior lobe of the cerebellum have revealed that this region may be a better candidate for critical involvement in the acquisition and expression of eyeblink CRs (Green & Steinmetz, 2005). In this study, an ISI discrimination procedure was used to study temporal firing patterns of anterior lobe Purkinje cells. In this behavioral procedure, high and low frequency tone CSs were presented, each followed by an air puff US after either a short (250 ms) or long (750 ms) ISI. In essence, the rabbits learned to execute CRs at two different ISIs with each ISI signaled by a different CS. After training, Purkinje cells were isolated and recordings were made as the two CSs were presented. Similar to lobule HVI recordings, CS-, US- and CR-related neurons were found. Some neurons responded selectively on either long- or short-ISI trials while other neurons responded equally to both CSs. Most importantly, the ratio of excitatory to inhibitory Purkinje cell firing patterns was reversed relative to lobule HVI recordings: about two-thirds of the neurons showed
decreases in spiking during the CS-US interval. Further, it appears that the excitatory population tended to fire early in the CS period while the inhibitory population tended to fire later in the CS period. This is precisely the within-trial firing pattern that would be expected if this region is somehow modulating the interpositus nucleus during CR expression on a given trial. That is, on each trial excitation of a Purkinje cell population occurs early in the trial period to inhibit interpositus nucleus activity while later in the trial inhibition of a Purkinje cell population occurs to promote interpositus nucleus activation. These findings suggest that the anterior lobe of the cerebellum is critical for performance of the eyeblink CR. A number of studies have been conducted to define the neural pathways by which the CS and the US are projected from the periphery to the critical regions of the cerebellum (see Figure 26.3). It appears that the CS is projected via auditory brain stem nuclei to neurons within the pontine nuclei that send mossy fiber projections to cerebellar cortex and the interpositus nucleus. Auditory, light, and tactile neuronal responses were recorded from discrete regions of the pontine nuclei, and lesions of these regions caused CS-modality-specific loss of eyeblink CRs (Steinmetz et al., 1987). Microstimulation of these regions can substitute for
Figure 26.3 This figure shows two coronal sections of a rabbit brain with critical CS and US pathways identified. Note: The CS is projected to the cerebellum from neurons in the lateral pontine nuclei (LPN: bottom section) which terminate in the cerebellar cortex (HVI: top section) and interpositus nucleus (INP: top section). The US is projected to the cerebellum from neurons that originate in the inferior olive (IO: top section) which also terminate in the HVI and INP. In addition to the HVI projections, the CS and US are also projected to regions of the anterior lobe of cerebellar cortex. Both HVI and the anterior lobe send Purkinje cell axons to the INP where they inhibit unit activity.
a peripheral CS and produce conditioned responding when paired with an air puff US (Steinmetz, 1990; Steinmetz, Rosen, Chapman, Lavond, & Thompson, 1986) and connections with critical regions of the interpositus nucleus and cerebellar cortex have been established in anatomical tract-tracing studies (e.g., Steinmetz & Sengelaub, 1992). In addition to the critical involvement of pontine neurons, the CS pathway may also involve neurons in the medial auditory thalamic nuclei (Campolattaro, Halverson, & Freeman, 2007; Halverson & Freeman, 2006). A similar series of experiments established the critical pathways for projecting the US to the cerebellum. For an air puff US, afferents in the cornea project input to the trigeminal nucleus, which, in turn, sends projections to at least two locations. First, the trigeminal nucleus activates neurons in the reticular formation that then send axons to the brain stem motor nuclei that are responsible for generating eyeblinks. Second, the trigeminal nucleus sends projections to the medial region of the dorsal accessory inferior olive. The first pathway is thought to be the reflexive UR pathway while the second pathway is the source of input that activates the cerebellum (see Steinmetz, 2000, for review). Inferior olivary neurons send climbing fiber axons to cerebellar cortex and these axons have collaterals that terminate in the deep cerebellar nuclei. Lesions of the dorsal accessory olive caused extinction or abolition of the eyeblink CR, thus suggesting that this region was involved in projecting the US to the cerebellum (McCormick, Steinmetz, & Thompson, 1985; Voneida, Christie, Bogdanski, & Chopko, 1990; Yeo, Hardiman, & Glickstein, 1986). Stimulation of the dorsal accessory olive can produce a variety of discrete responses, including eyeblinks, and these can be conditioned when preceded by either a peripheral CS or a pontine-stimulation CS (Mauk, Steinmetz, & Thompson, 1986; Steinmetz, Lavond, & Thompson, 1989).
Olivary recordings revealed neurons that respond to the presentation of the air puff US, although these responses diminish as CRs are formed (Sears & Steinmetz, 1991), presumably because of inhibitory feedback from the interpositus nucleus (Andersson, Garwicz, & Hesslow, 1988; Kim, Krupa, & Thompson, 1998). Although most of the experiments concerning the neuronal substrates of eyeblink conditioning have concentrated on the conditioning of somatic responses (e.g., the eyeblink), the paradigm has also been used to study the neuronal substrates of autonomic conditioning. Because eyeblink conditioning is an aversive learning procedure, learning-related changes in autonomic activity occur during the conditioning process. In rabbits, conditioned heart-rate changes have been studied most often. For example, Buchanan and Powell (1988) showed that when
connections between the mediodorsal nucleus of the thalamus and prefrontal cortex are severed or when ibotenic acid is used to destroy cells in the mediodorsal nucleus, the late-occurring tachycardiac component of the conditioned heart rate response is abolished. These data suggest that circuitry that includes the prefrontal cortex and thalamus is involved in the sympathetic control of the conditioned autonomic responding that occurs during eyeblink conditioning. Lesions of the interpositus nucleus do not affect conditioned heart rate responding (Lavond, Lincoln, McCormick, & Thompson, 1984). For a comprehensive review of the neuronal basis of heart rate conditioning during the eyeblink conditioning procedure, see Powell, McLaughlin, and Chachic (2000).

Models of Eyeblink Conditioning

Figure 26.4 depicts a schematic representation of the essential circuitry thought to be critical for eyeblink classical conditioning. Cerebellar models of eyeblink classical conditioning have generally posited that acquisition and performance of the CR is dependent on learning-related changes in neuronal firing in regions of the cerebellum where the CS and the US converge (e.g., Thompson, 1986). There is good evidence that this convergence occurs in at least three locations: the interpositus nucleus, lobule HVI, and the anterior lobe. The cerebellar conditioning models differ, however, in the relative contributions each region makes to the conditioning process. Yeo (2004) and his colleagues have emphasized the critical role played by lobule HVI in the conditioning process, including in the acquisition, performance, and between-session consolidation of the learned response. Mauk and associates have developed an elegant computational model, constrained by available data, which hypothesizes a role for the anterior lobe in the induction of learning-related activity in the interpositus nucleus and performance of the CR (Medina & Mauk, 2000; Ohyama, Nores, Medina, Riusech, & Mauk, 2006).
Their models also address critical CR timing features through the regulation of inferior olivary activity by the interpositus nucleus and also include roles for LTD and LTP in cortex. Steinmetz (2000) and colleagues have hypothesized that plasticity is independently established in the two cortical areas and in the interpositus nucleus. Cortical activity is thought to modulate the responsiveness of the deep nuclear cells in a manner that provides gain and critical timing components of the response. All models seem to agree that neurons in the interpositus nucleus are responsible for the CR generation—that is, increases in activity in neurons in the interpositus nucleus that eventually activate the motor neurons responsible for CR execution are the central neuronal representation of the behavioral CR.
Figure 26.4 Schematic drawing of the critical brain-stem and cerebellar circuitry involved in eyeblink classical conditioning. Note: Lines with arrowheads depict excitatory synapses while lines with solid circles depict inhibitory synapses. The CS is projected to the cerebellum via mossy fibers (mf) from the pontine nuclei while the US is projected to the cerebellum via climbing fibers (cf) from the inferior olive. Conditioning is thought to occur as a result of the convergence of the CS and US in the cerebellar cortex and interpositus nucleus. A, B, C, and D depict key structures in this basic circuitry that have been studied using a variety of techniques. A: Lesions of the pontine nuclei abolish CRs in a CS-modality-specific manner; stimulation here can substitute for a peripheral CS; neural recordings made here show CS-related activation patterns. B: Lesions of the IO eventually abolish conditioned responding; stimulation here produces discrete movements that can be conditioned; neural recordings made here show US-related activation patterns. C: Lesions of the interpositus nucleus permanently abolish CRs; stimulation here can produce discrete eyeblinks; recordings made here show patterns of activation that are closely linked with the execution of the CR. D: Dependent on size and location, lesions of cerebellar cortex have caused a variety of effects on CR performance that ranged from complete abolition to no effects; stimulation here at high intensities can produce eyeblinks and other discrete movements; recordings from Purkinje cells here show a variety of CS-, US- and CR-related patterns of activation (see text for details). Structures labeled in the figure: auditory nuclei, pontine nuclei, cerebellar cortex, interpositus nucleus, red nucleus, accessory abducens/facial nuclei, reticular formation, trigeminal nucleus, and inferior olive.

Neural Development and Eyeblink Classical Conditioning

Because both the cerebellum and hippocampus undergo substantial postnatal maturation in many mammals (Altman, 1972; Altman & Bayer, 1997), eyeblink classical conditioning provides a highly informative means for studying the relationship between the ontogeny of learning and neural development. Experimental analysis of the behavioral expression of learning and the maturation of the underlying neural circuitry has benefited from the comparison of two forms of eyeblink conditioning, delay and trace (Ivkovich, Paczkowski, & Stanton, 2000). Delay conditioning is first successfully acquired between postnatal day (PD) 17 and 24 in rats when a 280 ms ISI is used (Stanton, Freeman, & Skelton, 1992). Nonassociative, reflex blinking is present at an even earlier age, indicating that the reason associative learning does not emerge earlier than PD 17 is not simply the result of underdeveloped sensory or motor function (Andrews, Freeman, Carter, & Stanton, 1995; Freeman, Spencer, Skelton, & Stanton, 1993). Interestingly, whereas rat pups can acquire the CS-US association on PD 17, they often are not yet able to express what they have learned (Stanton, Fox, & Carter, 1998). When trained again on PD 20, however, rats that received the previous training demonstrated facilitated learning relative to naïve controls.
These data indicate that the maturation of the neural circuit responsible for associative learning precedes maturation of the neural mechanisms controlling its behavioral expression. Trace conditioning, which incorporates a temporal gap between the offset of the CS and onset of the US, is the simplest of the higher order forms of eyeblink conditioning. Whereas delay conditioning is mediated solely by the cerebellum and brain stem, trace conditioning also engages the hippocampus and other forebrain structures (Thompson, 2005; Woodruff-Pak & Steinmetz, 2000). Trace conditioning is first successfully observed between PD 19 and 40 in rats (Ivkovich et al., 2000), with hippocampal lesions impairing its expression (Ivkovich & Stanton, 2001). Intriguingly, delay conditioning does not emerge until the same postnatal period if the conditioning ISI is of the same long duration (880 ms) as that used in trace conditioning (Ivkovich et al., 2000). The results suggest that it is the interval of time between CS onset and US onset, and not the conditioning paradigm, that most affects when learning is first observed, possibly due to the difficulty of forming associations over long time intervals. Contrary to the CR production results, however, Ivkovich-Claflin, Garrett, and Buffington (2005) reported that the timing characteristics of the CR were affected by the type of conditioning employed. Infant rats (PD 21–23
or PD 29–31) trained with the delay paradigm were able to accurately time CR expression at three different conditioning ISIs (630, 880, and 1,130 ms), whereas rats trained with the trace paradigm were able to accurately time CR expression with the shortest ISI only. The authors hypothesize that the hippocampal-related timing circuits engaged during trace conditioning fail, in infant rats, when the ISI is exceptionally long. The maturation of neuronal and synaptic interactions within and between the cerebellum and brain stem is proposed to influence the ontogeny of cerebellar plasticity and eyeblink conditioning. The neural loops that exist between the cerebellum and brain stem are not yet fully functional in young subjects. Consequently, the CS and US sensory input and feedback mechanisms are still immature, dampening synaptic plasticity processes and limiting learning (Freeman & Nicholson, 2004). Activity in the interpositus nucleus, which drives CR production, is known to regulate learning-related changes in US-mediated climbing fiber activity through inhibitory feedback to the inferior olive (Medina, Nores, & Mauk, 2002; Sears & Steinmetz, 1991). The inhibitory regulation of the US pathway helps maintain equilibrium in the climbing fiber spontaneous firing rate during periods when the CS and US are not presented (Medina et al., 2002). Weakened negative cerebellar feedback in infant rats results in less robust learning because the increments in cerebellar plasticity that are produced during conditioning diminish over time (Freeman & Nicholson, 2004). Developmental changes in the interactions between the CS pathway and the cerebellum must likewise occur for successful eyeblink conditioning. As detailed above, CS auditory information is relayed from the pontine nuclei to the cerebellum via mossy fibers (Steinmetz et al., 1987; Steinmetz & Sengelaub, 1992).
CS-mediated neural activity in the pontine nuclei may be enhanced, in turn, by positive feedback from the cerebellum that emerges as CRs develop (Clark, Gohl, & Lavond, 1997). Less excitatory feedback in immature rats results in weaker CS salience, leading to less learning-specific plasticity in the cerebellum and impaired learning (Freeman & Muckler, 2003). Pontine electrical stimulation, used in place of an auditory CS, can overcome the developmental limitations inherent in pontine responsiveness in infant rats, resulting in successful eyeblink conditioning (Freeman, Rabinak, & Campolattaro, 2005). Synaptic plasticity in Purkinje cells is also known to be influenced by the conjoint activity of the mossy/parallel fibers and climbing fibers, relaying CS and US information, respectively (Gould & Steinmetz, 1996; Sakurai, 1987). In infant rats, developmental differences in CS and US neural propagation might not allow for the same strengthening of
Purkinje cell synaptic plasticity that is observed in adults. Changes in Purkinje cell inhibition of the deep nuclei may, in turn, lead to reductions in the induction and perseveration of the learning-specific interpositus nucleus-mediated synaptic plasticities that underlie the acquisition and expression of eyeblink classical conditioning (Freeman & Nicholson, 2004).
Neural Substrates of Other Associative Learning Procedures
While eyeblink classical conditioning has been the model used most widely over the last 25 years to explore the neuronal basis of learning, there have been a number of other paradigms and procedures involving mammals that have broadened our insights into how the brain is involved in learning and memory. The paradigms and procedures all share a common feature: they are simple model systems that have allowed the brain correlates of learning to be explored in a systematic and productive manner. Eyeblink classical conditioning is considered an aversive learning task given that the air puff is considered an aversive US. A number of years ago, Gormezano and his colleagues developed an appetitive classical conditioning task in rabbits known as classical jaw-movement conditioning (Coleman, Patterson, & Gormezano, 1966; Sheafor & Gormezano, 1972; Smith, DiLollo, & Gormezano, 1966). For this task, a tone or light CS is presented before a rewarding intraoral water or saccharin US. The US in this procedure causes a rhythmic jaw movement that leads to the consumption of the liquid. With CS-US pairings, presentation of the CS elicits the jaw-movement response. Early studies used this procedure to study motivational influences on learning (e.g., Mitchell & Gormezano, 1970) while later studies have used the procedure to explore the neural bases of appetitive learning (Berry, Seager, Asaka, & Borgnis, 2000; Berry, Seager, Asaka, & Griffin, 2001).
While CS and US pathways have not been completely delineated for this type of conditioning, experiments have yielded data concerning the involvement of central brain regions. Jaw-movement conditioning is not critically dependent on the interpositus nucleus of the cerebellum. Gibbs (1992) showed that lesions of the interpositus nucleus completely abolished conditioned eyeblink responses but had no effect on the performance of jaw-movement CRs recorded from the same rabbit. Berry and colleagues have conducted many studies that have examined the involvement of the hippocampus in this form of appetitive learning. They have demonstrated the formation of learning-related activity in the hippocampus (e.g., Oliver, Swain, & Berry, 1993; Seager, Borgnis, & Berry, 1997) although the within-trial pattern of activity recorded
during jaw-movement conditioning differed from activity recorded during eyeblink conditioning when both types of training were given to the same rabbit (Berry et al., 2000). In other work, Berry and colleagues have used the jaw-movement conditioning procedure to study the involvement of the cholinergic system in learning, especially as related to normal aging processes. For example, they showed that systemic injections of cholinergic blockers like scopolamine retarded learning, suppressed CS-evoked hippocampal activity, and affected the rhythmicity of the jaw-movement CR (Salvatierra & Berry, 1989). The same behavioral and neural patterns are seen in aging rabbits, thus suggesting a deficit in cholinergic function as a possible explanation of the deficits in behavior seen with aging (Seager et al., 1997; Woodruff-Pak, Lavond, Logan, & Thompson, 1987).
Over the years, some investigators have had success using instrumental conditioning procedures to study the neuronal basis of learning. These instrumental conditioning procedures differ from classical conditioning procedures along one very important dimension: The response made by the conditioning subject affects the delivery of the stimuli used in training. Arguably, Gabriel and associates have had the most success using instrumental procedures to study the neural substrates of learning (see Gabriel & Talk, 2001, for review). In their discriminative instrumental avoidance procedure, rabbits are placed in a large rotating wheel apparatus and are typically presented with two different tone CSs. One of the tones (the CS+) is followed by a foot-shock US while the second tone (CS-) is not. If the rabbit steps forward in the wheel when the CS+ is presented, the shock is not delivered (i.e., it is avoided by the rabbit). Eventually, the rabbits learn to step to the CS+ and not respond to the CS-.
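The trial contingency of the discriminative avoidance procedure described above can be written out as a short Python sketch. This is only an illustration of the stated rule (a step during the CS+ cancels the shock; the CS- is never followed by shock). The names `TrialOutcome`, `run_trial`, and `discrimination_index` are hypothetical and do not come from Gabriel and associates, and the index shown is one simple way to summarize discrimination, not the measure used in the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialOutcome:
    cs: str                # "CS+" or "CS-" (labels assumed for this sketch)
    stepped: bool          # did the animal step forward during the tone?
    shock_delivered: bool  # was the foot-shock US presented?


def run_trial(cs: str, stepped: bool) -> TrialOutcome:
    """Apply the avoidance contingency to a single trial."""
    if cs not in ("CS+", "CS-"):
        raise ValueError(f"unknown CS label: {cs!r}")
    # Shock follows only the CS+, and only if no step was made.
    shock = cs == "CS+" and not stepped
    return TrialOutcome(cs, stepped, shock)


def discrimination_index(outcomes):
    """Proportion of CS+ trials with a step minus proportion of CS- trials with a step."""
    plus = [o.stepped for o in outcomes if o.cs == "CS+"]
    minus = [o.stepped for o in outcomes if o.cs == "CS-"]
    if not plus or not minus:
        raise ValueError("need at least one CS+ trial and one CS- trial")
    return sum(plus) / len(plus) - sum(minus) / len(minus)
```

Under this sketch, an index near 1 corresponds to the trained pattern described in the text (stepping to the CS+ and withholding to the CS-), while an index near 0 corresponds to no discrimination between the two tones.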
These researchers have also developed a parallel behavioral paradigm to study appetitive learning (e.g., Freeman, Cuppernell, Flannery, & Gabriel, 1996). In an extensive and elegant series of studies, Gabriel and his associates have described the neural systems involved in the aversive and appetitive instrumental learning tasks (Gabriel & Talk, 2001). Unlike the basic circuitry for eyeblink conditioning, which appears to have a few critical sites of plasticity confined to the cerebellum, the instrumental learning circuitry can best be described as modular with many critical sites of plasticity that encode critical functional features of the CS-US convergence. Using mainly lesion and neural recording techniques, Gabriel and associates have defined and studied these modules. It appears that the cingulate cortex and associated thalamic nuclei play an important part in this learning, processing associative attention and retrieval of information when task-relevant cues are presented (Freeman & Gabriel, 1999; Gabriel, 1990; Gabriel, Sparenborg, & Kubota, 1989). The hippocampus
has also been implicated in this learning and appears to be involved in context-based information processing (Kang & Gabriel, 1998). The amygdala has been demonstrated to play a role in the learning and was identified as important for initiating learning-relevant plasticity in other areas of the brain (Poremba & Gabriel, 1999). Interestingly, collaborative work involving the Gabriel and Steinmetz laboratories has shown that the critical neuronal bases of the two types of learning are completely dissociable: lesions of the cerebellum do not affect the aversive instrumental task while abolishing eyeblink conditioning, and lesions of the limbic thalamus severely impair aversive instrumental learning while having no effect on eyeblink conditioning (Gabriel et al., 1996; Steinmetz, Sears, Gabriel, Kubota, & Poremba, 1991).
A rat instrumental procedure was developed by Steinmetz and colleagues to study similarities and differences in aversive and appetitive learning (Steinmetz, Logue, & Miller, 1993). In the aversive procedure, a tone CS is presented for 4 to 6 sec at which time a foot-shock US is presented. Early in training, the rats learn to terminate the shock by pressing a response bar. With additional paired training, the rats learn to press the bar before the presentation of the shock, which prevents the shock delivery (i.e., a conditioned avoidance response). In the appetitive procedure, a tone CS is presented for 4 to 6 sec. If the rat presses the bar during the tone period, food pellet reinforcement is delivered (i.e., a conditioned approach or appetitive response). The same tone or different tones can be used during the aversive and appetitive training. This preparation has been used in within-subject design experiments, where the same rat is given both appetitive and aversive versions of the tasks, often using the same tone CS, same training context, same response requirement (bar-pressing), and same CS-US timing parameters.
Variations of the task have included training in conjunction with autonomic recording, partial reinforcement schedules, and delay of reinforcement schedules. In an initial study, it was demonstrated that bilateral lesions of the interpositus nuclei prevented acquisition of the aversive learning procedures but had no effect on appetitive learning when relatively short CS-US intervals were used (Steinmetz et al., 1993). These data suggest that there may be some similarities in the neuronal circuits involved in the aversive instrumental conditioning and eyeblink classical conditioning tasks, but that the appetitive task engages different neuronal systems. This result is similar to the results of the Gibbs study (1992) where it was established that classical eyeblink conditioning and classical jaw-movement conditioning involved different critical neural circuits. The instrumental bar-pressing tasks have also been used in studies designed to assess clinical
pathologies such as the learning and memory capabilities of rats specifically bred for differential alcohol preference (e.g., Blankenship, Finn, & Steinmetz, 1998; Blankenship, Finn, & Steinmetz, 2000; Rorick, Finn, & Steinmetz, 2003a, 2003b) and cognitive impairments associated with the administration of antiepileptic compounds (e.g., Banks, Mohr, Besheer, Steinmetz, & Garraghty, 1999). It is possible to use multiple conditioning procedures to explore the neuronal basis of learning. An example of this approach is a recent series of studies published by Rorick-Kehn and Steinmetz (2005). In these studies, neural activity in the amygdala was recorded during three behavioral tasks presented to separate groups of rats: eyeblink classical conditioning, classical fear conditioning, and aversive signaled bar-press conditioning. Robust learning-related activation of the central nucleus of the amygdala was seen during all three tasks. The basolateral nucleus, however, showed little activation during the eyeblink conditioning tasks but significant activation during the fear conditioning and instrumental tasks. In general, the amount of learning-related activity appeared to be related to the relative intensity of the US presented during the task and the somatic requirements of the task. Given the results of these experiments, the use of multiple learning procedures to study the function of a given brain structure or system would seem to be useful for advancing our understanding of the neuronal basis of learning and memory.
FEAR CONDITIONING
The ability of organisms to associate environmental stimuli with emotionally charged events allows for the coordination of defensive reactions when faced with potential threats. Fear conditioning, which utilizes this innate propensity, refers to an experimental procedure wherein a subject learns that certain aversive stimuli are accurately predicted by other initially innocuous stimuli. It is a simple, reductive form of learning, expressed across the phylogenetic spectrum. As such, it is a highly amenable model for studying the neurobiological mechanisms of associative learning and memory. Moreover, fear conditioning research can inform a variety of clinical disorders related to emotion and anxiety, many of which are thought to result from disturbances in fear learning (reviewed in LeDoux, 2000).
The Behavioral Procedure
In a typical fear conditioning experiment, a neutral CS, such as a tone or light, is paired with a mildly aversive US, such as a brief electric foot shock. After one or more
pairings, a conditioned fear response is elicited when the CS is presented, even in the absence of the US. Conditioned fear is manifest via a variety of fear CRs, including alterations in autonomic responses (changes in heart rate and blood pressure; pupillary dilation), defensive responses (immobility or freezing, ultrasonic vocalizations), endocrine responses (pituitary-adrenal hormone release), pain sensitivity (analgesia), and reflex facilitation (whole-body startle, the R1 component of the blink response). Whereas the CS-US association is learned, the particular CRs emitted in response to the CS are not; they are species-specific responses that are expressed automatically in the presence of appropriate stimuli (Fanselow, 1997). In most fear conditioning studies, training typically employs simple contiguity or temporal overlap of the CS and US. As detailed in a series of studies from the late 1960s, however, it is actually the contingency, or informational relationship, that exists between the CS and US that is the critical factor underlying the associative learning (Kamin, 1968; Rescorla, 1968; Wagner, Logan, Haberlandt, & Price, 1968).
The Involvement of the Amygdala in Fear Conditioning
Located in the medial temporal lobe, the amygdala (named, for its shape, after the Greek word for almond; Burdach, 1819) plays a critical role in the development and expression of conditioned fear (Brown et al., 2003; LeDoux, 2000; Maren, 2005). The amygdaloid nuclear complex (see Figure 26.5) is composed of 12 functionally and anatomically distinct nuclei (McDonald, 1982; Swanson & Petrovich, 1998).
Figure 26.5 Coronal rat brain slice detailing the locations of multiple amygdala nuclei and the perirhinal cortex (PR). Note: CS-US information is propagated directly from the thalamus and indirectly through cortex, including PR, converging in the lateral nucleus (LA) of the amygdala. CS-related information is relayed to the central nucleus (ACe) directly from the LA, and indirectly through the basolateral (BL), and basomedial (BM) nuclei. The behavioral expression of conditioned fear is regulated through projection outputs from the ACe.
Massive reciprocal interconnections exist among the amygdala’s multiple nuclei, with information primarily flowing from lateral to medial (Pitkanen, Savander, & LeDoux, 1997). Sensory information related to the CS and US enters the amygdala through the lateral, basal, and basolateral nuclei (Aggleton, 2000), collectively termed the basolateral complex. Neural activity initiated by an auditory signal, probably the most commonly employed CS, is transmitted through the auditory system to the level of the auditory thalamus (LeDoux, Sakaguchi, & Reis, 1984). From the thalamus, auditory stimuli can reach the basolateral complex through a direct monosynaptic thalamic projection and a polysynaptic, thalamo-cortical projection (Campeau & Davis, 1995a; McDonald, 1998; Romanski & LeDoux, 1992). Somatosensory information related to the US has likewise been proposed to reach the amygdala directly from the thalamus and indirectly through cortex (McDonald, 1998; Shi & Davis, 1999; but see Brunzell & Kim, 2001). While the monosynaptic thalamic input to lateral amygdala is capable of supporting fear conditioning to simple pure tone CSs (Phillips & LeDoux, 1995; Romanski & LeDoux, 1992), it appears more limited, relative to the thalamo-cortical input, in its capacity to represent complex acoustic stimuli (Bordi & LeDoux, 1994a, 1994b). Indeed, aspiration lesions of the perirhinal cortex, the major cortical input to the lateral nucleus, impair acquisition of conditioned fear when the CS is a prerecorded rat 22 kHz ultrasonic distress call. Fear learning in the same animals is normal, however, when the CS is a 4 or 22 kHz pure tone, suggesting it is the spatio-temporal complexity or temporal discontinuity of the cue and not simply ultrasonic frequency that mandates cortical processing (Lindquist, Jarrard, & Brown, 2004). The motor efferents responsible for the expression of conditioned fear originate in the central nucleus of the amygdala.
The central nucleus receives direct and indirect projections from many amygdala nuclei, including the lateral and basolateral nuclei (Jolkkonen & Pitkanen, 1998). In turn, the central nucleus projects to a variety of brain stem and hypothalamic regions, allowing it to modulate motor and autonomic outputs (Hopkins & Holstege, 1978; LeDoux, Iwata, Cicchetti, & Reis, 1988). The amygdala and surrounding structures have long been recognized to play a crucial role in regulating emotive experiences (e.g., Kluver & Bucy, 1937; Weiskrantz, 1956). And yet, the functional role of the amygdala in fear conditioning remains a matter of debate to the present. One line of research suggests that the amygdala, when activated by the arousal associated with an emotionally charged event, modulates fear memories stored in other areas of the brain (McGaugh, 2002; Packard & Cahill, 2001).
Alternatively, other research firmly establishes a critical role for the amygdala in the development and long-term storage of fear memories (Fanselow & LeDoux, 1999; Maren, 2005). As an example, lesions of the basolateral complex, whether 1 day or 16 months following the acquisition of conditioned fear, completely abolish its expression (Gale et al., 2004). Electrical lesions, neurotoxic lesions, and drug-induced reversible inactivation of the basolateral complex, or individual nuclei within it, all produce severe deficits in both the learning and expression of Pavlovian fear conditioning, independent of the particular CS used or CR monitored (Campeau & Davis, 1995b; Cousens & Otto, 1998; Lindquist & Brown, 2004; Muller, Corodimas, Fridel, & LeDoux, 1997). Importantly, the resulting fear conditioning deficits are not due to secondary changes in motor performance or sensory processing on the part of the rat (Choi & Brown, 2003; Maren, 1998; Wallace & Rosen, 2001). Electrophysiological studies have revealed that amygdala neuronal activity increases in response to both unconditioned and conditioned aversive stimuli (Applegate, Frysinger, Kapp, & Gallagher, 1982; Quirk, Repa, & LeDoux, 1995). Associative long-term potentiation (LTP), the leading candidate synaptic substrate for acquiring and expressing conditioned fear (Brown & Lindquist, 2003), requires the convergence of CS and US inputs onto single neurons. Electrophysiological recordings have confirmed that individual neurons within the lateral nucleus respond to both the CS and US (Romanski, Clugnet, Bordi, & LeDoux, 1993). Following fear conditioning, CS-responsive amygdala neurons show learning-dependent increases in neuronal firing (Rogan, Stäubli, & LeDoux, 1997; Rorick-Kehn & Steinmetz, 2005), allowing for the enhanced transmission responsible for CR production. The central amygdala projects to hypothalamic and brain stem areas that mediate specific fear responses (Hopkins & Holstege, 1978; LeDoux et al., 1988).
Accordingly, damage to the central nucleus interferes with the expression of multiple fear CRs (Choi & Brown, 2003; Choi, Lindquist, & Brown, 2001; Hitchcock & Davis, 1986; van der Karr, Piechowski, Rittenhouse, & Gray, 1991). On the other hand, damage restricted to particular central nucleus projection areas can selectively interrupt the expression of specific CRs. For example, perturbations of arterial blood pressure, but not freezing, result from damage to the lateral hypothalamus, whereas freezing, but not blood pressure, is disrupted by lesions to the periaqueductal gray (LeDoux et al., 1988). Recent research has expanded on the functional role of the central amygdala in acquiring conditioned fear. Wilensky, Schafe, Kristensen, and LeDoux (2006), for instance, temporarily inactivated the central nucleus with the GABA agonist muscimol just before conditioning, impairing the acquisition of conditioned fear. In the same study, posttraining central nucleus injections of the protein synthesis inhibitor anisomycin resulted in fear memory consolidation deficits. In line with previous work (e.g., Killcross, Robbins, & Everitt, 1997), the data suggest that the central amygdala, rather than simply mediating fear expression, plays a larger role in the formation and consolidation of fear conditioning memories than heretofore generally appreciated. The neural circuitry thought to be involved in fear conditioning is summarized in Figure 26.6.
Figure 26.6 Schematic drawing of the brain circuitry involved in classical fear conditioning. Note: CS-US sensory information is propagated through the thalamic and cortical pathways, converging in the amygdala basolateral complex. Recent work suggests that CS and US information may also converge directly in the amygdala central nucleus. The central nucleus is the major source of output pathways for multiple conditioned fear responses. Lines with arrowheads depict excitatory synapses while lines with solid circles depict inhibitory synapses.
[Diagram labels: CS, US, thalamus, primary sensory cortex, association cortex, hippocampus, medial prefrontal cortex, amygdala basolateral complex, intercalated cell mass, amygdala central nucleus; outputs: reflex facilitation, autonomic responses, defensive behavior, hormone release, pain sensitivity.]
Fear Conditioning in Humans
Neuropsychological and neuroimaging studies in humans have substantiated and extended fear conditioning research findings from nonhuman animal subjects (Delgado, Olsson, & Phelps, 2006). People with amygdala damage, for instance, exhibit deficits in Pavlovian fear conditioning, including impaired conditioned galvanic skin responses (Bechara et al., 1995; LaBar, LeDoux, Spencer, & Phelps, 1995). Neuroimaging studies have likewise revealed increased regional cerebral blood flow in the amygdala when a subject acquires and later expresses conditioned fear (Büchel, Morris, Dolan, & Friston, 1998; Cheng, Knight, Smith, Stein, & Helmstetter, 2003). Intriguingly, the increase in blood flow is most pronounced when the experimental contingencies of the CS are altered, or, more precisely, early in conditioning and in the early parts of extinction (Knight, Smith, Cheng, Stein, & Helmstetter, 2004).
The Involvement of the Cerebellum in Fear Conditioning
In addition to its well-known role in eyeblink classical conditioning, the cerebellum is also involved in emotional processing. Early results linking the cerebellum with emotion found that stimulating the cerebellar vermis, which is connected to hypothalamic and brain stem areas by way of the fastigial nucleus, produced a repertoire of behavioral responses indicative of emotional arousal (Snider & Maiti, 1976). The cerebellar cortex has been implicated in the expression of various affective and fear-related behaviors (Bobee, Mariette, Tremblay-Leveau, & Caston, 2000; Frings et al., 2002). The cerebellum may play a more complex role, as part of an integrated network, in regulating fear behavior. Injecting the Na+ channel blocker tetrodotoxin into the vermis at various postconditioning time delays impaired the long-term retention of both contextual and cued fear memories (Sacchetti, Baldi, Lorenzini, & Bucherelli, 2002), suggesting the vermis is part of the neural substrate subserving the consolidation of fear conditioning. Cerebellar involvement in fear memory consolidation is also supported by the long-term increase of synaptic efficacy between parallel fibers and Purkinje cells observed following fear conditioning (Sacchetti, Scelfo, Tempia, & Strata, 2004; Zhu, Scelfo, Hartell, Strata, & Sacchetti, 2007). Reconsolidation, following fear memory retrieval, seems to rely on a functional cerebellum as well. Reversible inactivation of the vermis immediately, but not one hour, after recall of specific fear memories interferes with their subsequent retrieval (Sacchetti, Sacco, & Strata, 2007).
Contextual Fear Conditioning
To this point, fear conditioning has been discussed in terms of the CS-US association that forms over the course of training. Nevertheless, it is well established that the US also forms associations with the context in which conditioning
takes place. The hippocampus is proposed to bind the multimodal sensory information related to the physical elements of the conditioning environment into a single conjunctive representation that can then be associated with the US (Fanselow, 2000; Rudy, Huff, & Matus-Amat, 2004). The hippocampus consists of a dorsal and a ventral pole. The dorsal hippocampus is strongly interconnected with various cortices and plays a key role in cognitive function and spatial processing. The ventral hippocampus is connected to the amygdala and hypothalamus and participates in innate and conditioned emotional behavior. Lesion work indicates that the dorsal pole is involved in contextual, but not cued, fear conditioning, whereas the ventral pole appears to be involved in both forms of learning (Maren & Holt, 2004; Richmond et al., 1999). Hippocampal place cells encode spatial information, yet it is not the case that the hippocampus merely inputs processed spatial information to the amygdala (Sanders, Wiltgen, & Fanselow, 2003). For example, dorsal hippocampal lesioned rats show impairments in contextual fear when required to disambiguate different contexts based solely on a single sensory (olfaction) cue (Otto & Poon, 2006). The results reinforce the hypothesis that the hippocampus encodes generalized contextual information, independent of its well-recognized role in spatial processing. The hippocampus is necessary for contextual fear memory for only a limited time following conditioning. Hippocampal lesions impair recent (e.g., 1 day) but not remote (e.g., 28 or 50 days) contextual fear memories, without weakening fear responsiveness to the tone CS (Anagnostaras, Maren, & Fanselow, 1999; Kim & Fanselow, 1992). If made prior to training, hippocampal lesions impair context conditioning to a lesser degree, however. 
Taken together, the data suggest that the hippocampus is required for the rapid, incidental learning that underlies context conditioning, whereas cortical structures are responsible for the slower learning that integrates across multiple experiences in order to extract generalities (O’Reilly & Rudy, 2001). Fanselow and colleagues (Sanders et al., 2003) have suggested that while both the hippocampus and neocortex are capable of forming contextual representations, an intact hippocampus normally inhibits the neocortex from forming a redundant representation. Only if the hippocampus is damaged or inactivated prior to training can compensatory learning in other structures proceed unimpeded, and animals are able to learn.
In addition to the hippocampus, the perirhinal and postrhinal cortices are also involved in contextual fear conditioning. Neurotoxic lesions of each cortical region, at training-to-lesion intervals ranging from 1 to 100 days, impair the expression of contextual fear (Burwell, Bucci, Sanborn, & Jutras, 2004). Such long-term impairments
differ from the posttraining hippocampal lesion results, suggesting that the two rhinal cortices may play an ongoing role in the storage or retrieval of contextual representations.
Conditioned Fear Extinction
In the real world, newly formed associations rarely remain static: the CS may over time lose its ability to accurately predict the US, a process termed extinction. The resulting reductions in conditioned responding are not due to simple forgetting, however. Extinction requires new learning on the part of the organism, learning that the CS is no longer predictive of the US (Myers & Davis, 2002). Results from several behavioral phenomena make clear that extinction is not simply the result of unlearning the CS-US association. First, relearning the CS-US association is significantly faster following extinction than during the original acquisition. Second, over time an extinguished CR can spontaneously recover if the CS is re-presented. Third, an extinguished CR can reappear following exposure to unsignaled presentations of the US. Fourth, extinguished CRs can reappear if subjects are tested in a context different from the one in which extinction training took place. All of the findings support the idea that the original CS-US association remains intact, though inhibited, once extinguished. While the hippocampus plays an important role in mediating the contextual modulation of extinction (Bouton, 2004), the memory trace underlying extinction is thought to be stored in the amygdala (Ji & Maren, 2007). It is not entirely clear, however, how the hippocampus regulates CS-evoked amygdala activity. Evidence points to the medial prefrontal cortex as an important component of the neural circuit for fear extinction. The medial prefrontal cortex is known to receive strong hippocampal inputs and to exert strong inhibitory control over the amygdala (Grace & Rosenkranz, 2002; Ishikawa & Nakamura, 2003).
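The idea that the original CS-US association survives extinction but is held in check by context-dependent inhibition can be made concrete with a toy Python sketch. Everything in it is an assumption made for illustration: the class name `ToyExtinctionModel`, the single learning-rate parameter, and the simple subtraction rule are hypothetical and are not a model proposed in the studies cited here. The sketch shows only that leaving the excitatory association untouched while adding context-specific inhibition reproduces the fourth observation, the return of the CR when testing occurs outside the extinction context.

```python
from collections import defaultdict


class ToyExtinctionModel:
    """Illustrative only: an excitatory CS-US strength plus per-context inhibition."""

    def __init__(self, rate: float = 0.3):
        self.rate = rate                      # assumed learning rate
        self.excitation = 0.0                 # CS-US association, never reduced here
        self.inhibition = defaultdict(float)  # extinction learning, stored per context

    def response(self, context: str) -> float:
        # Expressed CR: excitation minus whatever inhibition this context retrieves.
        return max(0.0, self.excitation - self.inhibition[context])

    def acquisition_trial(self) -> None:
        # CS paired with US: excitation grows toward an asymptote of 1.0.
        self.excitation += self.rate * (1.0 - self.excitation)

    def extinction_trial(self, context: str) -> None:
        # CS alone: inhibition grows in this context only; excitation is left intact.
        self.inhibition[context] += self.rate * self.response(context)


model = ToyExtinctionModel()
for _ in range(20):
    model.acquisition_trial()
for _ in range(20):
    model.extinction_trial("B")  # extinction carried out in context B
```

With these assumed settings, `model.response("B")` falls close to zero while `model.response("A")` stays near its acquired level, because the stored excitation was never erased. The sketch does not address spontaneous recovery or reinstatement, which would need time- or US-dependent terms that the text does not specify.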
Extinction is impaired following lesions of ventral medial prefrontal cortex (Morgan, Romanski, & LeDoux, 1993), while single units in the infralimbic subregion of the medial prefrontal cortex show potentiation of short-latency conditioned responses during the expression, but not acquisition, of extinction (Milad & Quirk, 2002). The data are consistent with the Pavlov-Konorski hypothesis (Konorski, 1967; Pavlov, 1927) that extinction potentiates excitatory neuronal activity in structures involved in inhibiting the conditioned response (Quirk, Garcia, & González-Lima, 2006). The infralimbic subregion has robust projections to clusters of intercalated cells interposed between the central and basolateral amygdala nuclei (Cassell & Wright, 1986; McDonald, Mascagni, & Guo, 1996). The GABAergic
intercalated cells, in turn, project to the central nucleus and are responsible for feed-forward inhibition of its output neurons (Royer, Martina, & Pare, 1999). Neuronal activity in central amygdala projection neurons is decreased by electrical stimulation of the infralimbic subregion (Quirk, Likhtik, Pelletier, & Pare, 2003), whereas chemical stimulation of the infralimbic subregion increases c-Fos expression in the intercalated cells (Berretta, Pantazopoulos, Caldera, Pantazopoulos, & Paré, 2005). As a final point, the firing pattern of infralimbic subregion neurons in response to conditioned tones has been analyzed and mimicked, via electrical stimulation, resulting in reductions in the conditioned freezing response (Milad & Quirk, 2002; Milad, Vidal-Gonzalez, & Quirk, 2004). All told, the evidence suggests that extinction-induced potentiation of tone CS neuronal responses in the infralimbic subregion of the medial prefrontal cortex causes feed-forward inhibition of central amygdala efferent projections, via the intercalated cells, thereby preventing the expression of conditioned fear. SUMMARY Eyeblink classical conditioning and fear conditioning are two forms of associative learning, each of which evolved to solve a specific function. Eyeblink conditioning is slowly learned and characterized by the precise timing of a very specific motor response. Fear conditioning, on the other hand, is rapidly learned and characterized by emotional responses that are relatively diffuse in timing and topography. Enormous progress over the past few decades has been made in delineating the neural circuits underlying each form of conditioning and linking learning-related cellular and behavioral changes. These two model systems, along with a number of other models that have been developed, have unquestionably advanced our understanding of the neuronal basis of associative learning. 
Considering that most forms of learning involve changes at multiple sites throughout the brain, future research will benefit from analyses that investigate how the various groups of interconnected learning and memory neural systems interact, cooperatively and competitively, to influence ongoing and future behavior.
REFERENCES Abel, T., & Lattal, K. M. (2001). Molecular mechanisms of memory acquisition, consolidation and retrieval. Current Opinion in Neurobiology, 11, 180–187.
Altman, J. (1972). Postnatal development of the cerebellar cortex in the rat. Journal of Comparative Neurology, 145, 353–398. Altman, J., & Bayer, S. A. (1997). Development of the cerebellar system in relation to its evolution, structure, and functions. Boca Raton, FL: CRC Press. Anagnostaras, S. G., Maren, S., & Fanselow, M. S. (1999). Temporally graded retrograde amnesia of contextual fear after hippocampal damage in rats: Within subjects examination. Journal of Neuroscience, 19, 1106–1114. Andersson, G., Garwicz, M., & Hesslow, G. (1988). Evidence for a GABA-mediated cerebellar inhibition of the inferior olive in the cat. Experimental Brain Research, 72, 450–456. Andrews, S. J., Freeman, J. H., Carter, C. S., & Stanton, M. E. (1995). Ontogeny of eyeblink conditioning in the rat: Auditory frequency and discrimination learning effects. Developmental Psychobiology, 28, 307–320. Applegate, C. D., Frysinger, R. C., Kapp, B. S., & Gallagher, M. (1982). Multiple unit activity recorded from amygdala central nucleus during Pavlovian heart rate conditioning in rabbit. Brain Research, 238, 457–462. Asaad, W. F., Rainer, G., & Miller, E. K. (1998). Neural activity in the primate prefrontal cortex during associative learning. Neuron, 21, 1399–1407. Banks, M. K., Mohr, N. L., Besheer, J., Steinmetz, J. E., & Garraghty, P. E. (1999). The effects of phenytoin on instrumental appetitive-to-aversive transfer in rats. Pharmacology Biochemistry and Behavior, 63, 465–472. Bechara, A., Tranel, D., Damasio, H., Adolphs, R., Rockland, C., & Damasio, A. R. (1995, August 25). Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science, 269, 1115–1118. Berger, T. W., Alger, B., & Thompson, R. F. (1976, April 30). Neuronal substrate of classical conditioning in the hippocampus. Science, 192, 483–485. Berger, T. W., Rinaldi, P. C., Weisz, D. J., & Thompson, R. F. (1983). 
Single-unit analysis of different hippocampal cell types during classical conditioning of rabbit nictitating membrane response. Journal of Neurophysiology, 50, 1197–1219. Berger, T. W., & Thompson, R. F. (1978a). Neuronal plasticity in the limbic system during classical conditioning of the rabbit nictitating membrane response: Pt. I. The hippocampus. Brain Research, 145, 323–346. Berger, T. W., & Thompson, R. F. (1978b). Neuronal plasticity in the limbic system during classical conditioning of the rabbit nictitating membrane response: Pt. II. Septum and mammillary bodies. Brain Research, 156, 293–314. Berretta, S., Pantazopoulos, H., Caldera, M., Pantazopoulos, P., & Paré, D. (2005). Infralimbic cortex activation increases c-Fos expression in intercalated neurons of the amygdala. Neuroscience, 132, 943–953. Berry, S. D., Seager, M. A., Asaka, Y., & Borgnis, R. L. (2000). Motivational issues in aversive and appetitive conditioning paradigms. In D. S. Woodruff-Pak & J. E. Steinmetz (Eds.), Eyeblink classical conditioning: Animal models (Vol. 2, pp. 287–312). Boston: Kluwer Press. Berry, S. D., Seager, M. A., Asaka, Y., & Griffin, A. L. (2001). The septohippocampal system and classical conditioning. In J. E. Steinmetz, M. A. Gluck, & P. R. Solomon (Eds.), Model systems and the neurobiology of associative learning (pp. 79–110). Mahwah, NJ: Erlbaum.
Aggleton, J. P. (2000). The amygdala: A functional analysis. Oxford: Oxford University Press.
Berthier, N. E., & Moore, J. W. (1986). Cerebellar Purkinje cell activity related to the classically conditioned nictitating membrane response. Experimental Brain Research, 63, 341–350.
Alger, B. E., & Teyler, T. J. (1976). Long-term and short-term plasticity in the CA1, CA3, and dentate regions of the rat hippocampal slice. Brain Research, 110, 463–480.
Berthier, N. E., & Moore, J. W. (1990). Activity of deep cerebellar nuclear cells during classical conditioning of nictitating membrane extension in rabbits. Experimental Brain Research, 83, 44–54.
Neuronal Basis of Learning
Blankenship, M. R., Finn, P. R., & Steinmetz, J. E. (1998). A characterization of approach and avoidance learning in alcohol preferring (P) and non-preferring (NP) rats. Alcoholism: Clinical and Experimental Research, 22, 1227–1233. Blankenship, M. R., Finn, P. R., & Steinmetz, J. E. (2000). A characterization of approach and avoidance learning in high alcohol drinking (HAD) and low alcohol drinking (LAD) rats. Alcoholism: Clinical and Experimental Research, 24, 1778–1784. Bobee, S., Mariette, E., Tremblay-Leveau, H., & Caston, J. (2000). Effects of early midline cerebellar lesion on cognitive and emotional functions in the rat. Behavioral Brain Research, 112, 107–117. Bordi, F., & LeDoux, J. E. (1994a). Response properties of single units in areas of rat auditory thalamus that project to the amygdala: Pt. I. Acoustic discharge patterns and frequency receptive fields. Experimental Brain Research, 98, 261–274. Bordi, F., & LeDoux, J. E. (1994b). Response properties of single units in areas of rat auditory thalamus that project to the amygdala: Pt. II. Cells receiving convergent auditory and somatosensory inputs and cells antidromically activated by amygdala stimulation. Experimental Brain Research, 98, 275–286. Bouton, M. E. (2004). Context and behavioral processes in extinction. Learning and Memory, 11, 485–494. Bracha, V., Irwin, K. B., Webster, M. L., Wunderlich, D. A., Stachowiak, M. K., & Bloedel, J. R. (1998). Microinjections of anisomycin into the intermediate cerebellum during learning affect the acquisition of classically conditioned responses in the rabbit. Brain Research, 788, 169–178. Brown, T. H., Byrne, J. H., LaBar, K., LeDoux, J., Lindquist, D. H., Thompson, R. F., et al. (2003). Learning and memory: Basic mechanisms. In J. H. Byrne & J. L. Roberts (Eds.), From molecules to networks: An introduction to cellular and molecular neuroscience (pp. 499–574). San Diego, CA: Academic Press. Brown, T. H., & Lindquist, D. H. (2003). 
Long-term potentiation: Amygdala. In J. H. Byrne (Ed.), Learning and Memory (2nd ed., pp. 342–346). Farmington Hills, MI: Macmillan. Browning, P. G. F., Easton, A., Buckley, M. J., & Gaffan, D. (2005). The role of the prefrontal cortex in object-in-place learning in monkeys. European Journal of Neuroscience, 22, 3281–3291. Brunzell, D. H., & Kim, J. J. (2001). Fear conditioning to tone, but not to context, is attenuated by lesions of the insular cortex and posterior extension of the intralaminar complex in rats. Behavioral Neuroscience, 115, 365–375. Buchanan, S. L., & Powell, D. A. (1988). Parasagittal thalamic knife cuts retard Pavlovian eyeblink conditioning and abolish the tachycardiac component of the heart rate conditioned response. Brain Research Bulletin, 21, 723–729. Büchel, C., Morris, J., Dolan, R. J., & Friston, K. J. (1998). Brain systems mediating aversive conditioning: An event-related fMRI study. Neuron, 20, 947–957. Burdach, K. F. (1819). Vom Baue und Leben des Gehirns. Leipzig. Burwell, R. D., Bucci, D. J., Sanborn, M. R., & Jutras, M. J. (2004). Perirhinal and postrhinal contributions to remote memory for context. Journal of Neuroscience, 24, 11023–11028. Campeau, S., & Davis, M. (1995a). Involvement of the central nucleus and basolateral complex of the amygdala in fear conditioning measured with fear-potentiated startle in rats trained concurrently with auditory and visual conditioned stimuli. Journal of Neuroscience, 15, 2301–2311. Campeau, S., & Davis, M. (1995b). Involvement of subcortical and cortical afferents to the lateral nucleus of the amygdala in fear conditioning measured with fear-potentiated startle in rats trained concurrently with auditory and visual conditioned stimuli. Journal of Neuroscience, 15, 2312–2327.
Campolattaro, M. M., Halverson, H. E., & Freeman, J. H. (2007). Medial auditory thalamic stimulation as a conditioned stimulus for eyeblink conditioning in rats. Learning and Memory, 14, 152–159. Carew, T. J., & Sahley, C. L. (1986). Invertebrate learning and memory: From behavior to molecules. Annual Review of Neuroscience, 9, 435–487. Cassell, M. D., & Wright, D. J. (1986). Topography of projections from the medial prefrontal cortex to the amygdala in the rat. Brain Research Bulletin, 17, 321–333. Cegavske, C. F., Patterson, M. M., & Thompson, R. F. (1979). Neuronal unit activity in the abducens nucleus during classical conditioning of the nictitating membrane response in the rabbit (Oryctolagus cuniculus). Journal of Comparative and Physiological Psychology, 93, 595–609. Cegavske, C. F., Thompson, R. F., Patterson, M. M., & Gormezano, I. (1976). Mechanisms of efferent neuronal control of the reflex nictitating membrane response in rabbit (Oryctolagus cuniculus). Journal of Comparative and Physiological Psychology, 90, 411–423. Chapman, P. F., Steinmetz, J. E., Sears, L. L., & Thompson, R. F. (1990). Effects of lidocaine injection in the interpositus nucleus and red nucleus on conditioned behavioral and neuronal responses. Brain Research, 537, 149–156. Chen, G., & Steinmetz, J. E. (2000a). Intra-cerebellar infusion of NMDA receptor antagonist AP5 disrupts classical eyeblink conditioning in rabbits. Brain Research, 887, 144–156. Chen, G., & Steinmetz, J. E. (2000b). Microinfusions of protein kinase inhibitor H7 in the cerebellum impairs the acquisition but not retention of classical eyeblink conditioning in rabbits. Brain Research, 865, 193–201. Chen, L., Bao, S., Lockard, J. M., Kim, J. J., & Thompson, R. F. (1996). Impaired classical eyeblink conditioning in cerebellar-lesioned and Purkinje cell degeneration (pcd) mutant mice. Journal of Neuroscience, 16, 2829–2838. Cheng, D. T., Knight, D. C., Smith, C. N., Stein, E. A., & Helmstetter, F. J. (2003). 
Functional MRI of human amygdala activity during Pavlovian fear conditioning: Stimulus processing versus response expression. Behavioral Neuroscience, 117, 3–10. Choi, J.-S., & Brown, T. H. (2003). Central amygdala lesions block ultrasonic vocalization and freezing as conditional but not unconditional responses. Journal of Neuroscience, 23, 8713–8721. Choi, J.-S., Lindquist, D. H., & Brown, T. H. (2001). Amygdala lesions prevent conditioned enhancement of the rat eyeblink reflex. Behavioral Neuroscience, 115, 764–775. Clark, R. E., Gohl, E. B., & Lavond, D. G. (1997). The learning-related activity that develops in the pontine nuclei during classical eyeblink conditioning is dependent on the interpositus nucleus. Learning and Memory, 3, 532–544. Clark, R. E., & Squire, L. R. (1998, April 3). Classical conditioning and brain systems: The role of awareness. Science, 280, 77–81. Coleman, S. R., Patterson, M. M., & Gormezano, I. (1966). Conditioned jaw movement in the rabbit: Deprivation procedure and saccharin concentration. Psychonomic Science, 6, 39–40. Cooke, S. F., Attwell, P. J. E., & Yeo, C. H. (2004). Temporal properties of cerebellar-dependent memory consolidation. Journal of Neuroscience, 24, 2934–2941. Coulter, D. A., Lo Turco, J. J., Kubota, M., Disterhoft, J. F., Moore, J. W., & Alkon, D. L. (1989). Classical conditioning reduces amplitude and duration of calcium-dependent afterhyperpolarization in rabbit hippocampal pyramidal cells. Journal of Neurophysiology, 61, 971–981. Cousens, G., & Otto, T. (1998). Both pre- and post-training excitotoxic lesions of the basolateral amygdala abolish the expression of olfactory and contextual fear conditioning. Behavioral Neuroscience, 112, 1092–1103.
Delgado, M. R., Olsson, A., & Phelps, E. A. (2006). Extending animal models of fear conditioning to humans. Biological Psychology, 73, 39–48. Disterhoft, J. F., Golden, D. T., Read, H. L., Coulter, D. A., & Alkon, D. L. (1988). AHP reductions in rabbit hippocampal neurons during conditioning correlate with acquisition of the learned response. Brain Research, 462, 118–125. Disterhoft, J. F., & McEchron, M. D. (2000). Cellular alterations in hippocampus during acquisition and consolidation of hippocampus-dependent trace eyeblink conditioning. In D. S. Woodruff-Pak & J. E. Steinmetz (Eds.), Eyeblink classical conditioning: Animal models (Vol. 2, pp. 313–334). Boston: Kluwer Press. Disterhoft, J. F., Quinn, K. J., Weiss, C., & Shipley, M. T. (1985). Accessory abducens nucleus and conditioned eye retraction/nictitating membrane extension in rabbit. Journal of Neuroscience, 5, 941–950. Ekerot, C. F., & Kano, M. (1985). Long-term depression of parallel fibre synapses following stimulation of climbing fibres. Brain Research, 342, 357–360. Fanselow, M. S. (1997). Species-specific defense reactions: Retrospect and prospect. In M. E. Bouton (Ed.), Learning, motivation, and cognition (pp. 321–341). Washington, DC: American Psychological Association. Fanselow, M. S. (2000). Contextual fear, gestalt memories, and the hippocampus. Behavioral Brain Research, 110(1–2), 73–81. Fanselow, M. S., & LeDoux, J. E. (1999). Why we think plasticity underlying Pavlovian fear conditioning occurs in the basolateral amygdala. Neuron, 23, 229–232. Freeman, J. H., & Nicholson, D. A. (2004). Developmental changes in the neural mechanisms of eyeblink conditioning. Behavioral and Cognitive Neuroscience Reviews, 3, 3–13. Freeman, J. H., Spencer, C. O., Skelton, R. W., & Stanton, M. E. (1993). Ontogeny of eyeblink conditioning in the rat: Effects of US intensity and interstimulus interval on delay conditioning. Psychobiology, 21, 233–242. Freeman, J. 
H., Jr., Cuppernell, C., Flannery, K., & Gabriel, M. (1996). Context-specific multi-site cingulate cortical, limbic thalamic, and hippocampal neuronal activity during concurrent discriminative approach and avoidance training in rabbits. Journal of Neuroscience, 16, 1538–1549. Freeman, J. H., Jr., & Gabriel, M. (1999). Changes of cingulothalamic topographic excitation patterns and avoidance response incubation over time following initial discriminative conditioning in rabbits. Neurobiology of Learning and Memory, 72, 259–272.
Gale, G. D., Anagnostaras, S. G., Godsil, B. P., Mitchell, S., Nozawa, T., Sage, J. R., et al. (2004). Role of the basolateral amygdala in the storage of fear memories across the adult lifetime of rats. Journal of Neuroscience, 24, 3810–3815. Gibbs, C. M. (1992). Divergent effects of deep cerebellar lesions on two different conditioned somatomotor responses in rabbits. Brain Research, 585, 395–399. Goldman-Rakic, P. S. (1995). Cellular basis of working memory. Neuron, 14, 477–485. Gomi, H., Sun, W., Finch, C. E., Itohara, S., Yoshimi, K., & Thompson, R. F. (1999). Learning induces a CDC2-related protein kinase, KKIAMRE. Journal of Neuroscience, 19, 9530–9537. Gould, T. J., & Steinmetz, J. E. (1996). Changes in rabbit cerebellar cortical and interpositus nucleus activity during acquisition, extinction and backward classical conditioning. Neurobiology of Learning and Memory, 65, 17–34. Grace, A. A., & Rosenkranz, J. A. (2002). Regulation of conditioned responses of basolateral amygdala neurons. Physiology and Behavior, 77(4–5), 489–493. Green, J. T., & Steinmetz, J. E. (2005). Purkinje cell activity in the cerebellar anterior lobe during rabbit eyeblink conditioning. Learning and Memory, 12, 260–269. Haley, D. A., Thompson, R. F., & Madden, J. (1988). Pharmacological analysis of the magnocellular red nucleus during classical conditioning of the rabbit nictitating membrane response. Brain Research, 454, 131–139. Halverson, H. E., & Freeman, J. H. (2006). Medial auditory thalamic nuclei are necessary for eyeblink conditioning. Behavioral Neuroscience, 120, 880–887. Hawkins, R. D., Kandel, E. R., & Bailey, C. H. (2006). Molecular mechanisms of memory storage in Aplysia. Biological Bulletin, 210, 174–191. Hebb, D. O. (1949). The organization of behavior. New York: Wiley. Hemart, N., Daniel, H., Jaillard, D., & Crepel, G. (1994). Properties of glutamate receptors are modified during long-term depression in rat cerebellar Purkinje cells. Neuroscience Research, 19, 213–221.
Freeman, J. H., Jr., & Muckler, A. S. (2003). Developmental changes in eyeblink conditioning and neuronal activity in the pontine nuclei. Learning and Memory, 10, 337–345.
Hitchcock, J., & Davis, M. (1986). Lesions of the amygdala, but not of the cerebellum or red nucleus, block conditioned fear as measured with the potentiated startle paradigm. Behavioral Neuroscience, 100, 11–22.
Freeman, J. H., Jr., Rabinak, C. A., & Campolattaro, M. M. (2005). Pontine stimulation overcomes developmental limitations in the neural mechanisms of eyeblink conditioning. Learning and Memory, 12, 255–259.
Hopkins, D. A., & Holstege, G. (1978). Amygdaloid projections to the mesencephalon, pons and medulla oblongata in the cat. Experimental Brain Research, 32, 529–547.
Frings, M., Maschke, M., Erichsen, M., Jentzen, W., Muller, S. P., Kolb, F. P., et al. (2002). Involvement of the human cerebellum in fear-conditioned potentiation of the acoustic startle response: A PET study. NeuroReport, 13, 1275–1278.
Ishikawa, A., & Nakamura, S. (2003). Convergence and interaction of hippocampal and amygdalar projections with the prefrontal cortex in the rat. Journal of Neuroscience, 23, 9987–9995.
Gabriel, M. (1990). Functions of anterior and posterior cingulate cortex during avoidance learning in rabbits. Progress in Brain Research, 85, 467–483. Gabriel, M., Kang, E., Poremba, A., Kubota, Y., Allen, M. T., Miller, D. P., et al. (1996). Neural substrates of discrimination avoidance learning and classical eyeblink conditioning in rabbits: A double dissociation. Behavioral Brain Research, 82, 23–30. Gabriel, M., Sparenborg, S., & Kubota, Y. (1989). Anterior and medial thalamic lesions, discriminative avoidance learning, and cingular cortical neuronal activity in rabbits. Experimental Brain Research, 76, 441–457.
Gabriel, M., & Talk, A. C. (2001). A tale of two paradigms: Lessons learned from parallel studies of discriminative instrumental learning and classical eyeblink conditioning. In J. E. Steinmetz, M. A. Gluck, & P. R. Solomon (Eds.), Model systems and the neurobiology of associative learning (pp. 149–186). Mahwah, NJ: Erlbaum.
Ito, M. (1989). Long-term depression. Annual Review of Neuroscience, 12, 85–102. Ivkovich, D., Paczkowski, C. M., & Stanton, M. E. (2000). Ontogeny of delay versus trace eyeblink conditioning in the rat. Developmental Psychobiology, 36, 148–160. Ivkovich, D., & Stanton, M. E. (2001). Effects of early hippocampal lesions on trace, delay, and long-delay eyeblink conditioning in developing rats. Neurobiology of Learning and Memory, 76, 426–446. Ivkovich, D., & Thompson, R. F. (1997). Motor cortex lesions do not affect learning or performance of the eyelid response in rabbits. Behavioral Neuroscience, 111, 727–738.
Ivkovich-Claflin, D., Garrett, T., & Buffington, M. L. (2005). A developmental comparison of trace and delay eyeblink conditioning in rats using matching interstimulus intervals. Developmental Psychobiology, 47, 77–88.
Lavond, D. G., & Steinmetz, J. E. (2003). Handbook of classical conditioning. New York: Kluwer Press.
Ji, J., & Maren, S. (2007). Hippocampal involvement in contextual modulation of fear extinction. Hippocampus, 17, 749–758.
LeDoux, J. E., Iwata, J., Cicchetti, P., & Reis, D. J. (1988). Different projections of the central amygdaloid nucleus mediate autonomic and behavioral correlates of conditioned fear. Journal of Neuroscience, 8, 2517–2529.
Jolkkonen, E., & Pitkanen, A. (1998). Intrinsic connections of the rat amygdaloid complex: Projections originating in the central nucleus. Journal of Comparative Neurology, 395, 53–72. Kamin, L. J. (1968). Aversive stimulation. In M. R. Jones (Ed.), Miami symposium on the prediction of behavior (pp. 9–33). Coral Gables, FL: University of Miami Press. Kang, E., & Gabriel, M. (1998). Hippocampal modulation of cingulothalamic neuronal activity and discriminative avoidance learning in rabbits. Hippocampus, 8, 491–510. Katz, D. B., & Steinmetz, J. E. (1997). Single-unit evidence for eyeblink conditioning in cerebellar cortex is altered, but not eliminated, by interpositus nucleus lesions. Learning and Memory, 4, 88–104. Killcross, S., Robbins, T. W., & Everitt, B. J. (1997, July 24). Different types of fear-conditioned behavior mediated by separate nuclei within amygdala. Nature, 388, 377–380.
LeDoux, J. E. (2000). Emotion circuits in the brain. Annual Review of Neuroscience, 23, 155–184.
LeDoux, J. E., Sakaguchi, A., & Reis, D. J. (1984). Subcortical efferent projections of the medial geniculate nucleus mediate emotional responses conditioned to acoustic stimuli. Journal of Neuroscience, 4, 683–698. Lincoln, J. S., McCormick, D. A., & Thompson, R. F. (1982). Ipsilateral cerebellar lesions prevent learning of the classically conditioned nictitating membrane/eyelid response. Brain Research, 242, 190–193. Lindquist, D. H., & Brown, T. H. (2004). Amygdalar NMDA receptors control the expression of associative reflex facilitation and three other conditional responses. Behavioral Neuroscience, 118, 36–52. Lindquist, D. H., Jarrard, L. E., & Brown, T. H. (2004). Perirhinal cortex supports delay fear conditioning to rat ultrasonic social signals. Journal of Neuroscience, 24, 3610–3617.
Kim, J. J., & Fanselow, M. S. (1992, May 1). Modality-specific retrograde amnesia of fear. Science, 256, 675–677.
Maren, S. (1998). Overtraining does not mitigate contextual fear conditioning deficits produced by neurotoxic lesions of the basolateral amygdala. Journal of Neuroscience, 18, 3088–3097.
Kim, J. J., Krupa, D. J., & Thompson, R. F. (1998, January 23). Inhibitory cerebello-olivary projections and blocking effect in classical conditioning. Science, 279, 570–573.
Maren, S. (2005). Building and burying fear memories in the brain. Neuroscientist, 11, 89–99.
Kleim, J. A., Freeman, J. H., Bruneau, R., Nolan, B. C., Cooper, N. R., Zook, A., et al. (2002). Synapse formation is associated with memory storage in the cerebellum. Proceedings of the National Academy of Sciences, USA, 99, 13228–13231. Kluver, H., & Bucy, P. C. (1937). ‘Psychic blindness’ and other symptoms following bilateral temporal lobectomy in rhesus monkeys. American Journal of Physiology, 119, 352–353. Knight, D. C., Smith, C. N., Cheng, D. T., Stein, E. A., & Helmstetter, F. J. (2004). Amygdala and hippocampal activity during acquisition and extinction of human fear conditioning. Cognitive, Affective, and Behavioral Neuroscience, 4, 317–325. Konorski, J. (1967). Integrative activity of the brain. Chicago: University of Chicago Press. Kronforst-Collins, M. A., & Disterhoft, J. F. (1998). Lesions of the caudal area of rabbit medial prefrontal cortex impair trace eyeblink conditioning. Neurobiology of Learning and Memory, 69, 147–162. Krupa, D. J., Thompson, J. K., & Thompson, R. F. (1993, May 14). Localization of a memory trace in the mammalian brain. Science, 260, 989–991. LaBar, K. S., LeDoux, J. E., Spencer, D. D., & Phelps, E. A. (1995). Impaired fear conditioning following unilateral temporal lobectomy in humans. Journal of Neuroscience, 15, 6846–6855. Lashley, K. S. (1930). Basic neural mechanisms in behavior. Psychological Review, 37, 1–24. Lashley, K. S. (1950). In search of the engram in psychological mechanisms in animal behaviour. New York: Academic Press. Lavond, D. G., Hembree, T. L., & Thompson, R. F. (1985). Effect of kainic acid lesions of the cerebellar interpositus nucleus on eyelid conditioning in the rabbit. Brain Research, 326, 179–182. Lavond, D. G., Lincoln, J. S., McCormick, D. A., & Thompson, R. F. (1984). Effect of bilateral lesions of the dentate and interpositus cerebellar nuclei on conditioning of heart-rate and nictitating membrane/ eyelid responses in the rabbit. Brain Research, 305, 323–330. Lavond, D. G., & Steinmetz, J. E. 
(1989). Acquisition of classical conditioning without cerebellar cortex. Behavioural Brain Research, 33, 113–164.
Maren, S., & Holt, W. G. (2004). Hippocampus and Pavlovian fear conditioning in rats: Muscimol infusions into the ventral, but not dorsal, hippocampus impair the acquisition of conditional freezing to an auditory conditional stimulus. Behavioral Neuroscience, 118, 97–110. Mauk, M. D., & Buonomano, D. V. (2004). The neural basis of temporal processing. Annual Review of Neuroscience, 27, 307–340. Mauk, M. D., Steinmetz, J. E., & Thompson, R. F. (1986). Classical conditioning using stimulation of the inferior olive as the unconditioned stimulus. Proceedings of the National Academy of Sciences, USA, 83, 5349–5353. Mauk, M. D., & Thompson, R. F. (1987). Retention of classically conditioned eyelid responses following acute decerebration. Brain Research, 403, 89–95. McCormick, D. A., Lavond, D. G., Clark, G. A., Kettner, R. R., Rising, C. E., & Thompson, R. F. (1981). The engram found? Role of the cerebellum in classical conditioning of nictitating membrane and eyelid responses. Bulletin of the Psychonomic Society, 18, 103–105. McCormick, D. A., Steinmetz, J. E., & Thompson, R. F. (1985). Lesions of the inferior olivary complex cause extinction of the classically conditioned eyelid response. Brain Research, 359, 120–130. McCormick, D. A., & Thompson, R. F. (1984a, January 20). Cerebellum: Essential involvement in the classically conditioned eyelid response. Science, 223, 296–299. McCormick, D. A., & Thompson, R. F. (1984b). Neuronal responses of the rabbit cerebellum during acquisition and performance of a classically conditioned nictitating membrane-eyelid response. Journal of Neuroscience, 4, 2811–2822. McDonald, A. J. (1982). Cell types and intrinsic connections of the amygdala. In J. P. Aggleton (Ed.), The amygdala: Neurobiological aspects of emotion, memory, and mental dysfunction (pp. 67–96). New York: Wiley-Liss. McDonald, A. J. (1998). Cortical pathways to the mammalian amygdala. Progress in Neurobiology, 55, 257–332. McDonald, A. J., Mascagni, F., & Guo, L. 
(1996). Projections of the medial and lateral prefrontal cortices to the amygdala: A Phaseolus vulgaris leucoagglutinin study in the rat. Neuroscience, 71, 55–75.
McGaugh, J. L. (2002). Memory consolidation and the amygdala: A systems perspective. Trends in Neuroscience, 25, 456–461.
Otto, T., & Poon, P. (2006). Dorsal hippocampal contributions to unimodal contextual conditioning. Journal of Neuroscience, 26, 6603–6609.
McLaughlin, J., Skaggs, H., Churchwell, J., & Powell, D. A. (2002). Medial prefrontal cortex and Pavlovian conditioning: Trace versus delay conditioning. Behavioral Neuroscience, 116, 37–47.
Packard, M. G., & Cahill, L. (2001). Affective modulation of multiple memory systems. Current Opinion in Neurobiology, 11, 752–756.
Medina, J. F., & Mauk, M. D. (2000). Computer simulation of cerebellar information processing. Nature Neuroscience, 3, 1205–1211. Medina, J. F., Nores, W. L., & Mauk, M. D. (2002, March 21). Inhibition of climbing fibres is a signal for the extinction of conditioned eyelid responses. Nature, 416, 270–273. Milad, M. R., & Quirk, G. J. (2002, November 7). Neurons in medial prefrontal cortex signal memory for fear extinction. Nature, 420, 70–74. Milad, M. R., Vidal-Gonzalez, I., & Quirk, G. J. (2004). Electrical stimulation of medial prefrontal cortex reduces conditioned fear in a temporally specific manner. Behavioral Neuroscience, 118, 389–394. Miller, D. P., & Steinmetz, J. E. (1997). Hippocampal activity during discrimination/reversal eyeblink conditioning in rabbits. Behavioral Neuroscience, 111, 70–79. Mitchell, D. S., & Gormezano, I. (1970). Effects of water deprivation on classical appetitive conditioning of the rabbit’s jaw movement response. Learning and Motivation, 1, 199–206. Morgan, M. A., Romanski, L. M., & LeDoux, J. E. (1993). Extinction of emotional learning: Contribution of medial prefrontal cortex. Neuroscience Letters, 163, 109–113. Morris, R. G. M., Anderson, E., Lynch, G. S., & Baudry, M. (1986). Selective impairment of learning and blockade of long-term potentiation by an N-methyl-D-aspartate receptor antagonist, AP5. Nature, 319, 774–776. Moyer, J. R., Deyo, R. A., & Disterhoft, J. F. (1990). Hippocampectomy disrupts trace eye-blink conditioning in rabbits. Behavioral Neuroscience, 104, 243–252. Muller, J., Corodimas, K. P., Fridel, Z., & LeDoux, J. E. (1997). Functional inactivation of the lateral and basal nuclei of the amygdala by muscimol infusion prevents fear conditioning to an explicit conditioned stimulus and to contextual stimuli. Behavioral Neuroscience, 111, 863–891. Myers, K. M., & Davis, M. (2002). Behavioral and neural analysis of extinction. Neuron, 36, 567–584. Nolan, B. C., & Freeman, J. H. (2005). 
Purkinje cell loss by OX7-saporin impairs excitatory and inhibitory eyeblink conditioning. Behavioral Neuroscience, 119, 190–201. Oakley, D. A., & Russell, I. S. (1972). Neocortical lesions and Pavlovian conditioning. Physiology and Behavior, 8, 915–926. Oakley, D. A., & Russell, I. S. (1974). Differential and reversal conditioning in partially neodecorticate rabbits. Physiology and Behavior, 13, 221–230. Oakley, D. A., & Russell, I. S. (1976). Subcortical nature of Pavlovian differentiation in the rabbit. Physiology and Behavior, 17, 947–954. Oakley, D. A., & Russell, I. S. (1977). Subcortical storage of Pavlovian conditioning in the rabbit. Physiology and Behavior, 18, 931–937.
Patterson, M. M., Cegavske, C. F., & Thompson, R. F. (1973). Effects of classical conditioning paradigm on hindlimb flexor nerve response in immobilized spinal cat. Journal of Comparative and Physiological Psychology, 84, 88–97. Pavlov, I. P. (1927). Conditioned reflexes (G. V. Anrep, Trans.). London: Oxford University Press. Perrett, S. P., Ruiz, B. P., & Mauk, M. D. (1993). Cerebellar cortex lesions disrupt learning-dependent timing of conditioned eyelid responses. Journal of Neuroscience, 13, 1708–1718. Phillips, R. G., & LeDoux, J. E. (1995). Lesions of the fornix but not the entorhinal or perirhinal cortex interfere with contextual fear conditioning. Journal of Neuroscience, 15, 5308–5315. Pitkanen, A., Savander, V., & LeDoux, J. E. (1997). Organization of intra-amygdaloid circuitries in the rat: An emerging framework for understanding functions of the amygdala. Trends in Neuroscience, 20, 517–523. Poremba, A., & Gabriel, M. (1999). Amygdala neurons mediate acquisition but not maintenance of instrumental avoidance behavior in rabbits. Journal of Neuroscience, 19, 9635–9641. Port, R. L., Romano, A. G., Steinmetz, J. E., Mikhail, A. A., & Patterson, M. M. (1986). Retention and acquisition of classical trace conditioned responses by rabbits with hippocampal lesions. Behavioral Neuroscience, 100, 745–752. Powell, D. A., Churchwell, J., & Burriss, L. (2005). Medial prefrontal lesions and Pavlovian eyeblink and heart rate conditioning: Effects of partial reinforcement on delay and trace conditioning in rabbits (Oryctolagus cuniculus). Behavioral Neuroscience, 119, 180–189. Powell, D. A., McLaughlin, J., & Chachich, M. (2000). Classical conditioning of autonomic and somatomotor responses and their central nervous system substrates. In D. S. Woodruff-Pak & J. E. Steinmetz (Eds.), Eyeblink classical conditioning: Animal models (Vol. 2, pp. 257–286). Boston: Kluwer Press. Quirk, G. J., Garcia, R., & González-Lima, F. (2006). 
Prefrontal mechanisms in extinction of conditioned fear. Biological Psychiatry, 60, 337–343. Quirk, G. J., Likhtik, E., Pelletier, J. G., & Paré, D. (2003). Stimulation of medial prefrontal cortex decreases the responsiveness of central amygdala output neurons. Journal of Neuroscience, 23, 8800–8807. Quirk, G. J., Repa, J. C., & LeDoux, J. E. (1995). Fear conditioning enhances short-latency auditory responses of lateral amygdala neurons: Parallel recordings in the freely behaving rat. Neuron, 15, 1029–1039. Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative and Physiological Psychology, 66, 1–5.
Ohyama, T., Nores, W. L., Medina, J. F., Riusech, F. A., & Mauk, M. D. (2006). Learning-induced plasticity in deep cerebellar nucleus. Journal of Neuroscience, 26, 12656–12663.
Richmond, M. A., Yee, B. K., Pouzet, B., Veenman, L., Rawlins, J. N. P., Feldon, J., et al. (1999). Dissociating context and space within the hippocampus: Effects of complete, dorsal, and ventral excitotoxic hippocampal lesions on conditioned freezing and spatial learning. Behavioral Neuroscience, 113, 1189–1203.
Olds, J., Disterhoft, J. F., Segal, M., Kornblith, C. L., & Hirsh, R. (1972). Learning centers in the brain mapped by measuring latencies of conditioned unit responses. Journal of Neurophysiology, 35, 202–219.
Rogan, M. T., Stäubli, U. V., & LeDoux, J. E. (1997, December 11). Fear conditioning induces associative long-term potentiation in the amygdala. Nature, 390, 604–607.
Oliver, C. G., Swain, R. A., & Berry, S. D. (1993). Hippocampal plasticity during jaw movement conditioning in the rabbit. Brain Research, 608, 150–154.
Romano, A. G., & Patterson, M. M. (1987). The rabbit in Pavlovian conditioning. In I. Gormezano, W. F. Prokasy, & R. F. Thompson (Eds.), Classical conditioning (3rd ed., pp. 1–36). Hillsdale, NJ: Erlbaum.
O’Reilly, R. C., & Rudy, J. W. (2001). Conjunctive representations in learning and memory: Principles of cortical and hippocampal function. Psychological Review, 108, 311–345.
Romanski, L. M., Clugnet, M. C., Bordi, F., & LeDoux, J. E. (1993). Somatosensory and auditory convergence in the lateral nucleus of the amygdala. Behavioral Neuroscience, 107, 444–450.
Romanski, L. M., & LeDoux, J. E. (1992). Equipotentiality of thalamo-amygdala and thalamo-cortico-amygdala circuits in auditory fear conditioning. Journal of Neuroscience, 12, 4501–4509. Rorick, L. M., Finn, P. R., & Steinmetz, J. E. (2003a). High alcohol-drinking (HAD) rats exhibit persistent freezing responses to discrete cues following Pavlovian fear conditioning. Pharmacology, Biochemistry and Behavior, 76, 223–230. Rorick, L. M., Finn, P. R., & Steinmetz, J. E. (2003b). Moderate doses of ethanol partially reverse avoidance learning deficits in high-alcohol-drinking (HAD) rats. Pharmacology, Biochemistry and Behavior, 75, 89–102. Rorick-Kehn, L. M., & Steinmetz, J. E. (2005). Amygdala unit activity during three learning tasks: Eyeblink classical conditioning, Pavlovian fear conditioning and signaled avoidance conditioning. Behavioral Neuroscience, 119, 1254–1276. Royer, S., Martina, M., & Paré, D. (1999). An inhibitory interface gates impulse traffic between the input and output stations of the amygdala. Journal of Neuroscience, 19, 10575–10583. Rudy, J. W., Huff, N. C., & Matus-Amat, P. (2004). Understanding contextual fear conditioning: Insights from a two-process model. Neuroscience and Biobehavioral Reviews, 28, 675–685. Sacchetti, B., Baldi, E., Lorenzini, C., & Bucherelli, C. (2002). Cerebellar role in fear-conditioning consolidation. Proceedings of the National Academy of Sciences, USA, 99, 8406–8411. Sacchetti, B., Sacco, T., & Strata, P. (2007). Reversible inactivation of amygdala and cerebellum but not perirhinal cortex impairs reactivated fear memories. European Journal of Neuroscience, 25, 2875–2884. Sacchetti, B., Scelfo, B., Tempia, F., & Strata, P. (2004). Long-term synaptic changes induced in the cerebellar cortex by fear conditioning. Neuron, 42, 973–982. Sakurai, M. (1987). Synaptic modification of parallel fibre-Purkinje cell transmission in in vitro guinea-pig cerebellar slices. Journal of Physiology, 394, 463–480. Salvatierra, A. T., & Berry, S. 
D. (1989). Scopolamine disruption of septo-hippocampal activity and classical conditioning. Behavioral Neuroscience, 103, 715–721. Sanders, M. J., Wiltgen, B. J., & Fanselow, M. S. (2003). The place of the hippocampus in fear conditioning. European Journal of Pharmacology, 463(1–3), 217–223. Schmajuk, N. A., & DiCarlo, J. J. (1992). Stimulus configuration, classical conditioning, and hippocampal function. Psychological Review, 99, 268–305.
Sheafor, P. J., & Gormezano, I. (1972). Conditioning of the rabbit’s (Oryctolagus cuniculus) jaw movement response: US magnitude effects on URs, CRs, and pseudo-CRs. Journal of Comparative and Physiological Psychology, 81, 449–456. Sherrington, C. S. (1906). Integrated action of the nervous system. New Haven, CT: Yale University Press. Shi, C., & Davis, M. (1999). Pain pathways involved in fear conditioning measured with fear-potentiated startle: Lesion studies. Journal of Neuroscience, 19, 420–430. Simon, B., Knuckley, B., Churchwell, J., & Powell, D. A. (2005). Posttraining lesions of the medial prefrontal cortex interfere with subsequent performance of trace eyeblink conditioning. Journal of Neuroscience, 25, 10740–10746. Smith, M. C., DiLollo, V., & Gormezano, I. (1966). Conditioned jaw movement in the rabbit. Journal of Comparative and Physiological Psychology, 62, 479–483. Snider, R. S., & Maiti, A. (1976). Cerebellar contributions to the Papez circuit. Journal of Neuroscience Research, 2, 133–146. Solomon, P. R., & Moore, J. W. (1975). Latent inhibition and stimulus generalization of the classically conditioned nictitating membrane response in rabbits (Oryctolagus cuniculus) following dorsal hippocampal ablation. Journal of Comparative and Physiological Psychology, 89, 1192–203. Solomon, P. R., Solomon, S. D., Van der Schaaf, E. V., & Perry, H. E. (1983, April 15). Altered activity in the hippocampus is more detrimental to classical conditioning than removing the structure. Science, 220, 329–331. Solomon, P. R., Van der Schaaf, E. V., Thompson, R. F., & Weisz, D. J. (1986). Hippocampus and trace conditioning of the rabbit’s classically conditioned nictitating membrane response. Behavioral Neuroscience, 100, 729–744. Squire, L. R. (1992). Declarative and nondeclarative memory: Multiple brain systems supporting learning and memory. Journal of Cognitive Neuroscience, 4, 232–243. Stanton, M. E., Fox, G. D., & Carter, C. S. (1998). 
Ontogeny of the conditioned eyeblink response in rats: Acquisition or expression? Neuropharmacology, 37, 623–632. Stanton, M. E., Freeman, J. H., Jr., & Skelton, R. W. (1992). Eyeblink conditioning in the developing rat. Behavioral Neuroscience, 106, 657–665. Steinmetz, J. E. (1990). Neuronal activity in the cerebellar interpositus nucleus during classical NM conditioning with a pontine stimulation CS. Psychological Science, 1, 378–382.
Schmaltz, L. W., & Theios, J. (1972). Acquisition and extinction of a classically conditioned response in hippocampectomized rabbit (Oryctolagus cuniculus). Journal of Comparative and Physiological Psychology, 79, 328–333.
Steinmetz, J. E. (2000). Brain substrates of classical eyeblink conditioning: A highly localized but also distributed system. Behavioural Brain Research, 110, 13–24.
Schreurs, B. G., Oh, M. M., & Alkon, D. L. (1996). Pairing-specific long-term depression of Purkinje cell excitatory postsynaptic potentials results from a classical conditioning procedure in the rabbit cerebellar slice. Journal of Neurophysiology, 75, 1051–1060.
Steinmetz, J. E., Lavond, D. G., Ivkovich, D., Logan, C. G., & Thompson, R. F. (1992). Disruption of classical eyelid conditioning after cerebellar lesions: Damage to a memory trace system or a simple performance deficit? Journal of Neuroscience, 12, 4403–4426.
Seager, M. A., Borgnis, R. L., & Berry, S. D. (1997). Delayed acquisition of behavioral and hippocampal responses during jaw movement conditioning in aging rabbits. Neurobiology of Aging, 18, 631–639.
Steinmetz, J. E., Lavond, D. G., & Thompson, R. F. (1989). Classical conditioning in rabbits using pontine nucleus stimulation as a conditioned stimulus and inferior olive stimulation as an unconditioned stimulus. Synapse, 3, 225–233.
Sears, L. L., Logue, S. F., & Steinmetz, J. E. (1996). Involvement of the ventrolateral thalamus in rabbit classical eyeblink conditioning. Behavioural Brain Research, 74, 105–117. Sears, L. L., & Steinmetz, J. E. (1990). Acquisition of classically conditioned-related activity in the hippocampus is affected by lesions of the cerebellar interpositus nucleus. Behavioral Neuroscience, 104, 681–692. Sears, L. L., & Steinmetz, J. E. (1991). Dorsal accessory inferior olive activity diminishes during acquisition of the rabbit classically conditioned eyelid response. Brain Research, 545, 114–122.
Steinmetz, J. E., Logan, C. E., Rosen, D. J., Thompson, J. K., Lavond, D. G., & Thompson, R. F. (1987). Initial localization of the acoustic conditioned stimulus projection system to the cerebellum during classical eyelid conditioning. Proceedings of the National Academy of Sciences, USA, 84, 3531–3535. Steinmetz, J. E., Logue, S. F., & Miller, D. P. (1993). Using signaled barpressing tasks to study the neural substrates of appetitive and aversive learning in rats: Behavioral manipulations and cerebellar lesions. Behavioral Neuroscience, 107, 941–954.
Steinmetz, J. E., Logue, S. F., & Steinmetz, S. S. (1992). Rabbit classically conditioned eyelid responses do not reappear after interpositus nucleus lesion and extensive post-lesion training. Behavioural Brain Research, 51, 103–114. Steinmetz, J. E., Rosen, D. J., Chapman, P. F., Lavond, D. G., & Thompson, R. F. (1986). Classical conditioning of the rabbit eyelid response with a mossy fiber stimulation CS: Pt. I. Pontine nuclei and middle cerebellar peduncle stimulation. Behavioral Neuroscience, 100, 871–880. Steinmetz, J. E., Sears, L. L., Gabriel, M., Kubota, Y., & Poremba, A. (1991). Cerebellar interpositus nucleus lesions disrupt classical nictitating membrane conditioning but not discriminative avoidance learning in rabbits. Behavioral and Neural Biology, 57, 103–115. Steinmetz, J. E., & Sengelaub, D. R. (1992). Possible CS pathway for classical eyelid conditioning in rabbits: Pt. I. Anatomical evidence for direct projections from the pontine nuclei to the cerebellar interpositus nucleus. Behavioral and Neural Biology, 57, 103–115. Swanson, L. W., & Petrovich, G. D. (1998). What is the amygdala? Trends in Neurosciences, 21, 323–331. Thompson, R. F. (1976). The search for the engram. American Psychologist, 31, 209–227. Thompson, R. F. (1986, August 29). The neurobiology of learning and memory. Science, 233, 941–947. Thompson, R. F. (2005). In search of memory traces. Annual Review of Psychology, 56, 1–23. Thompson, R. F., & Spencer, W. A. (1966). Habituation: A model phenomenon for the study of neuronal substrates of behavior. Psychological Review, 73, 16–43. Tracy, J. A., Britton, G. B., & Steinmetz, J. E. (2001). Comparisons of single unit responses to tone, light and compound conditioned stimuli during rabbit classical eyeblink conditioning. Neurobiology of Learning and Memory, 76, 253–267. Tsien, J. Z., Huerta, P. T., & Tonegawa, S. (1996). The essential role of hippocampal CA1 NMDA receptor-dependent synaptic plasticity in spatial memory.
Cell, 87, 1327–1338. van der Karr, L. D., Piechowski, R. A., Rittenhouse, P. A., & Gray, T. S. (1991). Amygdaloid lesions: Differential effect on conditioned stress and immobilization-induced increases in corticosterone and renin secretion. Neuroendocrinology, 54, 89–95. Voneida, T. J., Christie, D., Bogdanski, R., & Chopko, B. (1990). Changes in instrumentally and classically conditioned limb flexion responses following inferior olivary lesions and olivocerebellar tractotomy in the cat. Journal of Neuroscience, 10, 3583–3593.
Weible, A. P., McEchron, M. D., & Disterhoft, J. F. (2000). Cortical involvement in acquisition and extinction of trace eyeblink conditioning. Behavioral Neuroscience, 114, 1058–1067. Weiskrantz, L. (1956). Behavioral changes associated with ablation of the amygdaloid complex in monkeys. Journal of Comparative and Physiological Psychology, 49, 381–391. Weiss, C., Weible, A. P., Galvez, R., & Disterhoft, J. F. (2006). Forebraincerebellar interactions during learning. Cell Science Reviews, 3, 200–230. Wilensky, A. E., Schafe, G. E., Kristensen, M. P., & LeDoux, J. E. (2006). Rethinking the fear circuit: The central nucleus of the amygdala is required for the acquisition, consolidation, and expression of Pavlovian fear conditioning. Journal of Neuroscience, 26, 12387–12396. Wise, R. A. (2004). Dopamine, learning and motivation. Nature Reviews Neuroscience, 5, 1–12. Woodruff-Pak, D. S., Lavond, D. G., Logan, C. G., Steinmetz, J. E., & Thompson, R. F. (1993). Cerebellar cortical lesions and reacquisition in classical conditioning of the nictitating membrane response in rabbits. Brain Research, 608, 67–77. Woodruff-Pak, D. S., Lavond, D. G., Logan, C. G., & Thompson, R. F. (1987). Classical conditioning in 3-, 30-, and 45-month old rabbits: Behavioral learning and hippocampal unit activity. Neurobiology of Aging, 8, 101–108. Woodruff-Pak, D. S., & Steinmetz, J. E. (Eds.). (2000). Eyeblink classical conditioning: Vol. 1. Human applications. Boston: Kluwer Press. Woody, C. D. (1986). Understanding the cellular basis of memory and learning. Annual Review of Psychology, 37, 433–493. Woody, C. D., & Black-Cleworth, P. (1973). Differences in excitability of cortical neurons as a function of motor projection in conditioned cats. Journal of Neurophysiology, 36, 1004–1116. Woody, C. D., & Engel, J., Jr. (1972). Changes in unit activity and thresholds to electrical microstimulation at coronal-precurciate cortex of cat with classical conditioning of different facial movements. 
Journal of Neurophysiology, 31, 851–864. Yeo, C. H. (2004). Memory and the cerebellum. Current Neurology and Neuroscience Reports, 4, 87–89. Yeo, C. H., & Hardiman, M. J. (1992). Cerebellar cortex and eyeblink conditioning: A reexamination. Experimental Brain Research, 88, 623–638. Yeo, C. H., Hardiman, M. J., & Glickstein, M. (1985). Classical conditioning of the nictitating membrane response of the rabbit: Pt. II. Lesions of the cerebellar cortex. Experimental Brain Research, 63, 81–92.
Wagner, A. R., Logan, F. A., Haberlandt, K., & Price, T. (1968). Stimulus selection in animal discrimination learning. Journal of Experimental Psychology, 76, 171–180.
Yeo, C. H., Hardiman, M. J., & Glickstein, M. (1986). Classical conditioning of nictitating membrane response of the rabbit: Pt. IV. Lesions of the inferior olive. Experimental Brain Research, 63, 81–92.
Wallace, K. J., & Rosen, J. B. (2001). Neurotoxic lesions of the lateral nucleus of the amygdala decrease conditioned fear but not unconditioned fear of a predator odor: Comparison with electrolytic lesions. Journal of Neuroscience, 21, 3619–3627.
Zhu, L., Scelfo, B., Hartell, N. A., Strata, P., & Sacchetti, B. (2007). The effects of fear conditioning on cerebellar LTP & LTD. European Journal of Neuroscience, 26, 219–227.
Chapter 27
Synaptic and Cellular Basis of Learning
CRAIG H. BAILEY AND ERIC R. KANDEL
Studies of a variety of memory systems, ranging in complexity from elementary forms of implicit memory in invertebrates and mammals to more complex forms of hippocampal-based explicit memory, suggest that the storage of long-term memory is associated with altered gene expression, the synthesis of new proteins, and the growth of new synaptic connections (Kandel, 2001). For both forms of memory storage, the synaptic growth is thought to represent a final cellular change that stabilizes the long-term process (Bailey, Bartsch, & Kandel, 1996; Bailey & Kandel, 1993; Bailey, Kandel, & Si, 2004). Despite the association of synaptic growth with various forms of long-term memory, surprisingly little is known about the cell biological mechanisms that regulate and couple the structural changes to the molecular changes and the relative functional contribution each may make to the initiation of the long-term process on the one hand and its stable maintenance on the other (Bailey & Kandel, 1993; Bliss, Collingridge, & Morris, 2003; Kandel, 2001). This in turn raises two questions central to an understanding of the molecular biology of memory storage: (1) Do the enduring alterations in synaptic strength that characterize long-term memory result from a structural change in preexisting connections, for example, from the conversion of nonfunctional (silent) synapses to functional synapses, from the addition of newly formed functional synapses, or from perhaps both? (2) Is the maintenance of long-term memory achieved, at least in part, because of the relative stability of synaptic structure? If so, what are the mechanisms that can survive molecular turnover and thereby serve to stabilize learning-induced changes in synapse number and structure? We address these questions by focusing on recent molecular and structural studies of long-term memory in Aplysia. 
We begin by examining the structural remodeling and growth of identified sensory neuron synapses that accompany long-term sensitization—an elementary form of implicit memory. We then turn to in vitro studies
of the sensory-to-motor neuron synapse reconstituted in dissociated cell culture that have provided some of the first molecular insights into both the signaling pathways and mechanisms that underlie the initiation of these structural changes and their functional contribution to the different temporal phases of long-term facilitation, as well as the role of local protein synthesis and activation of translational regulators in the stabilization of learning-related synaptic growth for the persistence of memory storage. Finally, we consider how the molecules and mechanisms that regulate alterations in the structure of the synapse that are induced by learning in Aplysia may relate to those that govern de novo synapse formation during development.
MEMORY’S TWO MAJOR FORMS
Modern studies in cognitive psychology have demonstrated that learning and memory are not unitary faculties of mind but consist of distinct mental processes (for review, see Squire & Zola-Morgan, 1991). In the most general sense, learning can be considered as the process by which new information is acquired, and memory can be considered as the process by which that knowledge is retained. Memory can be divided into at least two general categories, each with its own rules. Explicit or declarative memory is the conscious recall of knowledge about people, places, and things, and is particularly well developed in the vertebrate brain (see also Chapter 28). The second category, implicit or nondeclarative memory, relates to motor and perceptual skills as well as other tasks and is expressed through performance, without conscious recall of past experience. Implicit memory includes simple associative forms of memory such as classical and operant conditioning, and nonassociative forms such as sensitization and habituation. Explicit and implicit memory have been localized to different neural systems within the brain (Milner, 1985; Polster, Nadel, & Schacter, 1991; Squire, 1992).
As first shown by Brenda Milner in her neuropsychological studies of the patient H.M., the establishment of explicit memory is critically
Research in this review was supported in part by National Institutes of Health grant MH37134 (to C.H.B.), the Howard Hughes Medical Institute (to E.R.K.), and the Kavli Institute for Brain Sciences.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
dependent on structures in the medial temporal lobe of the cerebral cortex, including the hippocampal formation. Implicit memory is a family of different processes that are represented in a number of brain systems including the cerebellum, striatum, amygdala, and, in the simplest cases, the sensory and motor pathways recruited during the learning process for particular perceptual or motor skills. As a result, implicit memory can be studied in a variety of simple reflex systems, including those of higher invertebrates, whereas explicit memory is best studied in mammals. Experimental results from clinical studies in humans as well as a variety of studies on different animal systems suggest that each form of memory has distinct stages: a short-term form that lasts seconds, minutes, or hours and a long-term form that can persist for days, weeks, and even a lifetime. Early in the twentieth century, studies of human memory described a consolidation period in the transition from short-term to long-term memory. During this consolidation period, memory storage is labile and highly sensitive to disruption. Recent molecular studies of implicit and explicit forms of learning suggest that this transition corresponds to a central program of altered gene expression. This molecular cascade converts a transient short-term process, which involves the covalent modification of preexisting proteins and a change in the effectiveness of preexisting synapses, into a stable, self-maintained long-term process that is accompanied by the structural remodeling of preexisting synapses and the growth of new synaptic connections. Two experimental model systems have been extensively studied as representative examples of these two forms of memory storage: long-term sensitization in the marine snail Aplysia californica as an example of implicit memory and hippocampal long-term potentiation (LTP) as an example of long-lasting synaptic plasticity thought to contribute to explicit memory storage in mammals.
An enduring increase in the strength of synaptic connections can be induced by high-frequency stimulation of specific afferent pathways. The phenomenon of LTP is currently thought to be a cellular correlate or, at least, a requirement for certain types of explicit memory formation in the mammalian hippocampus. In this chapter, we focus primarily on the cellular, molecular, and structural mechanisms that underlie long-term memory in Aplysia but refer to recent studies of LTP and spatial memory formation in mammals as points of comparison to consider similarities and differences in implicit and explicit memory storage.
LONG-TERM SYNAPTIC PLASTICITY
The central nervous system of the marine snail Aplysia californica has proven useful as a model system for studying
the cellular and molecular bases of learning and memory. It contains only approximately 20,000 large, identifiable nerve cells, clustered into 10 major ganglia. The ability to identify individual neurons and record their activity has made it possible to define the major components of the neuronal circuits of specific behaviors and to delineate the critical synaptic sites and underlying mechanisms used to store memory-related representations. The molecular mechanisms contributing to implicit memory storage have been most extensively studied for the gill-withdrawal reflex of Aplysia (Kandel, 2001). As is true for other types of defensive reflexes, the gill-withdrawal reflex can be modified by several different forms of implicit learning. We focus here on sensitization, an elementary nonassociative form of learned fear by which an animal learns about the properties of a single noxious stimulus (Figure 27.1A). The animal learns to strengthen its defensive reflexes and to respond vigorously to a variety of previously neutral stimuli after it has been exposed to a potentially threatening stimulus. In Aplysia, sensitization of the gill-withdrawal reflex can be induced by a strong stimulus applied to the tail. This activates facilitatory interneurons that synapse on identified sensory neurons and strengthen the synaptic connection between the sensory neurons and their target motor neurons (Figure 27.1B). As is the case for other defensive withdrawal reflexes, the behavioral memory for sensitization of the gill-withdrawal reflex is graded and retention is proportional to the number of training trials. A single stimulus to the tail gives rise to short-term sensitization lasting minutes to hours. Repetition of this stimulus produces long-term behavioral sensitization that can last for days or weeks (Frost, Castellucci, Hawkins, & Kandel, 1985; Figure 27.2).
Short- and long-term sensitization lead to enhanced synaptic transmission at the monosynaptic connection between identified mechanoreceptor sensory neurons and motor neurons. Although this component accounts for only a part of the behavioral modification measured in the intact animal, its simplicity has facilitated the cellular and molecular analysis of both the short- and long-term forms of sensitization. The monosynaptic sensory-to-motor neuron connection, which is thought to be glutamatergic (Conrad, Wu, & Schacher, 1999; Dale & Kandel, 1993; Trudeau & Castellucci, 1993), can be reconstituted in dissociated cell culture in which serotonin (5-hydroxytryptamine [5-HT]), a modulatory neurotransmitter normally released by sensitizing stimuli, can substitute for the tail shock used during behavioral training in the intact animal (Montarolo et al., 1986). In parallel to behavioral sensitization, a single application of 5-HT produces short-term changes in synaptic effectiveness, whereas five spaced applications given over a period of 1.5 hours produce long-term changes lasting
Figure 27.1 Sensitization of the gill-withdrawal reflex in Aplysia.
[Figure not reproduced. Panel A: standard gill-withdrawal reflex and enhanced gill withdrawal after sensitization; labels: mantle shelf, gill, siphon, tactile stimulus, tail shock. Panel B: circuit diagram; labels: siphon, tail, sensory neurons (24), modulatory interneurons (5-HT), excitatory (EX) and inhibitory (IN) interneurons, motor neurons (6), gill.]
Note. A: Dorsal view of Aplysia showing the gill, mantle shelf, and siphon. A light touch to the siphon causes the siphon to contract and the gill to withdraw under the protection of the mantle shelf (shown here retracted for a clearer view). Sensitization of the gill-withdrawal reflex is produced by applying a noxious stimulus to another part of the body (such as the tail) and leads to an enhancement of the withdrawal reflex of both the siphon and gill. B: Neural circuit of the gill-withdrawal reflex. The siphon is innervated by 24 sensory neurons that connect directly with the 6 motor neurons. The sensory neurons also connect to populations of excitatory and inhibitory interneurons that in turn connect with the motor neurons. Stimulating the tail activates modulatory interneurons that act on the terminals of the sensory neurons as well as on those of the excitatory interneurons. Three classes of modulatory neurons are activated by tail stimulation. The most important modulatory action is mediated by serotonin (5-HT). Blocking the action of these serotonergic cells blocks the effects of sensitizing stimuli. From “The Molecular Biology of Memory Storage: A Dialogue between Genes and Synapses,” by E. R. Kandel, 2001, Science, 294, p. 1031. Adapted with permission.
Figure 27.2 The behavioral memory for long-term sensitization of the gill-withdrawal reflex in Aplysia is graded and retention is proportional to the number of training trials.
[Figure not reproduced. Vertical axis: duration of withdrawal (percentage of control), with ticks at 0, 100, 500, and 1000. Horizontal axis: days after training, with ticks at 0, 1, 4, and 7. Curve labels: 1 tail shock; 4 shocks; 4 x 4 shocks a day for 4 days; inhibitor of protein synthesis.]
Note. A summary of the effects of long-term sensitization training on the duration of gill and siphon withdrawal in Aplysia. The retention of the memory for sensitization is a graded function proportional to the number of training trials. Before sensitization, a weak touch to the siphon causes only a brief siphon- and gill-withdrawal reflex. Following a single noxious, sensitizing shock to the tail, that same weak touch elicits a much larger response that lasts about 1 hour. More tail shocks increase the size and duration of the response. Application of protein synthesis inhibitors blocks the long-term but not the short-term memory for sensitization. From “Monosynaptic Connections Made by the Sensory Neurons of the Gill- and Siphon-Withdrawal Reflex in Aplysia Participate in the Storage of Long-Term Memory for Sensitization,” by W. N. Frost, V. F. Castellucci, R. D. Hawkins, and E. R. Kandel, 1985, Proceedings of the National Academy of Sciences, United States, 82, p. 8267. Adapted with permission.
1 or more days. These findings of an elementary, cellular representation of the short- and long-term memory for sensitization have allowed us to address directly the following question: What are the molecular substrates and regulatory mechanisms that underlie memory storage?
Biophysical and biochemical studies of the connections between sensory and motor neurons in both the intact animal and cells in culture indicate that the short-term and long-term changes share aspects of a common molecular mechanism. Both processes are initiated by 5-HT, and a
component of the increase in synaptic strength observed during both the short- and long-term is due to enhanced transmitter release by the sensory neuron. This presynaptic increase in transmitter release is due, in part, to the spike broadening that results from the modulation by 5-HT of specific sets of potassium channels (Dale, Kandel, & Schacher, 1987; Frost et al., 1985; Klein & Kandel, 1980; Montarolo et al., 1986; Scholz & Byrne, 1987). Despite these several similarities, the short-term cellular changes differ from the long-term modifications in two important ways. First, the short-term change involves only covalent modification of preexisting proteins and an alteration of preexisting connections. Neither short-term behavioral sensitization in the animal nor short-term facilitation in dissociated cell culture requires ongoing macromolecular synthesis (Montarolo et al., 1986; Schwartz, Castellucci, & Kandel, 1971). By contrast, inhibitors of transcription or translation block the induction of the long-term changes in both the semi-intact preparation (Castellucci, Blumenfeld, Goelet, & Kandel, 1989) and primary cell culture (Montarolo et al., 1986). Most striking is the finding that the induction of long-term facilitation at this single synapse in Aplysia exhibits a requirement for protein and RNA synthesis during a critical time window or consolidation period. A variety of forms of memory in both vertebrates and invertebrates share this requirement for macromolecular synthesis during the consolidation period. From a molecular perspective, these studies indicate that the long-term behavioral and cellular changes require the expression of genes and proteins not required for short-term processes. The identification of the gene products required for this consolidation remains a major goal of molecular research into memory processes.
Second, the finding in Aplysia that long-term sensitization training is associated with the growth of new synaptic connections between the sensory neurons and their follower cells demonstrated that the long-term but not the short-term process involves a structural change (Bailey & Chen, 1983, 1988a; Bailey & Kandel, 1993).
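The contrast between the two phases described above can be summarized as a small decision rule. The following Python sketch is purely illustrative and is not a biophysical model: the function name `expected_facilitation`, its parameters, and the string labels are hypothetical. It encodes only the two culture protocols described in the text (one 5-HT application, or five spaced applications) together with the reported effect of transcription or translation inhibitors. Other numbers of applications are not described here, so the sketch declines to classify them.

```python
def expected_facilitation(n_applications, synthesis_blocked=False):
    """Return the facilitation phases expected for a 5-HT protocol.

    Toy encoding of the qualitative findings summarized in the text.
    All names and labels here are illustrative assumptions.
    """
    if n_applications == 1:
        # A single application yields only the short-term change, which
        # relies on covalent modification of preexisting proteins and so
        # is unaffected by inhibitors of macromolecular synthesis.
        return {"short-term"}
    if n_applications == 5:
        # Five spaced applications add the long-term phase, unless
        # transcription or translation is blocked during training.
        if synthesis_blocked:
            return {"short-term"}
        return {"short-term", "long-term"}
    # Assumption: protocols not described in the text are left unclassified.
    raise ValueError("protocol not described in the text")


print(expected_facilitation(1))
print(expected_facilitation(5, synthesis_blocked=True))
```

Raising an error for undescribed protocols, rather than guessing a threshold between one and five applications, keeps the sketch limited to what the cited experiments report.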
LEARNING-INDUCED GROWTH OF NEW SENSORY NEURON SYNAPSES
In the early 1980s, studies in Aplysia first began to explore the morphological basis of the synaptic plasticity that might underlie the transition from short-term to long-term memory. By combining selective intracellular labeling techniques with the analysis of serial thin sections and transmission electron microscopy, complete reconstructions of unequivocally identified sensory neuron synapses were quantitatively analyzed from both control and behaviorally
modified animals. The storage of long-term memory for sensitization (lasting several weeks) was accompanied by a family of distinct structural changes at identified sensory neuron synapses. These changes reflected a learning-induced remodeling of the functional architecture of presynaptic sensory neuron varicosities at two different levels of synaptic organization: (1) alterations in focal regions of membrane specialization of the synapse that mediate transmitter release—the number, size, and vesicle complement of sensory neuron active zones were larger in sensitized animals than in controls (Bailey & Chen, 1983, 1988b) and (2) a growth process that appeared similar to synaptogenesis during development and led to a pronounced increase in the total number of presynaptic varicosities per sensory neuron (Bailey & Chen, 1988a). Thus, sensory neurons from long-term sensitized animals exhibited a twofold increase in the number of synaptic varicosities, as well as an enlargement in the linear extent of each neuron’s axonal arbor when compared to sensory neurons from untrained animals (Figure 27.3). To determine which class of structural changes at sensory neuron synapses might contribute to the retention of long-term sensitization, Bailey and Chen (1989) compared the time course for each morphological change with the behavioral duration of the memory. They found that not all of the structural changes persisted as long as the memory. The increase in the size and synaptic vesicle complement of sensory neuron active zones present 24 hours following the completion of behavioral training returned to control levels when tested 1 week later. These data indicated that, insofar as this relatively transient modulation of active zone size and associated synaptic vesicles is one of the structural mechanisms underlying long-term sensitization, it is associated with the initiation and early expression of the long-term process and not with its persistence. 
By contrast, the duration of changes in varicosity and active zone number, which persisted unchanged for at least 1 week and were partially reversed at the end of 3 weeks, paralleled the behavioral time course of memory storage indicating that only the learning-induced increase in the number of sensory neuron synapses contributes to the stable maintenance of long-term sensitization. These results directly linked a change in synaptic structure to a long-lasting behavioral memory and suggested that the morphological alterations could represent an anatomical substrate for memory consolidation. In addition, the finding that some components of the learning-induced changes in synaptic architecture were transient whereas others endured suggested that not all of these modifications were regulated synchronously. At the structural level, the sensory neuron appears to have multiple mechanisms and parameters of plasticity available to it. Thus, during the later phases of long-term memory
8/18/09 6:25:11 PM
532
Synaptic and Cellular Basis of Learning Pericardial N. Branchial N. Gential N. Siphon N. 1 2 3
Control
100 µm
Sensitized
1
2
3
Figure 27.3 Growth of sensory neurons induced by long-term sensitization in Aplysia. Note. Serial reconstruction of identified sensory neurons labeled with horseradish peroxidase (HRP; Bailey, Thompson, Castellucci, & Kandel, 1979) from long-term-sensitized and control animals. Total extent of the axonal arbors of sensory neurons from one control (untrained) and two long-termsensitized animals are shown. In each case, the rostral (row 3) to caudal (row 1) extent of the arbor is divided roughly into thirds. Each panel was produced by the super-imposition of camera lucida tracings of all HRP-labeled processes present in 17 consecutive slab-thick Epon sections and represents a linear segment through the ganglion of roughly 340 m.
storage for sensitization, although there are more synapses, each individual synapse may recruit all of the mechanisms of plasticity that were present before training. Unlike the extensive anatomical changes observed at sensory neuron synapses following long-term training, the structural correlates of short-term memory in Aplysia (lasting minutes to hours rather than days to weeks) are far less pronounced (Bailey & Chen, 1988c). For example, the decrease in the strength of the sensory to motor neuron connection that accompanies short-term habituation is not associated with any detectable alterations in either the number of sensory neuron presynaptic varicosities or the number of active zones within the presynaptic varicosities. Nor does it alter the size of active zones or the total number of synaptic vesicles within the presynaptic varicosity.
For each composite, ventral is up, dorsal is down, lateral is to the left, and medial is to the right. By examining images across each row (rows 1, 2, and 3), the viewer is comparing similar regions of each sensory neuron. In all cases, the axonal arbor of long-term-sensitized cells is markedly increased compared to cells from control (untrained) animals and parallels the concomitant learning-induced increase in the number of sensory neuron presynaptic varicosities. From “Long-Term Memory in Aplysia Modulates the Total Number of Varicosities of Single Identified Sensory Neurons,” by C. H. Bailey and M. Chen, 1988a, Proceedings of the National Academy of Sciences United States, 85, p. 2375. Reprinted with permission.
Rather, there is a reduction in the number of vesicles that are docked at the active zones and thus there are fewer packets of transmitter ready to be released. Taken together, these initial morphological studies of short- and long-term memory in Aplysia began to suggest a clear difference in the nature, extent, and time course of changes in the functional architecture of the synapse that may underlie memories of differing durations. The transient durations of short-term memories involving covalent modifications of preexisting proteins (proteins that turn over slowly) are accompanied only by modest structural rearrangements that appear to be restricted to shifts in the proximity of synaptic vesicle populations contiguous to the release site. By contrast, the prolonged durations of long-term memories depend on altered gene expression and the
synthesis of new proteins and are associated with more substantial and potentially more enduring structural alterations that are reflected by frank changes in both the number of synaptic contacts and their active zone morphology. These studies also demonstrated, for the first time, that learning-induced structural changes could be detected at the level of specific, identified synaptic connections known to be critically involved in the behavioral modification and provided evidence for an intriguing notion—that active zones are plastic rather than immutable components of the synapse. Even elementary forms of learning can remodel the basic anatomical scaffolding of the neuron, in this case by altering the number and organization of transmitter release sites in the presynaptic compartment, to modulate the functional expression of synaptic connections. Long-term sensitization also induces a parallel set of anatomical changes in the postsynaptic motor neuron L7. For example, long-term training increases the number of postsynaptic spines in contact with the sensory neuron presynaptic varicosities (Bailey & Chen, 1988b). Whereas these learning-related structural changes are considerably regulated and involve the remodeling and growth of both the pre- and postsynaptic compartment, we limit ourselves in this review to the presynaptic changes. Complete serial reconstructions of identified sensory neuron varicosities in untrained (naive) animals revealed that approximately 60% of these presynaptic terminals lacked a structurally detectable active zone suggesting the possibility of nascent synapses in the adult brain. The extent to which learning and memory can convert these immature, and presynaptically silent synapses into mature and functionally competent synaptic connections is discussed next. 
Finally, these initial studies in Aplysia suggested that the growth of new sensory neuron synapses may represent the final and perhaps most stable phase of long-term memory storage, and raised the possibility that the stability of the long-term process might be achieved, at least in part, because of the relative stability of synaptic structure. The long-lasting growth of new synaptic connections between sensory neurons and their follower cells during long-term sensitization can be reconstituted in sensory-motor neuron co-cultures by five repeated applications of 5-HT (Bailey, Montarolo, Chen, Kandel, & Schacher, 1992; Glanzman, Kandel, & Schacher, 1990) as well as induced in the intact ganglion by the intracellular injection of cAMP, a second messenger activated by 5-HT (Nazif, Byrne, & Cleary, 1991). In culture, the synaptic growth can be correlated with the long-term (24 to 72 hr) enhancement in synaptic effectiveness and depends on the presence of an appropriate target cell, similar to the synapse formation that occurs during development.
FUNCTIONAL CONTRIBUTION OF PRESYNAPTIC STRUCTURAL CHANGES TO LONG-TERM FACILITATION
In most model learning systems, the functional contribution of the structural changes that accompany long-lasting forms of synaptic plasticity remains largely unknown. We would like to know if changes in the number or structure of synaptic connections induced by learning are functionally effective and capable of contributing to the storage of long-term memory. Both technical and experimental limitations prevented the earlier behavioral studies in Aplysia discussed in the previous section from examining whether the increase in synaptic strength during long-term sensitization resulted from the conversion of preexisting but nonfunctional (silent) synapses to active synapses, or from the addition of newly formed functional synapses, or perhaps both. To address these issues directly, in vitro studies of the sensory-to-motor neuron synapse in Aplysia culture have monitored both functional and structural changes simultaneously so as to follow remodeling and growth at the same specific synaptic varicosities continuously over time and to examine the functional contribution of these presynaptic structural changes to the different time-dependent phases of long-term facilitation. Kim et al. (2003) combined time-lapse confocal imaging of individual presynaptic varicosities of sensory neurons labeled with three different fluorescent markers: the whole cell marker Alexa-594, and two presynaptic marker proteins: synaptophysin-eGFP that monitors changes in the distribution of synaptic vesicles within individual varicosities and synapto-PHluorin (synPH), a monitor of active transmitter release sites (Miesenbock, De Angelis, & Rothman, 1998).
They found that repeated pulses of 5-HT induce two temporally, morphologically, and molecularly distinct classes of presynaptic changes: (1) the rapid activation of silent presynaptic terminals through the filling of preexisting empty varicosities with synaptic vesicles, which requires translation but not transcription; and (2) the generation of new synaptic varicosities that occurs more slowly and requires both transcription and translation. The enrichment of preexisting but empty varicosities with synaptophysin is completed within 3 to 6 hours, parallels intermediate-term facilitation, and accounts for approximately 32% of the newly activated synapses evident at 24 hours. By contrast, the new sensory neuron varicosities, which account for 68% of the newly activated synapses at 24 hours, do not form until 12 to 18 hours after exposure to five pulses of 5-HT. The rapid activation of silent presynaptic terminals suggests that in addition to its role in long-term facilitation, this modification of preexisting synapses may also contribute to the intermediate phase of synaptic
[Figure 27.4 image: timeline at 0 hr, 3–6 hr (intermediate-term facilitation), 12–18 hr, and 24 hr (long-term facilitation) after 5× 5-HT, showing empty presynaptic terminals, activation of silent presynaptic terminals, and formation of new presynaptic terminals; asterisks mark newly activated presynaptic terminals. MN = motor neuron; SN = sensory neuron.]
Figure 27.4 Time course and functional contribution of two distinct presynaptic structural changes associated with intermediate- and long-term facilitation in Aplysia. Note. Repeated pulses of 5-HT in sensory to motor neuron co-cultures trigger two distinct classes of presynaptic structural changes: (1) the rapid clustering of synaptic vesicles to preexisting silent sensory neuron varicosities (3 to 6 hr), and (2) the slower generation of new sensory neuron synaptic varicosities (12 to 18 hr). The resultant newly filled and newly formed varicosities are functionally competent (capable of evoked
plasticity and memory storage (Ghirardi, Montarolo, & Kandel, 1995; Mauelshagen, Parker, & Carew, 1996; Sutton, Masters, Bagnall, & Carew, 2001; Figure 27.4). In this study, Kim and colleagues (2003) employed a reduced 5-HT protocol to selectively induce facilitation in the intermediate-term time domain without inducing long-term facilitation (Ghirardi et al., 1995). They found that isolated intermediate-term facilitation was also accompanied by the redistribution and clustering of synaptic vesicle proteins into empty sensory neuron varicosities at 0.5 hour and 3 hours, similar to what occurred when intermediate- and long-term facilitation were recruited together. However, the presynaptic structural changes induced by the reduced 5-HT protocol differed from those induced by long-term training in at least two ways. First, there was no growth of new sensory neuron varicosities in the isolated intermediate phase. Second, unlike the filling of preexisting empty varicosities during the intermediate-term phase induced by the long-term protocol, the newly filled varicosities did not persist for 24 hours and were unaffected by inhibitors of protein synthesis, suggesting that the structural remodeling induced by the reduced 5-HT protocol involved only a simple rearrangement of preexisting synaptic components. This may reflect a fundamental difference in the molecular mechanisms recruited by the two 5-HT protocols. Although both protocols induce intermediate-term facilitation, the long-term protocol may activate
transmitter release) and contribute to the synaptic enhancement that underlies LTF. The rapid filling and activation of silent presynaptic terminals at 3 hours suggests that, in addition to its role in LTF, this modification of preexisting varicosities may also contribute to the intermediate phase of synaptic plasticity. Triangles represent functionally competent transmitter release sites (active zones). From “Presynaptic Activation of Silent Synapses and Growth of New Synapses Contribute to Intermediate and Long-Term Facilitation in Aplysia,” by J.-H. Kim et al., 2003, Neuron, 40, p. 162. Adapted with permission.
additional molecular events (including the machinery for translational activation) required to set up the long-term phase, perhaps by stabilizing the intermediate phase. At present, it is not known how the covalent modifications that lead to the rearrangement of preexisting synaptic proteins at empty varicosities are converted by the long-term protocol to a more stable, protein-synthesis-dependent process. The activation of silent synapses also seems to play a major role in long-term potentiation (LTP), a more complex form of explicit memory storage in the hippocampus of mammals, although in mammals the term silent synapse refers to a very specific molecular configuration found at synapses in different regions of the vertebrate CNS (Malinow, Mainen, & Hayashi, 2000; Malinow & Malenka, 2002). In this case, the term refers to excitatory glutamatergic synapses whose postsynaptic membrane contains NMDARs but no AMPARs. Found on the surface of the postsynaptic neuron, these receptors bind and are activated by the amino acid glutamate. There are two basic classes of glutamate receptors: ionotropic and metabotropic. Ionotropic receptors include AMPA receptors (AMPAR) that mediate fast synaptic transmission at excitatory synapses and NMDA receptors (NMDAR) that are permeable to calcium and regulate synaptic plasticity. Unlike ionotropic receptors, metabotropic glutamate receptors (mGluRs) are not directly linked to ion channels but can affect them through an indirect process involving the activation of biochemical cascades.
The lack of AMPAR-mediated signaling renders these synapses inactive, or “silent,” under normal conditions. Synaptic stimulation activates these silent synapses through the insertion of AMPARs into the postsynaptic membrane, a phenomenon sometimes referred to as AMPAfication. Calcium/calmodulin-dependent protein kinase II (CaMKII) plays a critical role in this process. Once this kinase is activated by high-frequency stimulation, it phosphorylates AMPARs or associated proteins, triggering their insertion into the postsynaptic membrane. The synapse is then no longer silent and postsynaptic responses are, by consequence, enhanced.
5-HT-INDUCED REGULATION OF THE PRESYNAPTIC ACTIN NETWORK
One clue to the underlying molecular mechanisms responsible for these two discrete learning-related presynaptic structural changes comes from a study by Ahmari, Buchanan, and Smith (2000) who demonstrated that fluorescent puncta labeled by the synaptic vesicle marker VAMP-GFP are transported only at those synapses defined by the activity-dependent marker FM4-64. Moreover, these puncta contained not only synaptic vesicles but also other molecular components of the presynaptic active zone. Thus, the 5-HT-induced clustering of synaptic vesicle proteins to sensory neuron varicosities might represent a recruitment of not only synaptic vesicles but also the molecular precursors for active zone assembly. This redistribution of synaptic vesicle proteins and active zone components in both preexisting and newly formed sensory neuron synapses is also likely to involve cytoskeleton rearrangement (Benfenati, Onofri, & Giovedi, 1999; Matus, 2000). For example, structural remodeling of synapses in response to physiological activity requires reorganization of the actin network (Colicos, Collins, Sailor, & Goda, 2001; Huntley, Benson, & Colman, 2002) and the inhibition of actin function blocks synapse formation and interferes with long-term synaptic plasticity (Hatada, Wu, Sun, Schacher, & Goldberg, 2000; Krucker, Siggins, & Halpain, 2000; Zhang & Benson, 2001). Furthermore, several synaptic proteins such as synapsin can bind to the actin cytoskeleton and participate in synaptic vesicle trafficking (Humeau et al., 2001). How does an extracellular signal such as 5-HT lead to a reorganization of the actin cytoskeleton? The balance between actin polymerization and depolymerization is tightly regulated by extracellular signaling molecules, many of which act through the Rho family of GTPases (Hall, 1998).
These small GTPases are thought to participate at different stages during the development of the central nervous system, for example, in the establishment of
polarity, axon guidance, dendritic growth, and maintenance of dendritic spines (Bradke & Dotti, 1999; Nakayama, Harms, & Luo, 2000; Sin, Haas, Ruthazer, & Cline, 2002; Threadgill, Bobb, & Ghosh, 1997; Yuan et al., 2003). Their participation, in turn, can be regulated by neuronal activity in vivo (Li, Aizenman, & Cline, 2002). In Aplysia, Udo et al. (2005) found that the application of toxin B, a general inhibitor of the Rho family, blocks 5-HT-induced long-term facilitation, as well as growth of new synapses in sensory-motor neuron co-cultures. Moreover, repeated pulses of 5-HT selectively induce the spatial and temporal regulation of the activity of only one of the small GTPases, Cdc42, at a subset of sensory neuron presynaptic varicosities. The activation of ApCdc42 induced by 5-HT is dependent on both the PI3K and PLC pathways and, in turn, recruits the downstream effectors PAK (p21-Cdc42/Rac-activated kinase) and N-WASP (neuronal Wiskott-Aldrich syndrome protein) to regulate the presynaptic actin network. This initial molecular cascade leads to the outgrowth of filopodia, some of which represent morphological precursors for the growth of new sensory neuron varicosities associated with the storage of long-term facilitation.
Initiation of Long-Term Facilitation
As mentioned, the inhibition of transcription or translation does not affect short-term memory, but blocks the formation of long-term memory in a variety of model learning systems, suggesting that the stabilization of memory traces depends on de novo gene expression (Kandel, 2001). In Aplysia, 5-HT, released in vivo during sensitization or applied directly to cultured sensory neurons, regulates transmitter release. 5-HT binds to cell surface receptors on the sensory neurons that activate the enzyme adenylyl cyclase that converts ATP to the diffusible second messenger cAMP, thereby activating the cAMP-dependent protein kinase (PKA).
PKA is a tetramer containing two catalytic subunits and two regulatory subunits. Binding of the second messenger cAMP to the regulatory subunit frees the active catalytic subunit of PKA that can then add phosphate groups to serine and threonine residues on target proteins, thereby altering their activity. Studies in Aplysia first revealed the participation of the cAMP/PKA signaling pathway in behavioral sensitization and synaptic facilitation (Brunelli, Castellucci, & Kandel, 1976). PKA plays a central role in both short- and long-term facilitation: cAMP can evoke both short- and long-term facilitation, and inhibitors of PKA block both forms of facilitation. Insights into how PKA participates in both the short- and long-term process were provided by experiments in which fluorescently tagged PKA subunits were injected into sensory cells in
culture. Using this technique to measure the amount of free PKA catalytic subunit, Bacskai et al. (1993) found that a single pulse of 5-HT, which produces short-term facilitation, increased the amount of active catalytic subunit in the presynaptic terminal of the sensory neurons. In the presynaptic terminals of the sensory cells, PKA phosphorylates target proteins such as ion channels, leading to a transient enhancement of transmitter release. By contrast, during long-term facilitation induced by repeated applications of 5-HT, the free catalytic subunit of PKA translocates to the cell body of the sensory neurons and enters the nucleus, where it phosphorylates transcription factors and thereby regulates gene expression. Both cAMP and PKA are essential components of the signal-transduction pathway for consolidating memories, not only in Aplysia but also for certain types of memory in Drosophila and mammals. Several olfactory learning mutants in Drosophila map to the cAMP pathway (Davis, 1996; Davis, Cherry, Dauwalder, Han, & Skoulakis, 1995; Drain, Folkers, & Quinn, 1991), indicating that blocking PKA function blocks memory formation in flies. In parallel, the late but not the early phase of LTP of the CA3-to-CA1 synapse in the hippocampus is impaired by pharmacological or genetic interference with PKA (Abel et al., 1997; Frey, Huang, & Kandel, 1993; Y.-Y. Huang, Li, & Kandel, 1994). However, the role of PKA seems to be different in hippocampal neurons than during LTF formation in Aplysia sensory neurons. In the hippocampus, PKA does not translocate to the nucleus and plays only a synaptic role: it can phosphorylate different targets, such as the GluR1 subunit of AMPAR (H. K. Lee, Barbarosie, Kameyama, Bear, & Huganir, 2000) and it favors the induction of LTP by counteracting the activity of protein phosphatases (Abel et al., 1997; Winder, Mansuy, Osman, Moallem, & Kandel, 1998).
Finally, it also tags the synapse enabling the consolidation of the long-term process (Barco, Alarcon, & Kandel, 2002). In addition to protein kinases, synaptic protein phosphatases also play a key role in regulating the initiation of long-term synaptic changes. Various protein phosphatases, such as PP1 and calcineurin, oppose the local activity of PKA and act as inhibitory constraints on memory formation. Recent experiments in cultured Aplysia neurons indicate that calcineurin may act as a memory suppressor for sensitization (Sharma, Bagnall, Sutton, & Carew, 2003). In the mammalian brain, an increase in calcineurin activity also causes defects in long-term memory and L-LTP (Mansuy, Mayford, Jacob, Kandel, & Bach, 1998; Winder et al., 1998) whereas a reduction has the opposite effect (Malleret et al., 2001). Similarly, a reduction in PP1 activity improves memory in mice (Genoux et al., 2002). Therefore, in both systems a balance between phosphatase
and kinase activities at a given synapse gates the synaptic signals that eventually reach the nucleus, and can regulate both memory storage and retrieval (Abel et al., 1998). These studies suggest that the long-term regulation of transmitter release requires PKA-related gene activation. PKA activates gene expression by the phosphorylation of transcription factors that bind to the cAMP-responsive element (CRE). One of the major transcription factors that recognize the CRE is a protein called CRE-binding protein (CREB1), which functions as a transcriptional activator only after it is phosphorylated by PKA or another second messenger kinase. Microinjection of CRE-containing oligonucleotides into sensory neurons inhibits the function of CREB1 and blocks long-term facilitation but has no effect on the short-term process (Dash, Hochner, & Kandel, 1990). Not only is CREB1 activation necessary for long-term facilitation, it is also sufficient to induce long-term facilitation, albeit in reduced form and in a form that is not maintained beyond 24 hours. Thus, sensory cell injection of recombinant CREB1a phosphorylated in vivo by PKA led to an increase in EPSP amplitude at 24 hours in the absence of any 5-HT stimulation (Bartsch et al., 2000). Bartsch and associates (1995) have found that the genetic switch that converts short- to long-term facilitation is not only composed of the CREB1 regulatory unit but also another member of the CREB gene family, ApCREB2, a CRE-binding transcription factor constitutively expressed in sensory neurons. ApCREB2 resembles human CREB2 and mouse ATF4 (Hai, Liu, Coukos, & Green, 1989; Karpinski, Morle, Huggenvik, Uhler, & Leiden, 1992), and functions as a repressor of long-term facilitation. Thus, injection of anti-ApCREB2 antibodies into Aplysia sensory neurons causes a single pulse of 5-HT, which normally induces only short-term facilitation lasting minutes, to evoke facilitation that lasts more than 1 day.
This response requires both transcription and translation and is accompanied by the growth of new synaptic connections. That both positive and negative regulators govern long-term synaptic changes suggests the transition from short-term facilitation to long-term facilitation requires the simultaneous removal of transcriptional repressors and activation of transcriptional activators. These transcriptional repressors and activators can interact with each other both physically and functionally and it is likely that the transition is a complex process involving temporally distinct phases of gene activation, repression, and regulation of signal transduction. The complete set of genes regulated by a transcription factor in a specific cell type is still not known. In Aplysia sensory neurons, the activity of ApCREB1 leads to the expression of several immediate-response genes, such as ubiquitin hydrolase that stabilize short-term facilitation (Hegde et al.,
1997), and the transcription factor CCAAT-box-enhancer-binding protein (C/EBP), whose induction has been shown to be critical for LTF (Alberini, Ghirardi, Metz, & Kandel, 1994). This induced transcription factor (in concert with other constitutively expressed molecules such as ApAF; Bartsch et al., 2000) activates a second wave of downstream genes that can ultimately lead to the growth of new synaptic connections. These genes represent only a few of the family of gene products generated by CREB activity. The participation of the cAMP/CREB pathway appears to be a general feature of long-term memory formation throughout the animal kingdom. The first genetic screenings designed to identify learning mutants in Drosophila revealed two interesting mutants, dunce and rutabaga, with specific defects in memory formation (Dudai, Jan, Byers, Quinn, & Benzer, 1976; Duerr & Quinn, 1982) that were subsequently shown to affect genes in the cAMP signaling pathway (Byers, Davis, & Kiger, 1981; Waddell & Quinn, 2001). Experiments in transgenic flies have confirmed that the balance between CREB activator and repressor isoforms is critical for long-term behavioral memory. Thus, overexpression of an inhibitory form of CREB (dCREB-2b) blocked long-term olfactory memory but did not alter short-term memory (Perazzona, Isabel, Preat, & Davis, 2004; Yin et al., 1994). Indeed, most of the upstream signaling cascade leading to CREB activation appears to be conserved through evolution, and many aspects of the role of CREB in synaptic plasticity described in invertebrates have also been observed in the mammalian brain. However, the role of CREB in explicit forms of memory appears to be more complex than in implicit forms of memory in invertebrates (see reviews by Barco, Pittenger, & Kandel, 2003; Lonze & Ginty, 2002).
In mammals, CREB has been shown to regulate the expression of more than one hundred genes, but it is still not clear how many of these putative downstream genes are actually regulated during learning and required for memory storage (Lonze & Ginty, 2002; Mayr & Montminy, 2001). The current list of target genes is heterogeneous and includes genes with very diverse functions, from regulation of transcription and metabolism to genes affecting cell structure or signaling. Many CREB targets, such as c-fos, EGR-1, or C/EBPb, are themselves transcription factors, whose induction may trigger a second wave of gene expression. Although we have focused on CREB-dependent gene expression because of its conserved role in memory formation through evolution, other transcription factors, such as ApAF and C/EBP in Aplysia and SRF, C/EBPb, c-fos, or EGR-1 in mice (Albensi & Mattson, 2000; Izquierdo & Cammarota, 2004; Ramanan et al., 2005; Tischmeyer & Grimm, 1999) are also likely to contribute to the transcriptional regulation that accompanies long-lasting forms of synaptic plasticity.
The CREB-mediated response to extracellular stimuli can be modulated by a number of kinases (PKA, CaMKII, CaMKIV, RSK2, MAPK, and PKC) and phosphatases (PP1 and calcineurin). The CREB regulatory unit may therefore serve to integrate signals from various signal transduction pathways. This ability to integrate signaling as well as mediate activation or repression may explain why CREB is so central to memory storage in different contexts (Martin & Kandel, 1996).
CHROMATIN REMODELING AND EPIGENETIC CHANGES DURING LONG-TERM MEMORY STORAGE
Guan et al. (2002) used chromatin immunoprecipitation techniques to examine directly the role of CREB-mediated responses in the integration of synaptic signaling by studying the long-term interactions of two opposing modulatory transmitters important for behavioral sensitization in Aplysia. Toward that end, they utilized a single bifurcated sensory neuron that contacts two spatially separated postsynaptic neurons (Martin, Casadio, et al., 1997). They found that when a neuron receives 5-HT, and at the same time receives input from the inhibitory transmitter FMRFamide at another set of synapses, the synapse-specific long-term depression produced by FMRFamide dominates. These opposing inputs are integrated in the neuron’s nucleus and are evident in the repression of C/EBP, a transcription regulator downstream from CREB that is critical for long-term facilitation. Whereas 5-HT induces C/EBP by activating CREB1 and recruiting the CREB-binding protein (CBP), a histone acetylase, to acetylate histones, FMRFamide displaces CREB1 with CREB2, which recruits a histone deacetylase to deacetylate histones. When 5-HT and FMRFamide are given together, FMRFamide overrides 5-HT by recruiting CREB2 and the deacetylase to displace CREB1 and CBP, thereby inducing histone deacetylation and repression of C/EBP. Thus, both the facilitatory and inhibitory modulatory transmitters that are important for long-term memory in Aplysia activate signal transduction pathways that alter nucleosome structure bidirectionally through acetylation and deacetylation of histone residues in chromatin (Figure 27.5).
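The integration logic described in this section can be summarized as a small truth table. The following Python sketch is illustrative only and is not a model taken from Guan et al. (2002): the function name `cebp_promoter_state`, the `PromoterState` fields, and the text labels are hypothetical, and the code encodes only the qualitative outcomes stated in the text, namely that FMRFamide overrides 5-HT whenever both are present.

```python
from typing import NamedTuple


class PromoterState(NamedTuple):
    """Qualitative state of the C/EBP promoter (illustrative labels only)."""
    bound_factor: str    # CREB family member occupying the promoter
    cofactor: str        # recruited histone-modifying enzyme, if any
    histone_change: str  # net change in histone acetylation
    cebp_induced: bool   # whether C/EBP expression is induced


def cebp_promoter_state(serotonin: bool, fmrfamide: bool) -> PromoterState:
    """Hypothetical helper: look up the outcome for a transmitter combination."""
    if fmrfamide:
        # FMRFamide dominates: CREB2 displaces CREB1 and recruits a
        # deacetylase, whether or not 5-HT is also present.
        return PromoterState("CREB2", "histone deacetylase", "deacetylation", False)
    if serotonin:
        # 5-HT alone: activated CREB1 recruits CBP, a histone acetylase.
        return PromoterState("CREB1", "CBP", "acetylation", True)
    # Neither transmitter: basal promoter state, no induction.
    return PromoterState("CREB1", "none", "basal", False)


if __name__ == "__main__":
    for s, f in [(False, False), (True, False), (False, True), (True, True)]:
        print(f"5-HT={s}, FMRFamide={f}: {cebp_promoter_state(s, f)}")
```

Writing the FMRFamide branch first makes the dominance of the inhibitory input explicit: the 5-HT branch is reached only when FMRFamide is absent.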
The epigenetic marking of chromatin, by histone modifications, chromatin methylation, and the activity of retrotransposons, may have long-term consequences on transcriptional regulation of specific gene loci involved in long-term synaptic changes, and thus adds a new layer of complexity to our view of how nuclear function and synaptic activity affect one another (Guan et al., 2002; Hsieh & Gage, 2005; Levenson & Sweatt, 2005). As detailed,
Figure 27.5 5-HT and FMRFamide alter nucleosome structure bidirectionally through acetylation and deacetylation of chromatin.
[Figure 27.5 image: schematic of the C/EBP promoter (CRE, enhancer, TATA box, coding region) in four panels: (A) Control, (B) 5-HT alone, (C) FMRFa alone, (D) FMRFa + 5-HT. Labeled components: CREB-1, CREB-2, CBP, HDAC-5, PKA, p38, TBP, Pol II, acetyl (Ac) and phosphate (P) groups.]
Note. A: At the basal level, CREB1a resides on the C/EBP promoter and some lysine residues of histones are acetylated. B: 5-HT, through PKA, phosphorylates CREB1 that binds to the C/EBP promoter. Phosphorylated CREB1 then forms a complex with CBP at the promoter. CBP then acetylates lysine residues of the histones (e.g., K8 of H4). Acetylation modulates chromatin structure, enabling the transcription machinery to bind and induce gene expression. C: FMRFamide activates CREB2, which displaces CREB1 from the C/EBP promoter. HDAC5 is then recruited to deacetylate histones. As a result, the gene is repressed. D: If the neuron is exposed to both FMRFamide and 5-HT, CREB1a is replaced by CREB2 at the promoter even though it might still be phosphorylated through the 5-HT-PKA pathway, and HDAC5 is then recruited to deacetylate histones, blocking gene induction. From “Integration of Long-Term-Memory-Related Synaptic Plasticity Involves Bidirectional Regulation of Gene Expression and Chromatin Structure,” by Z. Guan et al., 2002, Cell, 111, p. 490. Reprinted with permission.
the contribution of histone tail acetylation, a modification that favors transcription and is associated with active loci, was first revealed for long-term facilitation by Guan et al. (2002) in Aplysia. In addition to finding that facilitatory and inhibitory stimuli bidirectionally alter the acetylation state and structure of promoters driving the expression of genes involved in the maintenance of long-term facilitation, such as C/EBP, this study also demonstrated that
enhancing histone acetylation with deacetylase (HDAC) inhibitors facilitates the induction of long-term facilitation. HDAC inhibitors have now been shown to enhance L-LTP in the Schaffer collateral pathway of mammals and memory formation in hippocampus-dependent tasks (Alarcon et al., 2004; Korzus, Rosenfeld, & Mayford, 2004; Levenson & Sweatt, 2005; Yeh, Lin, & Gean, 2004). Conversely, mice with reduced histone acetyltransferase
activity have deficits in both long-lasting forms of memory and LTP (Alarcon et al., 2004; Bourtchouladze et al., 2003; Korzus et al., 2004; Wood et al., 2005). These results indicate that critical chromatin remodeling occurs during the formation of long-term memory, and that these nuclear changes are required for the stable maintenance of memory storage.
Synapse to Nucleus Signals
The transcriptional switch for the conversion of short- to long-term memory requires not only the activation of CREB1 but also the removal of the repressive action of CREB2, which lacks consensus sites for PKA phosphorylation (Bartsch et al., 1995). ApCREB2 does, however, have both protein kinase C and mitogen-activated protein kinase (MAPK) phosphorylation sites and MAPK is activated by 5-HT in Aplysia neurons. Martin, Michael, et al. (1997) and Michael and Martin (1998) examined the subcellular localization of an Aplysia ERK2 homologue in sensory-to-motor neuron co-cultures during short- and long-term facilitation. Whereas MAPK immunoreactivity was predominantly localized to the cytoplasm in both sensory and motor neurons during short-term facilitation, MAPK translocated into the nucleus of the presynaptic sensory neuron but not of the postsynaptic motor cell during 5-HT-induced long-term facilitation. Presynaptic but not postsynaptic nuclear translocation of MAPK was also triggered by elevations in intracellular cAMP, indicating that the cAMP pathway activates the MAPK pathway in a neuron-specific manner. Injection of either anti-MAPK antibodies or MAPK inhibitors (PD98059) into the presynaptic sensory cell selectively blocked long-term facilitation without affecting short-term facilitation. Thus, like PKA, MAPK translocates to the nucleus with prolonged 5-HT treatment so as to activate the activators (CREB1) and relieve the repressors (CREB2; Martin, Michael, et al., 1997).
The involvement of MAPK in long-term plasticity may be quite general: Martin, Michael, et al. (1997) found that cAMP also activated MAPK in mouse hippocampal neurons, suggesting that MAPK may play a role in hippocampal long-term potentiation. The requirement for MAPK during hippocampal LTP has been shown by English and Sweatt (1996, 1997), who demonstrated that ERK1 is activated in CA1 pyramidal cells during LTP and that bath application of MAPK kinase inhibitors blocks LTP.
CONSOLIDATION OF LONG-TERM MEMORY
The activation of adenylyl cyclase by 5-HT, the increase in cAMP concentration with the resultant dissociation of
the catalytic subunit of PKA and its translocation to the nucleus, as well as the phosphorylation of CREB1 are all unaffected by inhibitors of RNA or protein synthesis. Where then does the RNA and protein synthesis–dependent step that characterizes the consolidation phase of long-term memory appear? It requires an additional step: the synthesis of proteins encoded by the genes whose expression is induced by CREB1 and repressed by CREB2. To examine which genes are downstream from CREB1, Alberini et al. (1994) characterized the intermediary, immediate-early genes induced by cAMP and CREB. In a search for possible cAMP-dependent regulatory genes that might be interposed between constitutively expressed transcription factors and stable effector genes, Alberini and colleagues (1994) focused on the CCAAT-box-enhancer-binding protein (C/EBP) transcription factors. They cloned an Aplysia C/EBP homologue (ApC/EBP) and found that its expression was induced by exposure to 5-HT. Inhibition of ApC/EBP activity blocked long-term facilitation but had no effect on short-term facilitation. Thus, the induction of ApC/EBP seems to serve as an intermediate component of a molecular switch activated during the consolidation period. The existence of C/EBP, a cAMP-regulated immediate-early gene that is itself a transcription factor and regulates other genes, leads to a model of sequential gene activation. CREB1a, CREB1b, CREB1c, and CREB2 represent the first level of control because all are constitutively expressed. Stimuli that lead to long-term facilitation disturb the balance between CREB1-mediated activation and CREB2-mediated repression, through the action of PKA, MAPK, and possibly other kinases. This leads to the upregulation of a family of immediate-early genes.
Some of these immediate-early genes are transcription factors such as C/EBP; others are effectors, such as ubiquitin hydrolase, that contribute to consolidation by either extending the inducing signal or initiating the changes at the synapse that cause long-term facilitation.
Activity-Dependent Modulation of Cell Adhesion Molecules and the Initiation of Learning-Related Synaptic Growth
How does this sequential gene activation lead to the growth of new sensory neuron synapses? Since the functional and structural changes that accompany long-term sensitization in Aplysia require new protein synthesis, Barzilai, Kennedy, Sweatt, and Kandel (1989) utilized quantitative two-dimensional gels and [35S]methionine incorporation to examine changes in specific proteins in the sensory neurons in response to 5-HT. They found that 5-HT initiates a large increase in overall protein synthesis during training. Moreover, beyond these overall effects, 5-HT also
produces three temporally discrete sets of changes in specific proteins that could be resolved on two-dimensional gels. First, 5-HT induces a rapid and transient increase at 30 minutes in the rate of synthesis of 10 proteins and a transient decrease in five proteins that subside within 1 hour and are in all cases dependent on transcription. These early changes are followed by at least two further rounds of changes in the expression of specific proteins, some of which are transient, and some of which persist for at least 24 hours. The 15 early proteins induced by repeated exposure to 5-HT can also be induced by cAMP. Of the 15 early proteins Barzilai et al. (1989) observed to be specifically altered in expression during the acquisition of long-term facilitation, six have now been identified. Two proteins that increase (clathrin and tubulin) and four proteins that decrease their level of expression (NCAM-related cell adhesion molecules) all seem to relate to the 5-HT-induced structural changes. Mayford, Barzilai, Keller, Schacher, and Kandel (1992) first focused on the four proteins, D1 to D4, that decrease their expression in a transcriptionally dependent manner following the application of 5-HT or cAMP and found that they encoded different isoforms of an immunoglobulin-related cell adhesion molecule, which is homologous to NCAM in vertebrates and Fasciclin II in Drosophila. Imaging of fluorescently labeled MAbs to apCAM indicates that not only is there a decrease in the level of expression but that even preexisting protein is lost from the surface membrane of the sensory neurons within 1 hour after the addition of 5-HT (Mayford et al., 1992). This transient modulation by 5-HT of cell adhesion molecules, therefore, may represent one of the early molecular steps required for initiating learning-related growth of synaptic connections. 
Blocking the expression of the antigen by MAb causes defasciculation, a step that appears to precede synapse formation during development in Aplysia (Keller & Schacher, 1990). To examine the mechanisms that underlie the 5-HT-induced down-regulation of apCAM and, in particular, how these relate to the initiation of synaptic growth, Bailey, Chen, Keller, and Kandel (1992) combined thin-section electron microscopy with immunolabeling using a gold-conjugated MAb specific to apCAM. They found that a 1-hour application of 5-HT led to a 50% decrease in the density of gold-labeled apCAM complexes at the surface membrane of the sensory neuron. This down-regulation was particularly prominent at adherent processes of the sensory neurons and was achieved by a heterologous, protein synthesis–dependent activation of the endosomal pathway, leading to internalization and apparent degradation of apCAM. As is the case for the down-regulation at the level of expression, the 5-HT-induced internalization of apCAM can be simulated by cAMP. Concomitant with the
down-regulation of apCAM, Hu, Barzilai, Chen, Bailey, and Kandel (1993) further demonstrated that, as part of this coordinated program for endocytosis, 5-HT and cAMP also induce an increase in the number of coated pits and coated vesicles in the sensory neurons and an increase in the expression of the light chain of clathrin (apClathrin). Because the apClathrin light chain contains the important functional domains of both LCa and LCb of mammalian clathrin thought to be essential for the coated pit assembly and disassembly, the increase in clathrin may be an important component in the activation of the endocytic cycle required for the internalization of apCAM. The learning-induced internalization of apCAM is thought to have at least two major structural consequences: (1) disassembly of homophilically associated fascicles of the sensory neurons (defasciculation), a process that may destabilize adhesive contacts normally inhibiting growth; and (2) endocytic activation that may lead to a redistribution of membrane components to sites where new synapses form. Thus, aspects of the initial steps in the learning-related growth of synaptic connections that is a hallmark of the long-term process may eventually be understood in the context of a novel and targeted form of receptor-mediated endocytosis. To further define the mechanisms whereby 5-HT leads to apCAM down-regulation, Bailey and colleagues (1997) used epitope tags to examine the fate of the two apCAM isoforms (transmembrane and GPI-linked) and found that only the transmembrane form (TM-apCAM) is internalized (Figure 27.6). This internalization was blocked by overexpression of TM-apCAM with a point mutation in the two MAPK phosphorylation consensus sites, as well as by injection of a specific MAPK antagonist into sensory neurons.
These data suggest that activation of the MAPK pathway is important for the internalization of TM-apCAM and may represent one of the initial and perhaps permissive stages of learning-related synaptic growth in Aplysia. Furthermore, the combined actions of MAPK both in the cytoplasm and in the nucleus suggest that MAPK plays multiple roles in long-lasting synaptic plasticity and appears to regulate each of the two distinctive processes that characterize the long-term process: activation of transcription and growth of new synaptic connections. Han, Lim, Kandel, and Kaang (2004) examined more closely the relationship between the 5-HT-induced down-regulation of TM-apCAM and synaptic growth by overexpressing various HA-epitope tagged recombinant apCAMs in Aplysia sensory neurons. They found that overexpression of TM-apCAM, but not the GPI-linked isoform of apCAM, blocked both long-term facilitation and the associated increase in the number of sensory neuron varicosities. By interrupting the adhesive function of apCAM
with an anti-HA antibody, this inhibition of long-term facilitation induced by the overexpression of TM-apCAM was restored. Moreover, long-term facilitation could be completely blocked by overexpression of the cytoplasmic tail portion of apCAM alone. These studies indicated that the extracellular domain of TM-apCAM has an inhibitory function that is neutralized by internalization to induce long-term facilitation and suggested that the cytoplasmic domain provides an interactive platform for both signal transduction and the internalization machinery.
Figure 27.6 Regional specific down-regulation of the transmembrane isoform of apCAM. [Diagram not reproduced. Panel labels: SN (sensory neuron), MN (motor neuron), 5-HT, transmembrane isoform of apCAM, GPI-linked isoform of apCAM.] Note. This model is based on the assumption that the relative concentration of the GPI-linked versus transmembrane isoforms of apCAM is highest at points of synaptic contact between the sensory neuron and motor neuron and reflects the results of studies done in dissociated cell culture. Thus, previously established connections might remain intact following exposure to 5-HT since they would be held in place by the adhesive, homophilic interactions of the GPI-linked isoforms and the process of outgrowth from sensory neuron axons would be initiated by down-regulation of the transmembrane form at extrasynaptic sites of membrane apposition. In the intact ganglion, the axons of sensory neurons are likely to fasciculate not only with other sensory neurons but also with the processes of other neurons and perhaps even glia. One of the attractive features of this model is that the mechanism for down-regulation is intrinsic to the sensory neurons. Thus, even if some of the sensory neuron axonal contacts in the intact ganglion were heterophilic in nature, that is, with other neurons or glia, we would still expect the selective internalization of apCAM at the sensory neuron surface membrane at these sites of heterophilic apposition to destabilize adhesive contacts and to facilitate disassembly. From “Mutation in the Phosphorylation Sites of MAP Kinase Blocks Learning-Related Internalization of apCAM in Aplysia Sensory Neurons,” by C. H. Bailey et al., 1997, Neuron, 18, p. 921. Reprinted with permission.
Nuclear Translocation of apCAM-Associated Protein (CAMAP) and Induction of Long-Term Facilitation
S. H. Lee et al. (2007) examined the 5-HT-induced signaling interactions mediated by the cytoplasmic domain of TM-apCAM and found an additional and novel role for this cell adhesion molecule in synapse-specific forms of long-lasting plasticity. As outlined, long-term facilitation at the sensory to motor neuron synapse requires the activation of CREB1 in the nucleus of the sensory neuron (Bartsch, Casadio, Karl, Serodio, & Kandel, 1998). Activated CREB1 induces the transcription factor ApC/EBP that in turn acts on downstream genes encoding proteins important for synaptic growth and the stable maintenance of long-term facilitation (Alberini et al., 1994). An initial step, thought to be permissive, for the initiation of learning-related growth is the clathrin-mediated internalization and consequent down-regulation of TM-apCAM. To examine directly how the internalization of TM-apCAM is related to the initiation of nuclear transcription, S. H. Lee et al. (2007) first looked for molecules that could bind to the cytoplasmic tail of TM-apCAM and cloned an apCAM-associated protein (CAMAP) by yeast two-hybrid screening. They found that 5-HT signaling at the synapse activates PKA, which in turn phosphorylates CAMAP to induce the dissociation of CAMAP from apCAM, and that this dissociation is a prerequisite for the internalization of apCAM. The dissociated CAMAP is subsequently translocated to the nucleus of the sensory neurons. In the nucleus, CAMAP acts as a transcriptional co-activator for CREB1 that is essential for the activation of ApC/EBP required for the initiation of long-term facilitation. Combined, these data suggest that CAMAP is one of the retrograde signals from the synapse to the nucleus, where it acts as a co-regulator of the presynaptic gene expression associated with the induction of long-term facilitation in Aplysia.
In addition, these findings demonstrate the importance, for learning-related synaptic plasticity, of signal propagation into the nucleus from the surface membrane of activated synaptic sites mediated by a molecule directly interacting with a cell surface adhesion molecule and suggest a novel presynaptic molecular mechanism to turn on the gene transcription required for long-term memory.
STABILIZATION OF LEARNING-RELATED SYNAPTIC GROWTH
In addition to transcription in the nucleus and protein synthesis in the cell body, long-term memory also requires a second site of local protein synthesis at the synapse. A number of distinct mRNAs have been localized in the axons of Aplysia and in the dendrites of rodent hippocampal neurons (for review, see Steward & Schuman, 2001, 2003). The molecular mechanisms that target these mRNAs to the synapse are largely unknown, but some are carried by the kinesin motors, the key anterograde transport machinery (Puthanveettil et al., 2008). The targeting of some of these mRNAs is thought to involve the recognition of cis-acting elements in their 3′ untranslated region by specific RNA-binding proteins that interact with the cytoskeleton. Once transported to the synaptic compartments, these mRNAs are translated only after docking at active synaptic sites, a process frequently referred to as synaptic or local protein synthesis. Regulation of local protein synthesis plays a major role in the control of synaptic strength at the sensory to motor neuron connection in Aplysia and during L-LTP in the hippocampus. Martin, Casadio, et al. (1997) first investigated the role of local protein synthesis in an Aplysia culture system in which a single bifurcated sensory neuron was plated in contact with two spatially separated gill motor neurons. In this system, repeated application of 5-HT to one synapse produces a CREB-mediated, synapse-specific long-term facilitation that can be blocked by the local application of inhibitors of translation, suggesting that local protein synthesis at the synapse is required as part of the retrograde signaling cascade for the initiation of synapse-specific long-term facilitation. Subsequent studies by Casadio et al. (1999) found, in addition, that long-term synapse-specific facilitation induced by 5-HT in Aplysia requires local protein synthesis for the stable maintenance of learning-induced synaptic growth.
Similarly, in the hippocampus, the induction of LTP in the Schaffer collateral pathway is accompanied by the transport of polysomes from dendritic shafts to active spines of CA1 neurons, suggesting a critical role for local protein synthesis in the morphological changes associated with LTP (Ostroff, Fiala, Allwardt, & Harris, 2002), and local inhibition of protein synthesis blocks L-LTP in the Schaffer collateral pathway (Bradshaw, Emptage, & Bliss, 2003; Cracco, Serrano, Moskowitz, Bergold, & Sacktor, 2005). Following the sending of a retrograde signal to the nucleus and the subsequent transcriptional activation, newly synthesized gene products, both mRNAs and proteins, have to be delivered by kinesin-mediated fast axonal transport (Puthanveettil & Kandel, 2006) specifically to the synapses whose activation originally triggered
the wave of gene expression. To explain how this specificity can be achieved in a biologically economical way given the massive number of synapses in a single neuron, Martin, Casadio, et al. (1997) and Frey and Morris (1997) proposed the synaptic capture hypothesis. This hypothesis, also referred to as synaptic tagging, proposes that the products of gene expression are delivered throughout the cell, but are only functionally incorporated in those specific synapses that have been tagged by previous synaptic activity. The synaptic tag model has now been supported by a number of studies both in Aplysia (Casadio et al., 1999; Martin, Casadio, et al., 1997) and in the rodent hippocampus (Barco et al., 2002; Dudek & Fields, 2002; Frey & Morris, 1997, 1998). Studies of synaptic capture at the synapses between the sensory and motor neurons of the gill-withdrawal reflex in Aplysia have further demonstrated that the production of CRE-driven gene products in the nucleus is not sufficient to achieve synapse-specific long-term facilitation. One also needs a PKA-mediated covalent signal to mark the stimulated synapses, and consequent local protein synthesis to stabilize that mark (Casadio et al., 1999; Martin, Casadio, et al., 1997). Thus, injection into the cell body of phosphorylated CREB-1 gives rise to long-term facilitation at all the synapses of the sensory neuron, but this facilitation is not maintained beyond 24 to 48 hours unless one of the synapses is also marked by triggering the short-term process with a single pulse of 5-HT (Casadio et al., 1999). Once marked, that synapse and only that synapse shows maintained facilitation and growth. Experiments in the rat hippocampus by Frey and Morris have demonstrated, in turn, that once transcription-dependent LTP has been induced at one pathway, the long-term process can be “captured” at a second pathway receiving a single train that would normally produce only E-LTP.
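The logic of synaptic capture described above can be made concrete with a small qualitative sketch. The code below is only an illustration under stated assumptions: the class `Synapse`, the function `facilitation_outcome`, and the outcome labels are hypothetical names introduced here, not part of any published model, and the rules simply encode the Aplysia findings summarized in the text (cell-wide gene products, a local tag, and local protein synthesis to stabilize the tag).

```python
from dataclasses import dataclass


@dataclass
class Synapse:
    """Hypothetical toy representation of one sensory-to-motor synapse."""
    name: str
    tagged: bool = False            # marked by local activity (e.g., one pulse of 5-HT)
    local_translation: bool = True  # local protein synthesis is available


def facilitation_outcome(synapse: Synapse, gene_products_available: bool) -> str:
    """Return a qualitative outcome for one synapse (illustrative rules only).

    gene_products_available stands for CRE-driven gene products that are
    made in the nucleus and delivered throughout the cell.
    """
    if not gene_products_available:
        # Without the cell-wide products, a tag alone yields only the short-term process.
        return "short-term only" if synapse.tagged else "none"
    if synapse.tagged and synapse.local_translation:
        # Products are captured and the mark is stabilized locally.
        return "maintained"
    # Products reach the synapse, but facilitation is not stabilized
    # (in the text: not maintained beyond 24 to 48 hours).
    return "transient"


if __name__ == "__main__":
    branch_a = Synapse("branch A", tagged=True)
    branch_b = Synapse("branch B", tagged=False)
    for syn in (branch_a, branch_b):
        print(syn.name, "->", facilitation_outcome(syn, gene_products_available=True))
```

In this sketch, specificity comes from the tag rather than from targeted delivery, which is the economical feature the hypothesis is meant to capture.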
The stimulus for the short-term process causes a transient potentiation and, in addition, marks the synaptic terminals, enabling the capture of the newly expressed gene products. The properties of synaptic capture observed for intracompartmental capture in hippocampal CA1 neurons are similar to those described in the bifurcated sensory neurons of Aplysia (Martin, Casadio, et al., 1997). However, in mammals, where there are two dendritic compartments (apical and basal), the tag appears to be restricted to specific dendritic compartments, and additional mechanisms are required to capture across compartments (Alarcon, Barco, & Kandel, 2006). The finding of two distinct components for the marking signal in Aplysia first suggested that there is a mechanistic distinction between the initiation of long-term synaptic plasticity and synaptic growth (which requires only nuclear transcription and central translation but does not require
local protein synthesis) and the stable maintenance of the long-term functional and structural changes that requires, in addition, local protein synthesis at the synapse. How might this local protein synthesis at the synapse, which is necessary for stabilizing synaptic growth and long-term plasticity, be regulated? The control of local translation at the synapse is likely to be complex and involve several different mechanisms, including different types of mRNA transport and docking, cytoplasmic polyadenylation, mTOR, which is the target of the selective protein synthesis inhibitor rapamycin (Cammalleri et al., 2003; Purcell, Sharma, Bagnall, Sutton, & Carew, 2003), and the phosphorylation of different translation factors (see review by Sutton & Schuman, 2005). Since mRNAs are made in the cell body, the need for the local translation of some mRNAs suggests that these mRNAs may be dormant before they reach the activated synapse. If that were true, one way of activating protein synthesis at the synapse would be to recruit a regulator of translation that is capable of activating translationally dormant mRNAs. Si, Lindquist, and Kandel (2003) began to search for such a molecule by focusing on the Aplysia homolog of CPEB (cytoplasmic polyadenylation element-binding protein), a protein capable of activating dormant mRNAs through the elongation of their polyA tail. CPEB was first identified in oocytes and subsequently in hippocampal neurons. In Aplysia, a novel, neuron-specific isoform of CPEB is present in the processes of sensory neurons, and stimulation with 5-HT increases the amount of CPEB protein at the synapse. The induction of CPEB is independent of transcription but requires new protein synthesis and is sensitive to rapamycin and to inhibitors of PI3 kinase.
Moreover, the induction of CPEB coincides with the polyadenylation of neuronal actin, and blocking CPEB locally at the activated synapse blocks the long-term maintenance of synaptic facilitation but not its early expression at 24 hours. Thus, CPEB has all the properties required of the local protein synthesis–dependent component of marking and supports the idea that there are separate mechanisms for initiation of the long-term process and its stabilization. Moreover, these data suggest that the maintenance but not the initiation of long-term synaptic plasticity requires a new set of molecules in the synapse and some of these new molecules are made by CPEB-dependent translational activation. Interestingly, a structurally similar neuronal isoform of CPEB, CPEB-3, has been found in mouse hippocampal neurons, where it is induced by the neurotransmitter dopamine (Theis, Si, & Kandel, 2003). How might CPEB stabilize the late phase of long-term facilitation? As outlined above, the stability of long-term facilitation seems to result from the persistence of structural changes at sensory neuron synapses, the decay of which
parallels the decay of the behavioral memory. These 5-HT-induced structural changes at the synapses between sensory and motor neurons include the remodeling of preexisting facilitated synapses, as well as the growth and establishment of new synaptic connections. The reorganization and growth of new synapses have two broad requirements: (1) structural (changes in shape, size, and number) and (2) regulatory (where and when to grow). The genes involved in both of these aspects of synaptic growth might be potential targets of apCPEB. The structural aspects of the synapses are dynamically controlled by reorganization of the cytoskeleton, which can be achieved either by redistribution of preexisting cytoskeletal components or by their local synthesis. Construction of cDNA libraries from the isolated axonal neurites of Aplysia sensory neurons has facilitated identification of mRNAs that encode structural proteins such as α1-tubulin and N-actin as well as translational elements including CPEB, the elongation factor eEF1α, and several ribosomal proteins (Moccia et al., 2003; Moroz et al., 2006). Most of these transcripts are localized in the distal axonal processes of the sensory neurons and are inactive before synaptic stimulation. ApCPEB is capable of activating dormant mRNAs by elongating their polyA tails. Stimulation with 5-HT increases the amount of ApCPEB protein at the synapse, and this in turn could lead to the local activation of mRNAs encoding both structural proteins (tubulin and N-actin) and regulatory molecules such as EphA2, CaMKII, and members of the ephrin family (Brittis, Lu, & Flanagan, 2002). Thus, CPEB might contribute to the stabilization of learning-related synaptic growth by controlling the local synthesis of both the cytoskeletal components of the synapse and regulatory molecules important for synaptic maturation. Biological molecules have a relatively short half-life (hours to days) compared to the duration of memory (days, weeks, even years).
How then can the learning-induced alterations in the molecular composition of a synapse be maintained for such a long time? Most answers to this elusive question rely on some type of self-sustained mechanism that can somehow modulate synaptic strength and synaptic structure. For example, Malinow and colleagues have proposed that two regulatory pathways control the insertion and removal of AMPA receptors at the synapse: the maintenance pathway is always on and controls the constant turnover of receptor subunits, whereas the constructive pathway is only turned on during LTP induction (Malinow et al., 2000; Malinow & Malenka, 2002). The activation of the constructive pathway and insertion of new AMPARs would cause the growth and/or maturation of postsynaptic densities enabling the formation of new memories, whereas the maintenance pathway would be responsible for their stabilization (Hayashi et al., 2000;
Lisman & Zhabotinsky, 2001). Another interesting model for long-term memory storage was suggested by Crick (1984), who proposed that autocatalytic kinases might provide the molecular mechanism for long-lasting, self-maintained changes in synaptic function. Lisman further developed this idea based on the autocatalytic properties of the calcium/calmodulin-dependent protein kinase II (CaMKII; Lisman & Zhabotinsky, 2001). Kandel and Si proposed a model based on the prion-like properties of the Aplysia neuronal isoform of CPEB to explain how a population of unstable molecules can produce a stable change in synaptic form and function (Si, Lindquist, et al., 2003). CPEB has two conformational states: one is inactive or acts as a repressor, while the other is active. In a naive synapse, the basal level of CPEB expression is low and its state is inactive or repressive. However, if a given threshold is reached, CPEB switches to the prion-like state, which activates the translation of dormant mRNAs through the elongation of their poly-A tail (Si, Giustetto, et al., 2003). Once the prion state is established at an activated synapse, dormant mRNAs, made in the cell body and distributed cell-wide, would be translated only at the activated synapses. Because the activated CPEB can be self-perpetuating, it could contribute to a self-sustaining, synapse-specific long-term molecular change and provide a mechanism for the stabilization of learning-related synaptic growth and the persistence of memory storage. These molecular mechanisms are not mutually exclusive: the synaptic translation of CaMKII mRNAs can be regulated by CPEB, and the synthesis and trafficking of new AMPAR subunits may require CaMKII activity as well as enhanced protein synthesis (Burgin et al., 1990; Y. S. Huang, Jung, Sarkissian, & Richter, 2002; Ouyang, Rosenstein, Kreiman, Schuman, & Kennedy, 1999).
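How a self-perpetuating state can outlast the turnover of individual molecules may be easier to see in a toy simulation. The sketch below is not the model proposed by Kandel and Si or by Lisman; it is a generic positive-feedback switch, and every name and parameter value in it (`simulate`, `basal`, `k_max`, `K`, `decay`, the pulse sizes) is an arbitrary assumption chosen only to make the system bistable. The variable `x` stands loosely for the amount of active, self-perpetuating protein at one synapse; each molecule decays with a short half-life, yet a sufficiently strong transient stimulus leaves `x` in a persistently high state.

```python
def simulate(pulse_rate, pulse_duration=5.0, t_end=200.0, dt=0.01):
    """Euler integration of a toy positive-feedback switch (illustrative only).

    dx/dt = basal + stimulus(t) + k_max * x^2 / (K^2 + x^2) - decay * x

    All parameter values are arbitrary assumptions, in arbitrary units.
    """
    basal, k_max, K, decay = 0.02, 1.0, 0.5, 1.0
    x = 0.02  # start near the low (naive) state
    for i in range(int(t_end / dt)):
        t = i * dt
        # Transient stimulus: extra production only while the pulse lasts.
        stimulus = pulse_rate if t < pulse_duration else 0.0
        # Self-amplifying production term plus basal synthesis.
        production = basal + stimulus + k_max * x**2 / (K**2 + x**2)
        x += dt * (production - decay * x)
    return x


if __name__ == "__main__":
    # Individual molecules turn over quickly (half-life = ln 2 / decay),
    # but the state reached after a strong pulse persists long afterward.
    print("after weak pulse:  ", round(simulate(pulse_rate=0.01), 3))
    print("after strong pulse:", round(simulate(pulse_rate=0.5), 3))
```

With these assumed parameters, a weak pulse relaxes back to the low state while a strong pulse crosses the threshold and settles in the high state, which is the qualitative behavior the threshold-dependent switch in the text requires.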
SUMMARY
Molecular genetics has brought about a dramatic unification within the biological sciences. A major advancement in our understanding of genes, their expression, and the structure of the proteins they encode has led to a refined appreciation of the conservation of cellular function at the molecular level that now provides a common conceptual framework for several previously unrelated disciplines: cell biology, biochemistry, development, immunology, and neurobiology. A parallel and potentially more profound unification is occurring between cognitive psychology (the science of mind) and neural science (the science of the brain). The ability to study the biological basis of mental function is providing a new paradigm for examining cognitive processes such as perception, language, learning, and
memory. As we outlined in this chapter, one of the key unifying findings emerging from the molecular study of both implicit and explicit memory processes is the unexpected realization that these distinct forms of memory, which differ not only in the neural systems involved but also in the nature of the information stored, nevertheless may recruit the same restricted molecular logic for their long-term representation. Thus, whereas animals and humans are capable of a wide variety of learning processes that utilize a number of different second messenger and signaling cascades, they may share a common set of molecular mechanisms for the storage of long-term memory. In Aplysia, these mechanisms include a core sequence of three steps. First, the initiation step involves the PKA-mediated activation of CREB1 and the concomitant MAPK-mediated derepression of CREB2. Second, the consolidation step involves the induction by CREB1 of a set of immediate early genes such as the C-terminal ubiquitin hydrolase and the transcription factor ApC/EBP. Third, the stabilization step involves the down-regulation of apCAMs and the consequent remodeling of preexisting synapses and the growth of new synaptic connections. For synapse-specific forms of long-term facilitation, a retrograde signaling cascade generated locally at the synapse travels to the nucleus to activate the transcriptional machinery. Newly synthesized gene products, both mRNAs and proteins, are then delivered specifically to the synapses whose activation originally triggered the wave of gene expression (Figure 27.7). Since many studies in the vertebrate brain have now found that immediate early genes are induced in the hippocampus and certain regions of the neocortex by treatments that lead to LTP, it will be of particular interest to investigate whether genes induced by CREB, perhaps of the C/EBP family, are also required for long-term synaptic modifications in mammals.
That the late phase of mossy fiber, Schaffer collateral, and perforant pathway LTP involves cAMP raises the additional, attractive possibility that, in the hippocampus as well, cAMP and PKA are recruited because they may be able to access the signaling pathways and transcriptional machinery required for synaptic growth and the persistence of memory storage. Perhaps the most striking findings to emerge from the cellular and molecular studies of memory storage in Aplysia and the mammalian brain are that long-term memory involves both transcription in the nucleus and structural changes at the synapse. The structural changes associated with the storage of long-term memory can be grouped into two general categories: remodeling of preexisting synapses and growth of new synapses (Bailey & Kandel, 1993; Bailey et al., 2004; Greenough & Bailey, 1988; Lamprecht & LeDoux, 2004; Yuste & Bonhoeffer,
8/18/09 6:25:16 PM
Summary
CRE
CREB-2
CRE Early 6
Early
5
CREB-1 MAPK
5HT 1
Ubiquitin Hydrolase
cAMP
AC Tail
3
Nucleus TAAC Late
CAAT Late C/EBP
4
C/EBP+ AF AF
C/EBP
Persistent Kinase
Effectors for Synaptic Growth
G 2 PKA
545
7 apCAM
K+ Channel
8
9 Ca2+ Channel 10
11 AMPA
NMDA
NMDA
Note. Long-term synaptic plasticity contributing to learning and memory involves a sequence of cellular, molecular, and structural mechanisms including 1: neurotransmitter release and short-term strengthening of synaptic connections, 2: equilibrium between kinase and phosphatase activities at the synapse, 3: retrograde transport from the synapse to the nucleus, 4: activation of nuclear transcription factors, 5: activity-dependent induction of gene expression, 6: chromatin alteration and epigenetic changes in gene expression, 7: synaptic capture of newly synthesized gene products, 8: local protein synthesis at active synapses, 9: synaptic growth and the formation of new synapses, 10: activation of preexisting silent synapses, and 11: self-perpetuating mechanisms and the molecular basis of memory persistence. The location of these events, which may act in part to stabilize some of the changes that occur during short- and intermediate-term plasticity, moves from the synapse (1–2) to the nucleus (3–6) and then back to the synapse (7–11). Molecular details are discussed in the text.
2001). Despite an increasing body of evidence linking changes in the number or structure of synaptic connections to long-term memory, it has so far proven difficult to follow individual structural changes at the same synapse over time and to relate this remodeling directly to physiological function and memory storage. Studies have shown that activity-dependent remodeling of preexisting synapses and the growth of new synaptic connections occur in the mammalian CNS (Buchs & Muller, 1996; Colicos et al., 2001; De Paola, Arber, & Caroni, 2003; Engert & Bonhoeffer, 1999; Greenough & Bailey, 1988; Maletic-Savatic, Malinow, & Svoboda, 1999; Toni, Buchs, Nikonenko, Bron, & Muller, 1999). However, in the mammalian brain, these structural changes are difficult to study because the effects are often modest. Moreover, the specific role of this structural plasticity remains unclear because the functional contribution of individual synapses to memory processes in these more complex neuronal networks is not yet well defined (Hayashi & Majewska, 2005; Lamprecht & LeDoux, 2004; Segal, 2005). For example, although the generation and enlargement of dendritic spines have been associated with the production of LTP and synaptic activity in organotypic hippocampal slices (Matsuzaki, Honkura, Ellis-Davies, & Kasai, 2004; Nagerl, Eberhorn, Cambridge, & Bonhoeffer, 2004) and acute slices of neonatal animals (Zhou, Homma, & Poo, 2004), these structural changes are much more subtle in the adult brain (Lang et al., 2004). In adults, there is only a modest production of new spines (Zuo, Lin, Chang, & Gan, 2005), and learning-related plasticity seems to rely more on subcellular changes than on anatomical changes. Thus, neuronal activity
Figure 27.7 Mechanisms of long-term memory formation.
Synaptic and Cellular Basis of Learning
regulates the transport of polysomes from dendritic shafts to active spines (Ostroff et al., 2002), as well as the trafficking of neurotransmitter receptors (Malinow & Malenka, 2002). By contrast, in Aplysia the learning-induced structural changes that accompany long-term sensitization in vivo and long-term facilitation in vitro are robust, highly reproducible, and easy to study and can be shown to be both functionally effective and capable of contributing to memory storage. Time-lapse imaging studies of the sensory to motor neuron synapse in culture have revealed that LTF is accompanied by two temporally and morphologically distinct classes of presynaptic structural change: the rapid activation of silent preexisting varicosities by filling with synaptic vesicles and the slower growth of new functional varicosities. These findings, the first to be made on individually identified presynaptic varicosities, suggest that the duration of the changes in synaptic effectiveness that accompany memory storage may be reflected by the differential regulation of two fundamentally disparate forms of presynaptic compartment: (1) nascent (silent) varicosities that can be rapidly and reversibly remodeled into active transmitter release sites and (2) mature, more stable, and functionally competent varicosities that following long-term training may undergo a process of fission to form new stable synaptic contacts. The increasing morphological correspondence between the studies of long-term sensitization in Aplysia and LTP in the mammalian hippocampus indicates that learning may resemble a process of neuronal growth and differentiation across a broad segment of the animal kingdom and suggests that new synapse formation may be a highly conserved feature for the storage of both implicit and explicit forms of long-term memory. 
One of the unifying principles emerging from these studies is that despite the different ways by which each form of memory is induced, the subsequent steps required for conversion of their short-term memory to one of longer duration may be similar. This apparent similarity in some of the molecular steps may be because for both implicit and explicit memory storage the synaptic growth is likely to represent the final and self-sustaining change that stabilizes the long-term process. Despite this association, surprisingly little is known about the molecular mechanisms that underlie learning-related changes in the structure of the synapse. Recent studies of the synaptic growth that accompanies long-term memory in Aplysia have begun to characterize the sequence of molecular events responsible for both the initiation and persistence of the structural change. This in turn has revealed that specific molecules and mechanisms important for de novo synapse formation during the development of the nervous system can be reutilized in the adult for the purposes of synaptic plasticity and memory storage.
These studies indicate that long-term memory involves the flow of information from receptors on the cell surface to the genome, as seen in other processes of cell differentiation and growth. Such changes may reflect the recruitment by environmental stimuli of developmental processes that are latent or inhibited in the fully differentiated neuron. An increasing body of evidence suggests that the cell and molecular changes accompanying long-term memory storage share several features with the cascade of events that underlie synapse formation during neuronal development. In both cases, the structural change exhibits a requirement for new protein and mRNA synthesis. These alterations in transcriptional and translational state can be initiated in the long-term process by the repeated or prolonged exposure to modulatory transmitters that, in this respect, appear to mimic the effects of growth factors and hormones during the cell cycle and differentiation. Thus, modulatory transmitters important for learning and memory activate not only the cytoplasmic second-messenger cascades required for the short-term process but also a nuclear messenger system by which the transmitter can exert long-term regulation over the excitability and, ultimately, the architecture of the neuron through changes in gene expression. Studies in Aplysia have further demonstrated that the earliest stages of long-term memory formation are associated with modulation of an immunoglobulin-related cell adhesion molecule homologous to NCAM. With the emergence of the nervous system, the Aplysia NCAM becomes expressed exclusively in neurons and is specifically enriched at synapses. These cell adhesion molecules are maintained into adulthood, at which point they can be down-regulated by 5-HT, a modulatory transmitter important for both sensitization and classical conditioning in Aplysia, and by cAMP, a second messenger activated by 5-HT.
This down-regulation appears to serve as a preliminary and permissive step for the growth of synaptic connections that accompany the long-term process. Thus, a molecule used during development for cell adhesion and axon outgrowth is retained into adulthood, at which point it seems to restrain or inhibit growth until the molecule is rapidly and transiently decreased at the cell surface by a modulatory transmitter important for learning. The finding that 5HT leads to the rapid down-regulation of only one isoform of apCAM (the transmembrane isoform) and not the others (the GPI-linked isoforms) raises the interesting possibility that learning-related synaptic growth in the adult may be initiated by an activity-dependent recruitment of specific isoforms of adhesion molecules, similar to the modulation of cell-surface receptors during the fine-tuning of synaptic connections in the developing nervous system. One consequence of isoform recruitment is that it would allow
neuronal activity to regulate the surface expression of each isoform, a process that might take on additional functional significance if these surface molecules were distributed differentially along the three-dimensional extent of the neuron. These studies also suggest that processing and storage of information in the nervous system may rely on the same mechanisms utilized by other cells in the body to organize and regulate membrane trafficking important for growth. Findings in other invertebrate systems and the mammalian brain suggest that the modulation of cell adhesion molecules is important for long-lasting forms of both developmental and learning-related synaptic plasticity (Benson, Schnapp, Shapiro, & Huntley, 2000; Fields & Itoh, 1996; Martin & Kandel, 1996; Murase & Schuman, 1999; Washbourne et al., 2004). Indeed, a number of studies in vertebrates have now shown that at critical developmental stages, the refinement of synaptic connections, both their growth and regression, is determined by an activity-dependent process that seems related to LTP in the hippocampus (Antonini & Stryker, 1993; Constantine-Paton, Cline, & Debski, 1990; Goodman & Shatz, 1993). Finally, insights from the molecular studies of learning and memory in Aplysia suggest that the critical time window for new macromolecular synthesis that is a ubiquitous feature of long-term memory storage may be explained by a cascade of gene activation whereby one or more immediate-early genes control the transcription of late effector genes. The biological significance of an immediate-early gene-dependent response in long-term plasticity may reside in the necessity of a convergent checkpoint that turns on a genetic program similar to the cascade of gene activation during cell differentiation. As is the case for development, in long-term memory, a convergent checkpoint and cascade of gene activation may be critical to preserve important functions that ultimately rely on a small number of cells.
Critical time windows have been previously described in other contexts, especially as part of developmental processes. For example, establishment of the differentiated state in DNA viruses often requires a sequence of gene activation whereby early regulatory genes lead to the maintained expression of later effector genes. A similar time window is evident in the later stages of Drosophila development where the steroid hormone ecdysone induces growth and moulting by altering the expression of early genes that turn on the expression of later genes (Ashburner, 1990). The similarity between these critical periods and the one found in long-term memory suggests that aspects of the regulatory mechanisms underlying learning-related synaptic plasticity in the adult may eventually be understood in the context of the basic molecular program used to refine synaptic connections during the later stages of neuronal
development. Both processes appear to share a cascade of gene activation, with a critical time window during which the differentiated state is still labile and can be modified. That this feature is particularly well developed in neurons, which characteristically remain plastic throughout most of their life cycle and can grow and retract their synaptic connections on appropriate target cells in an activity-dependent fashion, may underlie the unique ability of neurons to respond to environmental stimuli that is essential for learning and memory storage.
REFERENCES
Abel, T., Martin, K. C., Bartsch, D., & Kandel, E. R. (1998, January 16). Memory suppressor genes: Inhibitory constraints on the storage of long-term memory. Science, 279, 338–341.
Abel, T., Nguyen, P. V., Barad, M., Deuel, T. A., Kandel, E. R., & Bourtchouladze, R. (1997). Genetic demonstration of a role for PKA in the late phase of LTP and in hippocampus-based long-term memory. Cell, 88, 615–626.
Ahmari, S. E., Buchanan, J., & Smith, S. J. (2000). Assembly of presynaptic active zones from cytoplasmic transport packets. Nature Neuroscience, 3, 445–451.
Alarcon, J. M., Barco, A., & Kandel, E. R. (2006). Capture of the late phase of long-term potentiation within and across the apical and basilar dendritic compartments of CA1 pyramidal neurons: Synaptic tagging is compartment restricted. Journal of Neuroscience, 26, 256–264.
Alarcon, J. M., Malleret, G., Touzani, K., Vronskaya, S., Ishii, S., Kandel, E. R., et al. (2004). Chromatin acetylation, memory, and LTP are impaired in CBP+/- mice: A model for the cognitive deficit in Rubinstein-Taybi syndrome and its amelioration. Neuron, 42, 947–959.
Albensi, B. C., & Mattson, M. P. (2000). Evidence for the involvement of TNF and NF-kappaB in hippocampal synaptic plasticity. Synapse, 35, 151–159.
Alberini, C. M., Ghirardi, M., Metz, R., & Kandel, E. R. (1994). C/EBP is an immediate-early gene required for the consolidation of long-term facilitation in Aplysia. Cell, 76, 1099–1114.
Antonini, A., & Stryker, M. P. (1993, June 18). Rapid remodeling of axonal arbors in the visual cortex. Science, 260, 1819–1821.
Ashburner, M. (1990). Puffs, genes, and hormones revisited. Cell, 61, 1–3.
Bacskai, B. J., Hochner, B., Mahaut-Smith, M., Adams, S. R., Kaang, B. K., Kandel, E. R., et al. (1993, April 9). Spatially resolved dynamics of cAMP and protein kinase A subunits in Aplysia sensory neurons. Science, 260, 222–226.
Bailey, C. H., Bartsch, D., & Kandel, E. R. (1996). Toward a molecular definition of long-term memory storage. Proceedings of the National Academy of Sciences, USA, 93, 13445–13452.
Bailey, C. H., & Chen, M. (1983, April 1). Morphological basis of long-term habituation and sensitization in Aplysia. Science, 220, 91–93.
Bailey, C. H., & Chen, M. (1988a). Long-term memory in Aplysia modulates the total number of varicosities of single identified sensory neurons. Proceedings of the National Academy of Sciences, USA, 85, 2373–2377.
Bailey, C. H., & Chen, M. (1988b). Long-term sensitization in Aplysia increases the number of presynaptic contacts onto the identified gill motor neuron L7. Proceedings of the National Academy of Sciences, USA, 85, 9356–9359.
Bailey, C. H., & Chen, M. (1988c). Morphological basis of short-term habituation in Aplysia. Journal of Neuroscience, 8, 2452–2459.
Bailey, C. H., & Chen, M. (1989). Time course of structural changes at identified sensory neuron synapses during long-term sensitization in Aplysia. Journal of Neuroscience, 9, 1774–1780.
Bailey, C. H., Chen, M., Keller, F., & Kandel, E. R. (1992). Serotonin-mediated endocytosis of apCAM: An early step of learning-related synaptic growth in Aplysia. Science, 256, 645–649.
Bailey, C. H., Kaang, B. K., Chen, M., Marin, C., Lim, C. S., Casadio, A., et al. (1997). Mutation in the phosphorylation sites of MAP kinase blocks learning-related internalization of apCAM in Aplysia sensory neurons. Neuron, 18, 913–924.
Bailey, C. H., & Kandel, E. R. (1993). Structural changes accompanying memory storage. Annual Review of Physiology, 55, 397–426.
Bailey, C. H., Kandel, E. R., & Si, K. (2004). The persistence of long-term memory: A molecular approach to self-sustaining changes in learning-induced synaptic growth. Neuron, 44, 49–57.
Bailey, C. H., Montarolo, P. G., Chen, M., Kandel, E. R., & Schacher, S. (1992). Inhibitors of protein and RNA synthesis block the structural changes that accompany long-term heterosynaptic plasticity in the sensory neurons of Aplysia. Neuron, 9, 749–758.
Bailey, C. H., Thompson, E. B., Castellucci, V. F., & Kandel, E. R. (1979). Ultrastructure of the synapses of sensory neurons that mediate the gill-withdrawal reflex in Aplysia. Journal of Neurocytology, 8, 415–444.
Barco, A., Alarcon, J. M., & Kandel, E. R. (2002). Expression of constitutively active CREB protein facilitates the late phase of long-term potentiation by enhancing synaptic capture. Cell, 108, 689–703.
Barco, A., Pittenger, C., & Kandel, E. R. (2003). CREB, memory enhancement and the treatment of memory disorders: Promises, pitfalls and prospects. Expert Opinion on Therapeutic Targets, 7, 101–114.
Bartsch, D., Casadio, A., Karl, K. A., Serodio, P., & Kandel, E. R. (1998). CREB1 encodes a nuclear activator, a repressor, and a cytoplasmic modulator that form a regulatory unit critical for long-term facilitation. Cell, 95, 211–223.
Bartsch, D., Ghirardi, M., Casadio, A., Giustetto, M., Karl, K. A., Zhu, H., et al. (2000). Enhancement of memory-related long-term facilitation by ApAF, a novel transcription factor that acts downstream from both CREB1 and CREB2. Cell, 103, 595–608.
Bartsch, D., Ghirardi, M., Skehel, P. A., Karl, K. A., Herder, S. P., Chen, A., et al. (1995). Aplysia CREB2 represses long-term facilitation: Relief of repression converts transient facilitation into long-term functional and structural changes. Cell, 83, 979–992.
Barzilai, A., Kennedy, T. E., Sweatt, J. D., & Kandel, E. R. (1989). 5-HT modulates protein synthesis and the expression of specific proteins during long-term facilitation in Aplysia sensory neurons. Neuron, 2, 1577–1586.
Benfenati, F., Onofri, F., & Giovedi, S. (1999). Protein-protein interactions and protein modules in the control of neurotransmitter release. Philosophical Transactions of the Royal Society of London: Biological Sciences, 354, 243–257.
Benson, D. L., Schnapp, L. M., Shapiro, L., & Huntley, G. W. (2000). Making memories stick: Cell adhesive molecules in synaptic plasticity. Trends in Cell Biology, 10, 473–480.
Bliss, T. V., Collingridge, G. L., & Morris, R. G. (2003). Introduction: Long-term potentiation and structure of the issue. Philosophical Transactions of the Royal Society of London: Biological Sciences, 358, 607–611.
Bourtchouladze, R., Lidge, R., Catapano, R., Stanley, J., Gossweiler, S., Romashko, D., et al. (2003). A mouse model of Rubinstein-Taybi syndrome: Defective long-term memory is ameliorated by inhibitors of phosphodiesterase 4. Proceedings of the National Academy of Sciences, USA, 100, 10518–10522.
Bradke, F., & Dotti, C. G. (1999, March 19). The role of local actin instability in axon assembly. Science, 283, 1931–1934.
Bradshaw, K. D., Emptage, N. J., & Bliss, T. V. (2003). A role for dendritic protein synthesis in hippocampal late LTP. European Journal of Neuroscience, 18, 3150–3152.
Brittis, P. A., Lu, Q., & Flanagan, J. G. (2002). Axonal protein synthesis provides a mechanism for localized regulation at an intermediate target. Cell, 110, 223–235.
Brunelli, M., Castellucci, V., & Kandel, E. R. (1976, December 10). Synaptic facilitation and behavioral sensitization in Aplysia: Possible role of serotonin and cyclic AMP. Science, 194, 1178–1181.
Buchs, P. A., & Muller, D. (1996). Induction of long-term potentiation is associated with major ultrastructural changes of activated synapses. Proceedings of the National Academy of Sciences, USA, 93, 8040–8045.
Burgin, K. E., Waxham, M. N., Rickling, S., Westgate, S. A., Mobley, W. C., & Kelly, P. T. (1990). In situ hybridization histochemistry of Ca2+/calmodulin-dependent protein kinase in developing rat brain. Journal of Neuroscience, 10, 1788–1798.
Byers, D., Davis, R. L., & Kiger, J. A., Jr. (1981, January 1). Defect in cyclic AMP phosphodiesterase due to the dunce mutation of learning in Drosophila melanogaster. Nature, 289, 79–81.
Cammalleri, M., Lutjens, R., Berton, F., King, A. R., Simpson, C., Francesconi, W., et al. (2003). Time-restricted role for dendritic activation of the mTOR-p70S6K pathway in the induction of late-phase long-term potentiation in the CA1. Proceedings of the National Academy of Sciences, USA, 100, 14368–14373.
Casadio, A., Martin, K. C., Giustetto, M., Zhu, H., Chen, M., Bartsch, D., et al. (1999). A transient, neuron-wide form of CREB-mediated long-term facilitation can be stabilized at specific synapses by local protein synthesis. Cell, 99, 221–237.
Castellucci, V. F., Blumenfeld, H., Goelet, P., & Kandel, E. R. (1989, April 1). Inhibitor of protein synthesis blocks long-term behavioral sensitization in the isolated gill-withdrawal reflex of Aplysia. Science, 220, 91–93.
Colicos, M. A., Collins, B. E., Sailor, M. J., & Goda, Y. (2001). Remodeling of synaptic actin induced by photoconductive stimulation. Cell, 107, 605–616.
Conrad, P., Wu, F., & Schacher, S. (1999). Changes in functional glutamate receptors on a postsynaptic neuron accompany formation and maturation of an identified synapse. Journal of Neurobiology, 39, 237–248.
Constantine-Paton, M., Cline, H. T., & Debski, E. (1990). Patterned activity, synaptic convergence, and the NMDA receptor in developing visual pathways. Annual Review of Neuroscience, 13, 129–154.
Cracco, J. B., Serrano, P., Moskowitz, S. I., Bergold, P. J., & Sacktor, T. C. (2005). Protein synthesis-dependent LTP in isolated dendrites of CA1 pyramidal cells. Hippocampus, 15, 551–556.
Crick, F. (1984, November 8). Memory and molecular turnover. Nature, 312, 101.
Dale, N., & Kandel, E. R. (1993). L-glutamate may be the fast excitatory transmitter of Aplysia sensory neurons. Proceedings of the National Academy of Sciences, USA, 90, 7163–7167.
Dale, N., Kandel, E. R., & Schacher, S. (1987). Serotonin produces long-term changes in the excitability of Aplysia sensory neurons in culture that depend on new protein synthesis. Journal of Neuroscience, 7, 2232–2238.
Dash, P. K., Hochner, B., & Kandel, E. R. (1990, June 21). Injection of the cAMP-responsive element into the nucleus of Aplysia sensory neurons blocks long-term facilitation. Nature, 345, 718–721.
Davis, R. L. (1996). Physiology and biochemistry of Drosophila learning mutants. Physiological Reviews, 76, 299–317.
Davis, R. L., Cherry, J., Dauwalder, B., Han, P. L., & Skoulakis, E. (1995). The cyclic AMP system and Drosophila learning. Molecular and Cellular Biochemistry, 149–150, 271–278.
De Paola, V., Arber, S., & Caroni, P. (2003). AMPA receptors regulate dynamic equilibrium of presynaptic terminals in mature hippocampal networks. Nature Neuroscience, 6, 491–500.
Drain, P., Folkers, E., & Quinn, W. G. (1991). cAMP-dependent protein kinase and the disruption of learning in transgenic flies. Neuron, 6, 71–82.
Dudai, Y., Jan, Y. N., Byers, D., Quinn, W. G., & Benzer, S. (1976). Dunce, a mutant of Drosophila deficient in learning. Proceedings of the National Academy of Sciences, USA, 73, 1684–1688.
Dudek, S. M., & Fields, R. D. (2002). Somatic action potentials are sufficient for late-phase LTP-related cell signaling. Proceedings of the National Academy of Sciences, USA, 99, 3962–3967.
Duerr, J. S., & Quinn, W. G. (1982). Three Drosophila mutations that block associative learning also affect habituation and sensitization. Proceedings of the National Academy of Sciences, USA, 79, 3646–3650.
Engert, F., & Bonhoeffer, T. (1999, May 6). Dendritic spine changes associated with hippocampal long-term synaptic plasticity. Nature, 399, 66–70.
English, J. D., & Sweatt, J. D. (1996). Activation of p42 mitogen-activated protein kinase in hippocampal long-term potentiation. Journal of Biological Chemistry, 271, 24329–24332.
English, J. D., & Sweatt, J. D. (1997). A requirement for the mitogen-activated protein kinase cascade in hippocampal long-term potentiation. Journal of Biological Chemistry, 272, 19103–19106.
Fields, R. D., & Itoh, K. (1996). Neural cell adhesion molecules in activity-dependent development and synaptic plasticity. Trends in Neurosciences, 19, 473–480.
Frey, U., Huang, Y.-Y., & Kandel, E. R. (1993, June 11). Effects of cAMP simulate a late stage of LTP in hippocampal CA1 neurons. Science, 260, 1661–1664.
Frey, U., & Morris, R. G. (1997, February 6). Synaptic tagging and long-term potentiation. Nature, 385, 533–536.
Frey, U., & Morris, R. G. (1998). Weak before strong: Dissociating synaptic tagging and plasticity-factor accounts of late-LTP. Neuropharmacology, 37, 545–552.
Frost, W. N., Castellucci, V. F., Hawkins, R. D., & Kandel, E. R. (1985). Monosynaptic connections made by the sensory neurons of the gill- and siphon-withdrawal reflex in Aplysia participate in the storage of long-term memory for sensitization. Proceedings of the National Academy of Sciences, USA, 82, 8266–8269.
Genoux, D., Haditsch, U., Knobloch, M., Michalon, A., Storm, D., & Mansuy, I. M. (2002, August 29). Protein phosphatase 1 is a molecular constraint on learning and memory. Nature, 418, 970–975.
Ghirardi, M., Montarolo, P. G., & Kandel, E. R. (1995). A novel intermediate stage in the transition between short- and long-term facilitation in the sensory to motor neuron synapse of Aplysia. Neuron, 14, 413–420.
Glanzman, D. L., Kandel, E. R., & Schacher, S. (1990, August 17). Target-dependent structural changes accompanying long-term synaptic facilitation in Aplysia neurons. Science, 249, 779–802.
Goodman, C. S., & Shatz, C. J. (1993). Developmental mechanisms that generate precise patterns of neuronal connectivity. Cell, 72, 77–98.
Greenough, W. T., & Bailey, C. H. (1988). The anatomy of a memory: Convergence of results across a diversity of tests. Trends in Neurosciences, 11, 142–147.
Guan, Z., Giustetto, M., Lomvardas, S., Kim, J. H., Miniaci, M. C., Schwartz, J. H., et al. (2002). Integration of long-term-memory-related synaptic plasticity involves bidirectional regulation of gene expression and chromatin structure. Cell, 111, 483–493.
Hai, T. W., Liu, F., Coukos, W. J., & Green, M. K. (1989). Transcription factor ATF cDNA clones: An extensive family of leucine zipper proteins able to selectively form DNA-binding heterodimers [published erratum appears in Genes and Development, 1990, April 4(4), 682]. Genes and Development, 3, 2083–2090.
Hall, A. (1998, January 23). Rho GTPases and the actin cytoskeleton. Science, 279, 509–514.
Han, J. H., Lim, Y. S., Kandel, E. R., & Kaang, B. K. (2004). Role of Aplysia cell adhesion molecules during 5-HT-induced long-term functional and structural changes. Learning and Memory, 11, 421–435.
Hatada, Y., Wu, F., Sun, Z. Y., Schacher, S., & Goldberg, D. J. (2000). Presynaptic morphological changes associated with long-term synaptic facilitation are triggered by actin polymerization at preexisting varicosities. Journal of Neuroscience, 20, RC82.
Hayashi, Y., & Majewska, A. K. (2005). Dendritic spine geometry: Functional implication and regulation. Neuron, 46, 529–532.
Hayashi, Y., Shi, S. H., Esteban, J. A., Piccini, A., Poncer, J. C., & Malinow, R. (2000, March 24). Driving AMPA receptors into synapses by LTP and CaMKII: Requirement for GluR1 and PDZ domain interaction. Science, 287, 2262–2267.
Hegde, A. N., Inokuchi, K., Pei, W., Casadio, A., Ghirardi, M., Chain, D. G., et al. (1997). Ubiquitin C-terminal hydrolase is an immediate-early gene essential for long-term facilitation in Aplysia. Cell, 89, 115–126.
Hsieh, J., & Gage, F. H. (2005). Chromatin remodeling in neural development and plasticity. Current Opinion in Cell Biology, 17, 664–671.
Hu, Y., Barzilai, A., Chen, M., Bailey, C. H., & Kandel, E. R. (1993). 5HT and cAMP induce the formation of coated pits and vesicles and increase the expression of clathrin light chain in sensory neurons of Aplysia. Neuron, 10, 921–929.
Huang, Y. S., Jung, M. Y., Sarkissian, M., & Richter, J. D. (2002). N-methyl-D-aspartate receptor signaling results in Aurora kinase-catalyzed CPEB phosphorylation and alpha CaMKII mRNA polyadenylation at synapses. EMBO Journal, 21, 2139–2148.
Huang, Y.-Y., Li, X. C., & Kandel, E. R. (1994). cAMP contributes to mossy fiber LTP by initiating both a covalently mediated early phase and macromolecular synthesis-dependent late phase. Cell, 79, 69–79.
Humeau, Y., Doussau, F., Vitiello, F., Greengard, P., Benfenati, F., & Poulain, B. (2001). Synapsin controls both reserve and releasable synaptic vesicle pools during neuronal activity and short-term plasticity in Aplysia. Journal of Neuroscience, 21, 4195–4206.
Huntley, G. W., Benson, D. L., & Colman, D. R. (2002). Structural remodeling of the synapse in response to physiological activity. Cell, 108, 1–4.
Izquierdo, I., & Cammarota, M. (2004, May 7). Neuroscience: Zif and the survival of memory. Science, 304, 829–830.
Kandel, E. R. (2001, November 2). The molecular biology of memory storage: A dialogue between genes and synapses. Science, 294, 1030–1038.
Karpinski, B. A., Morle, G. D., Huggenvik, J., Uhler, M. D., & Leiden, J. M. (1992). Molecular cloning of human CREB-2: An ATF/CREB transcription factor that can negatively regulate transcription from the cAMP response element. Proceedings of the National Academy of Sciences, USA, 89, 4820–4824.
Keller, Y., & Schacher, S. (1990). Neuron-specific membrane glycoproteins promoting neurite fasciculation in Aplysia californica. Journal of Cell Biology, 111, 2637–2650.
Kim, J.-H., Udo, H., Li, H.-L., Youn, T. Y., Chen, M., Kandel, E. R., et al. (2003). Presynaptic activation of silent synapses and growth of new synapses contribute to intermediate and long-term facilitation in Aplysia. Neuron, 40, 151–165.
Klein, M., & Kandel, E. R. (1980). Mechanism of calcium current modulation underlying presynaptic facilitation and behavioral sensitization in Aplysia. Proceedings of the National Academy of Sciences, USA, 77, 6912–6916.
Korzus, E., Rosenfeld, M. G., & Mayford, M. (2004). CBP histone acetyltransferase activity is a critical component of memory consolidation. Neuron, 42, 961–972.
Krucker, T., Siggins, G. R., & Halpain, S. (2000). Dynamic actin filaments are required for stable long-term potentiation (LTP) in area CA1 of the
hippocampus. Proceedings of the National Academy of Sciences, USA, 97, 6856–6861.
Lamprecht, R., & LeDoux, J. (2004). Structural plasticity and memory. Nature Reviews Neuroscience, 5, 45–54.
Lang, C., Barco, A., Zablow, L., Kandel, E. R., Siegelbaum, S. A., & Zakharenko, S. S. (2004). Transient expansion of synaptically connected dendritic spines upon induction of hippocampal long-term potentiation. Proceedings of the National Academy of Sciences, USA, 101, 16665–16670.
Lee, H. K., Barbarosie, M., Kameyama, K., Bear, M. F., & Huganir, R. L. (2000, June 22). Regulation of distinct AMPA receptor phosphorylation sites during bidirectional synaptic plasticity. Nature, 405, 955–959.
Lee, S.-H., Lim, C.-S., Park, H., Lee, J.-A., Han, J.-H., Kim, H., et al. (2007). Nuclear translocation of CAM-associated protein activates transcription for long-term facilitation in Aplysia. Cell, 129, 801–812.
Levenson, J. M., & Sweatt, J. D. (2005). Epigenetic mechanisms in memory formation. Nature Reviews Neuroscience, 6, 108–118.
Li, Z., Aizenman, C. D., & Cline, H. T. (2002). Regulation of Rho GTPases by crosstalk and neuronal activity in vivo. Neuron, 33, 741–750.
Lisman, J. E., & Zhabotinsky, A. M. (2001). A model of synaptic memory: A CaMKII/PP1 switch that potentiates transmission by organizing an AMPA receptor anchoring assembly. Neuron, 31, 191–201.
Lonze, B. E., & Ginty, D. D. (2002). Function and regulation of CREB family transcription factors in the nervous system. Neuron, 35, 605–623.
Maletic-Savatic, M., Malinow, R., & Svoboda, K. (1999, March 19). Rapid dendritic morphogenesis in CA1 hippocampal dendrites induced by synaptic activity. Science, 283, 1923–1927.
Malinow, R., Mainen, Z. F., & Hayashi, Y. (2000). LTP mechanisms: From silence to four-lane traffic. Current Opinion in Neurobiology, 10, 352–357.
Malinow, R., & Malenka, R. C. (2002). AMPA receptor trafficking and synaptic plasticity. Annual Review of Neuroscience, 25, 103–126.
Malleret, G., Haditsch, U., Genoux, D., Jones, M. W., Bliss, T. V., Vanhoose, A. M., et al. (2001). Inducible and reversible enhancement of learning, memory, and long-term potentiation by genetic inhibition of calcineurin. Cell, 104, 675–686.
Mansuy, I. M., Mayford, M., Jacob, B., Kandel, E. R., & Bach, M. E. (1998). Restricted and regulated overexpression reveals calcineurin as a key component in the transition from short-term to long-term memory. Cell, 92, 39–49.
Martin, K. C., Casadio, A., Zhu, H., Yaping, E., Rose, J., Chen, M., et al. (1997). Synapse-specific long-term facilitation of Aplysia sensory to motor synapses: A function for local protein synthesis in memory storage. Cell, 91, 927–938.
Martin, K. C., & Kandel, E. R. (1996). Cell adhesion molecules, CREB and the formation of new synaptic connections during development and learning. Neuron, 17, 567–570.
Martin, K. C., Michael, D., Rose, J. C., Barad, M., Casadio, A., Zhu, H., et al. (1997). MAP kinase translocates into the nucleus of the presynaptic cell and is required for long-term facilitation in Aplysia. Neuron, 18, 899–912.
Matsuzaki, M., Honkura, N., Ellis-Davies, G. C. R., & Kasai, H. (2004, June 17). Structural basis of long-term potentiation in single dendritic spines. Nature, 429, 761–766.
Mayr, B., & Montminy, M. (2001). Transcriptional regulation by the phosphorylation-dependent factor CREB. Nature Reviews Molecular Cell Biology, 2, 599–609. Michael, D., & Martin, K. C. (1998). Repeated pulses of serotonin required for long-term facilitation activate mitogen-activated protein kinase in sensory neurons of Aplysia. Proceedings of the National Academy of Sciences, USA, 95, 1864–1869. Miesenbock, G., De Angelis, D. A., & Rothman, J. E. (1998, July 9). Visualizing secretion and synaptic transmission with pH-sensitive green fluorescent proteins. Nature, 394, 192–195. Milner, B. (1985). Memory and the human brain. In M. Shafto (Ed.), How we know (pp. 1864–1869). San Francisco: Harper & Rowe. Moccia, R., Chen, D., Lyles, V., Kapuya, E. E. Y., Kalachikov, S., Spahn, C. M., et al. (2003). An unbiased cDNA library prepared from isolated: Aplysia sensory neuron processes is enriched for cytoskeletal and translational mRNAs. Journal of Neuroscience, 23, 9409–9417. Montarolo, P. G., Goelet, P., Castellucci, V. F., Morgan, J., Kandel, E. R., & Schacher, S. (1986, December 5). A critical period for macromolecular synthesis in long-term heterosynaptic facilitation in Aplysia. Science, 234, 1249–1254. Moroz, L. L., Edwards, J. R., Puthanveettil, S. V., Kohn, A. B., Ha, T., Heyland, A., et al. (2006). Neuronal transcriptome of Aplysia: Neuronal compartments and circuitry. Cell, 127, 1453–1467. Murase, S., & Schuman, E. M. (1999). The role of cell adhesion molecules in synaptic plasticity and memory. Current Opinions in Cell Biology, 11, 549–553. Nagerl, U. V., Eberhorn, N., Cambridge, S. B., & Bonhoeffer, T. (2004). Bidirectional activity-dependent morphological plasticity in hippocampal neurons. Neuron, 44, 759–767. Nakayama, A. Y., Harms, M. B., & Luo, L. (2000). Small GTPases Rac and Rho in maintenance of dendritic spines and branches in hippocampal pyramidal neurons. Journal of Neuroscience, 15, 5329–5338. Nazif, F. A., Byrne, J. H., & Cleary, L. J. (1991). 
CAMP induces long-term morphological changes in sensory neurons of Aplysia. Brain Research, 539, 324–327. Ostroff, L. E., Fiala, J. C., Allwardt, B., & Harris, K. M. (2002). Polyribosomes redistribute from dendritic shafts into spines with enlarged synapses during LTP in developing rat hippocampal slices. Neuron, 35, 535–545. Ouyang, Y., Rosenstein, A., Kreiman, G., Schuman, E. M., & Kennedy, M. B. (1999). Tetanic stimulation leads to increased accumulation of Ca(2+)/ calmodulin-dependent protein kinase II via dendritic protein synthesis in hippocampal neurons. Journal of Neuroscience, 19, 7823–7833. Perazzona, B., Isabel, G., Preat, T., & Davis, R. L. (2004). The role of cAMP response element-binding protein in Drosophila long-term memory. Journal of Neuroscience, 24, 8823–8828. Polster, M. R., Nadel, L., & Schachter, D. L. (1991). Cognitive neuroscience: An analysis of memory: A historical perspective. Journal of Cognitive Neuroscience, 3, 95–116. Purcell, A. L., Sharma, S. K., Bagnall, M. W., Sutton, M. A., & Carew, T. J. (2003). Activation of a tyrosine kinase-MAPK cascade enhances the induction of long-term synaptic facilitation and long-term memory in Aplysia. Neuron, 37, 473–484.
Matus, A. (2000, October 27). Actin-based plasticity in dendritic spines. Science, 290, 754–758.
Puthanveettil, S. V., Monje, F.J., Miniaci, M.C., Choi, Y.B., Karl, K.A., Khandros, E., et al. (2008). A new component in synaptic plasticity: Upregulation of kinesin in the neurons of the gill-withdrawal reflex. Cell, 135, 960–973.
Mauelshagen, J., Parker, G. R., & Carew, T. J. (1996). Dynamics of induction and expression of long-term synaptic facilitation in Aplysia. Journal of Neuroscience, 16, 7099–7108.
Ramanan, N., Shen, Y., Sarsfield, S., Lemberger, T., Schutz, G., Linden, D. J., et al. (2005). SRF mediates activity-induced gene expression and synaptic plasticity but not neuronal viability. Nature Neuroscience, 8, 759–767.
Mayford, M., Barzilai, A., Keller, F., Schacher, S., & Kandel, E. R. (1992, May 1). Modulation of an NCAM-related adhesion molecule with long-term synaptic plasticity in Aplysia. Science, 256, 638–644.
Scholz, K. P., & Byrne, J. H. (1987, February 6). Long-term sensitization in Aplysia: Biophysical correlates in tail sensory neurons. Science, 235, 685–687.
c27.indd Sec10:550
8/18/09 6:25:18 PM
References 551 Schwartz, H., Castellucci, V. F., & Kandel, E. R. (1971). Functions of identified neurons and synapses in abdominal ganglion of Aplysia in absence of protein synthesis. Journal of Neurophysiology, 34, 9639–9653.
Toni, N., Buchs, P. A., Nikonenko, I., Bron, C. R., & Muller, D. (1999, November 25). LTP promotes formation of multiple spine synapses between a single axon terminal and a dendrite. Nature, 402, 421–425.
Segal, M. (2005). Dendritic spines and long-term plasticity. Nature Reviews Neuroscience, 6, 277–284.
Trudeau, L. E., & Castellucci, V. F. (1993). Excitatory amino acid neurotransmission of sensory-motor and interneuronal synapses of Aplysia Californica. Journal of Neurophysiology, 70, 1221–1230.
Sharma, S. K., Bagnall, M. W., Sutton, M. A., & Carew, T. J. (2003). Inhibition of calcineurin facilitates the induction of memory for sensitization in Aplysia: Requirement of mitogen-activated protein kinase. Proceedings of the National Academy of Sciences, USA, 100, 4861–4866. Si, K., Giustetto, M., Etkin, A., Hsu, R., Janisiewicz, A. M., Miniaci, M. C., et al. (2003). A neuronal isoform of CPEB regulates local protein synthesis and stabilizes synapse-specific long-term facilitation in aplysia. Cell, 115, 893–904. Si, K., Lindquist, S., & Kandel, E. R. (2003). A neuronal isoform of the aplysia CPEB has prion-like properties. Cell, 115, 879–891. Sin, W. C., Haas, K., Ruthhazer, E. S., & Cline, H. T. (2002, October 3). Dendrite growth increased by visual activity requires NMDA receptor and Rho GTPases. Nature, 419, 475–480. Squire, L. R. (1992). Memory and the hippocampus: A synthesis from findings with rats, monkeys, and humans. Psychological Review, 99, 195–231. Squire, L. R., & Zola-Morgan, S. (1991, September 20). The medial temporal lobe memory system. Science, 253, 1380–1386. Steward, O., & Schuman, E. M. (2001). Protein synthesis at synaptic sites on dendrites. Annual Review of Neuroscience, 24, 299–325. Steward, O., & Schuman, E. M. (2003). Compartmentalized synthesis and degradation of proteins in neurons. Neuron, 40, 347–359. Sutton, M. A., Masters, S. E., Bagnall, M. W., & Carew, T. J. (2001). Molecular mechanisms underlying a unique intermediate phase of memory in Aplysia. Neuron, 31, 143–154. Sutton, M. A., & Schuman, E. M. (2005). Local translational control in dendrites and its role in long-term synaptic plasticity. Journal of Neurobiology, 64, 116–131. Theis, M., Si, K., & Kandel, E. R. (2003). Two previously undescribed members of the mouse CPEB family of genes and their inducible expression in the principal cell layers of the hippocampus. Proceedings of the National Academy of Sciences, USA, 100, 9602–9607.
c27.indd Sec10:551
Udo, H., Jin, I., Kim, J.-H., Li, H.-L., Youn, T., Hawkins, R. D., et al. (2005). Serotonin-induced regulation of the actin network for learningrelated synaptic growth requires CdC42, N-WASP and PAK in Aplysia sensory neurons. Neuron, 45, 887–901. Waddell, S., & Quinn, W. G. (2001). Flies, genes, and learning. Annual Review of Neuroscience, 24, 1283–1309. Washbourne, P., Dityatev, A., Scheiffele, P., Biederer, T., Weiner, J. A., Christopherson, K. S., et al. (2004). Cell adhesion molecules in synapse formation. Journal of Neuroscience, 24, 9244–9249. Winder, D. G., Mansuy, I. M., Osman, M., Moallem, T. M., & Kandel, E. R. (1998). Genetic and pharmacological evidence for a novel, intermediate phase of long-term potentiation suppressed by calcineurin. Cell, 92, 25–37. Wood, M. A., Kaplan, M. P., Park, A., Blanchard, E. J., Oliveira, A. M., Lombardi, T. L., et al. (2005). Transgenic mice expressing a truncated form of CREB-binding protein (CBP) exhibit deficits in hippocampal synaptic plasticity and memory storage. Learning and Memory, 12, 111–119. Yeh, S. H., Lin, C. H., & Gean, P. W. (2004). Acetylation of nuclear factorkappaB in rat amygdala improves long-term but not short-term retention of fear memory. Molecular Pharmacology, 65, 1286–1292. Yin, J. C., Wallach, J. S., Del Vecchio, M., Wilder, E. L., Zhou, H., Quinn, W. G., et al. (1994). Induction of a dominant negative CREB transgene specifically blocks long-term memory in Drosophila. Cell, 79, 49–58. Yuan, X. B., Jin, M., Xu, X., Song, Y. Q., Wu, C. P., Poo, M. M. (2003). Signaling and crosstalk of GTPases in mediating axon guidance. Nature Cell Biology, 5, 38–45. Yuste, R., & Bonhoeffer, T. (2001). Morphological changes in dendrititic spines associated with long-term synaptic plasticity. Annual Review of Neuroscience, 24, 1071–108. Zhang, W., & Benson, D. L. (2001). Stages of synapse development defined by dependence on F-actin. Journal of Neuroscience, 21, 5169–5181.
Threadgill, R., Bobb, K., & Ghosh, A. (1997). Regulation of dendritic growth and remodeling by Rho, Rac and Cdc42. Neuron, 19, 625–634.
Zhou, Q., Homma, K. J., & Poo, M. M. (2004). Shrinkage of dendritic spines associated with long-term depression of hippocampal synapses. Neuron, 44, 749–757.
Tischmeyer, W., & Grimm, R. (1999). Activation of immediate early genes and memory formation. Cellular and Molecular Life Sciences, 55, 564–574.
Zuo, Y., Lin, A., Chang, P., & Gan, W. B. (2005). Development of longterm dendritic spine stability in diverse regions of cerebral cortex. Neuron, 46, 181–189.
8/18/09 6:25:19 PM
Chapter 28
Memory HOWARD EICHENBAUM
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
The understanding of memory is one of the major objectives of cognitive and neuroscience research. Behavioral and neurobiological studies extending over the past 100 years have revealed that there are multiple types of memory and that different forms of memory are supported by distinct brain systems (Eichenbaum & Cohen, 2001). Most prominent among these is the system for declarative memory, our capacity to store and bring to consciousness everyday facts and experiences. An essential brain substrate of declarative memory was identified nearly 50 years ago in the case study of H. M., a man who became amnesic following removal of the medial temporal lobe to alleviate his epileptic seizures (Corkin, 1984; Scoville & Milner, 1957). H. M. was severely impaired in declarative memory, whereas his perceptual and cognitive abilities were intact, as were his capacities for other forms of memory, including short-term and working memory, and perceptual and motor skill learning. Furthermore, the deficit in the acquisition of new declarative memories was accompanied by a temporally graded retrograde amnesia, such that H. M. could recall information obtained remotely in life but was impaired in recalling events that occurred shortly before the onset of amnesia. These observations suggest that the memory processing mediated by the medial temporal lobe begins during learning and continues to contribute to the consolidation of declarative memories over a prolonged period. Subsequent neuropsychological analyses of amnesic patients and functional imaging studies of normal humans elaborated the domain of capacities that are dependent on the medial temporal region and, in particular, the hippocampus (Eichenbaum & Cohen, 2001; Squire, Stark, & Clark, 2004).
These studies have emphasized the critical role of the hippocampus in two components of declarative memory: (1) The hippocampus plays a critical role in episodic memory, our capacity for recollection of unique personal experiences, and (2) the hippocampus is involved in particular aspects of the acquisition of semantic or factual knowledge. These studies, plus detailed characterizations of spared and impaired memory capacities in animals with selective hippocampal damage and recordings of hippocampal neuronal activity in behaving animals and humans, have begun to reveal how the hippocampus mediates episodic memory. In this chapter, I outline some of the defining features of episodic memory and focus on aspects of its anatomical and physiological basis in the functional organization of the medial temporal lobe.
Episodic memory is supported by a large network of brain areas, including prominently widespread neocortical areas that contribute to episodic memory by virtue of various aspects of cognitive and perceptual processing. The involved cortical areas include the prefrontal cortex and other areas that mediate working memory, effortful retrieval, source monitoring, and other cognitive processing functions that are essential to recollection (e.g., Aggleton & Brown, 2006; Henson, Rugg, Shallice, Josephs, & Dolan, 1999; Yonelinas, Otten, Shaw, & Rugg, 2005; see Chapters 29 and 30). Also, areas of the parietal and temporal cortex are involved in complex perceptual processing essential to configuration of the conceptual contents of information that is the subject of recollection (e.g., Uncapher, Otten, & Rugg, 2006). Projections from these areas strongly converge onto the medial temporal lobe, which also sends strong projections back to these cortical areas, suggesting a central role in organizing or extending the persistence of cortical representations. Outputs of these areas converge on, and are also the primary output targets of, the medial temporal lobe, and in particular, the hippocampus (Eichenbaum, 2000). The medial temporal lobe is special in this organization because, unlike neocortical areas, it plays a fully selective role in memory and not other cognitive or perceptual functions. Therefore, the following considerations about the neurobiology of episodic memory focus on the role of the medial temporal lobe, and in particular the hippocampus.
Episodic memory is the capacity to remember unique personal experiences. Tulving (2002) distinguishes episodic memory by what should be considered subjective features of the experience of remembering. Most prominent is autonoetic awareness, a sense of having had a particular experience. Also highlighted in Tulving’s conception are the memory of one’s self-involvement in the remembered episode and the capacity to mentally “replay” the experience. Importantly, these features all involve internalized subjective qualities of remembering, accessible only by verbal report and not through any objective measures of behavior or by assessing the contents of remembered events. This makes key aspects of episodic memory difficult to characterize by objective assessments and places them entirely outside the province of testing in animals, severely limiting the kinds of neurobiological studies that could reveal the fundamental roles and nature of information coding in areas of the brain. For these reasons, these subjective features of episodic memory, including autonoetic awareness, self-involvement, and mental replay, are rarely examined in research on episodic memory in humans or its analogues in animals. Instead, studies on episodic memory typically evaluate objective features of the contents of memory, most commonly the context or source of remembered items, such as when or where an event occurred.
The range of objective features of episodic memories is perhaps best illustrated by the common experience in which we sometimes meet someone who looks familiar, but cannot remember who he is or why we know him. A conversation ensues and eventually a critical reminder surfaces that generates a rich and complex memory. The memory includes contextual information about where and when we last encountered the person. A vivid recollection unfolds as a series of events that constitute the full encounter. Also, we are often able to recall and distinguish other encounters with that person and with related experiences.
Features that are prominent in this example of episodic recollection are the basis of the following discussion of the role of the hippocampus and other medial temporal areas.
HIPPOCAMPUS AND FEATURES OF EPISODIC MEMORY
The previous anecdote reflects the fundamental features of episodic recollection in daily life. First, recollection of previous experiences is distinguished from a sense of familiarity, even when that sense of familiarity can be quite strong and provide a clue about the recency of prior experience. Second, a defining feature of recollection is that episodic memories for events involve the context in which they occurred, specifically when and where the event occurred. Third, a vivid episodic memory is structured by a temporal organization involving the flow of events in a unique experience. Fourth, specific episodic memories are distinguished and related to other specific memories that contain substantial common information. In this chapter, I consider these four fundamental distinguishing features of episodic recollection, ask whether they characterize the memory capacities of animals as they do humans, and explore the role of the hippocampus in each.
ROLE OF THE HIPPOCAMPUS IN RECOLLECTION AND FAMILIARITY
As the incident described suggests, one of the ways familiarity and recollection are distinguished is by their retrieval dynamics. Familiarity occurs quickly and is graded in strength. Items from our past can generate a slight sense of familiarity or an intensely held belief that we have experienced them before. By contrast, recollection is qualitative. Its goodness is characterized by the number of associations we retrieve and we tend to retrieve each one in an all-or-none fashion. How can these properties be dissociated in the performance of human and animal subjects?
The retrieval dynamics of recollection and familiarity have been distinguished in humans by the analysis of receiver operating characteristic (ROC) functions during recognition memory performance (Yonelinas, 2001). In a typical experiment, subjects study a list of words, then are tested for their capacity to identify the same words plus a set of words that were not studied as “old” or “new.” The resulting ROC analysis plots “hits,” that is, correct identifications of old items, against “false alarms,” incorrect identifications of new items as if they were old, across a range of confidence levels. This analysis typically reveals an asymmetric function characterized by an above-zero threshold of recognition at the most conservative criterion (zero false alarm rate) and thereafter a curvilinear performance function (Yonelinas, 2001; Figure 28.1A).
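As a rough illustration of how the ROC points described above are obtained, the following Python sketch accumulates hit and false-alarm rates from the strictest to the most lenient confidence criterion. The six-point rating scale, the function name `roc_points`, and the ratings themselves are hypothetical and chosen only for illustration; they are not data from the studies cited.

```python
from typing import List, Tuple


def roc_points(old_ratings: List[int],
               new_ratings: List[int],
               n_levels: int = 6) -> List[Tuple[float, float]]:
    """Return (false_alarm_rate, hit_rate) pairs, strictest criterion first.

    Ratings are assumed to run from 1 ("sure new") to n_levels ("sure old").
    An item is called "old" at a given criterion if its rating is at or
    above that criterion.
    """
    points = []
    for criterion in range(n_levels, 1, -1):
        # Hits: studied (old) items called "old" at this criterion.
        hit_rate = sum(r >= criterion for r in old_ratings) / len(old_ratings)
        # False alarms: unstudied (new) items called "old" at this criterion.
        fa_rate = sum(r >= criterion for r in new_ratings) / len(new_ratings)
        points.append((fa_rate, hit_rate))
    return points


# Hypothetical confidence ratings for 10 studied and 10 unstudied words.
old = [6, 6, 6, 5, 5, 4, 3, 3, 2, 1]
new = [5, 4, 4, 3, 3, 2, 2, 1, 1, 1]

for fa_rate, hit_rate in roc_points(old, new):
    print(f"false alarms = {fa_rate:.1f}, hits = {hit_rate:.1f}")
```

With these made-up ratings, the strictest criterion yields some hits at a false-alarm rate of zero, which is the kind of above-zero intercept the text describes.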
The positive Y-intercept is viewed as an index of recollection in the absence of measurable familiarity, whereas the degree of curvature reflects familiarity as typical of a signal-detection process (Macmillan & Creelman, 1991). Consistent with this view, under different experimental demands that favor one of these processes, the shape of the ROC curve takes on distinguishable functions. During performance that favors recollection, the ROC curve highlights the threshold component of recognition with performance at successively higher confidence levels characterized by a linear function (Figure 28.1B). In contrast, during performance that favors familiarity, the ROC curve is symmetrical and curvilinear (Figure 28.1C).
Figure 28.1 ROCs for recognition performance in humans and rats. Each panel plots probability of hits against probability of false alarms; insets show probability estimates for recollection (R) and familiarity (F). Panels A–C: word list recognition in humans, with separate panels for familiarity and recollection. Panel D: odor list recognition in rats, preoperative performance (group: R = 0.40, d' = 1.14). Panel E: odor list recognition in rats, postoperative performance (controls: R = 0.33, d' = 1.02; hippocampus: R = 0, d' = 1.03). Panel F: lengthened delay period in control rats (controls: R = 0.23, d' = 0.11).
Note: A–C: Performance of humans in verbal recognition. From “Components of Episodic Memory: The Contribution of Recollection and Familiarity,” by A. P. Yonelinas, 2001, Philosophical Transactions of the Royal Society of London: Biological Sciences, 356, 1363–1374. Adapted with permission. D–F: Performance of rats in odor recognition (Fortin et al., 2004). (D) Normal rats tested with a 30-min delay. Insets: F = familiarity estimates; R = recollection estimates. (E) Postoperative performance with a 30-min delay, including an estimated curve for controls based on familiarity alone (con F). (F) Control rats tested with a 75-min memory delay. Diagonal dotted lines represent chance performance across criterion levels. C = control group; H = hippocampal group. Error bars indicate SEM. * p < .05.
Yonelinas et al. (2002) used ROC analysis to show that mild hypoxia that causes damage largely confined to the hippocampus resulted in a severe deficit in recollection but normal familiarity. The distinction between impaired recollection and spared familiarity was verified by measures of subjective experiences in recognition reflected in “remember” versus “know” judgments by the same patients. In addition, structural equation modeling methods used on a large sample of hypoxic patients revealed that hypoxic severity predicted the degree to which recollection, but not familiarity, was impaired. A similar pattern of deficient recollection and preserved familiarity was reported in a patient with relatively selective hippocampal atrophy related to meningitis (Aggleton et al., 2005). These studies indicate the hippocampus plays a selective role in recollection. However, other interpretations of the data on ROC analyses in normal human subjects have led to the view that recollection and familiarity reflect differences in the strength of a single memory function (Wixted & Stretch, 2004). Reports are also mixed on whether ROC curves are more consistent with single or dual processes in recognition, suggesting that the dissociation of these processing functions may depend on parameters of testing and assumptions in the data analysis (Parks & Yonelinas, 2007; Wixted, 2007). In addition, another ROC study reported deficits in both recollection and familiarity in hypoxic patients with identified hippocampal damage (Wais, Wixted, Hopkins, & Squire, 2006) and several other studies also reflect a mixture of results on whether the hippocampus is selectively involved in recollection or involved in both recollection and familiarity. Differences in the localization of damage in different patients as well as differences in the task demands across studies might account for the variability in results across these studies.
To address whether recollection and familiarity can be distinguished in ROC functions by selective hippocampal damage, we developed an ROC protocol for assessing recollection and familiarity in rats and for examining the effects of highly selective experimental lesions of the hippocampus. Our recognition task exploited rats’ superb odor memory capacities (Fortin, Wright, & Eichenbaum, 2004). On each daily test session, rats initially sampled 10 common household scents mixed in with playground sand in a plastic cup containing a cereal reward. When each sample was presented, the animal would dig for the reward and incidentally smell the odor of the sand. Following a 30-minute memory delay, the same odors plus 10 additional odors were presented one at a time in random order.
On each recognition test, the animal followed a nonmatch-to-sample rule such that it could dig in the target odor to obtain a reward if the target was “new” (a nonmatch) or could refrain from digging if the odor was “old” (a match) and instead obtain a reward in an empty cup on the opposite end of the test chamber. Initially, animals were trained with short lists of odors, and the list length was gradually increased to 10 items. In addition, in the final phase of training and testing, a different response criterion was encouraged for each daily session using a combination of variations in the height of the test cup, making it more or less difficult to respond to that cup, and manipulations of the reward magnitudes associated with correct responses to the test and the unscented cup. Notably, the use of a method for explicitly varying the animal’s bias is different from the use of confidence judgments in experiments on recognition in humans (Yonelinas, 2001); nevertheless, both methods successfully vary the subject’s criterion along the full range required to compute ROC curves.
The ROC curve of intact control rats was asymmetric (Figure 28.1D), containing both a threshold component (above-zero Y-intercept) and a strong curvilinear component. This pattern is remarkably similar to the ROC of humans in verbal recognition performance (Figure 28.1A), consistent with a combination of recollection-like and familiarity-based components of recognition in animals. To explore the role of the hippocampus in recollection, subjects were subsequently divided into two groups matched on both performance components; one group received selective lesions of the hippocampus whereas the other group received sham control operations. After recovery, we again tested recognition performance at each response criterion.
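One common way to formalize the two components is a dual-process signal-detection model, in which an old item is either recollected with probability R or, failing that, judged by an equal-variance familiarity signal with sensitivity d'. The sketch below assumes that formulation; the chapter does not spell out its equations, so the function names and the way the criterion is parameterized are assumptions. The parameter values are the control and hippocampal estimates printed with Figure 28.1.

```python
from statistics import NormalDist
from typing import List, Tuple

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def dpsd_point(recollection: float, d_prime: float,
               criterion: float) -> Tuple[float, float]:
    """Return (false_alarm_rate, hit_rate) at one response criterion.

    Assumed parameterization: new-item familiarity ~ N(-d'/2, 1),
    old-item familiarity ~ N(+d'/2, 1); an item is called "old" when its
    familiarity exceeds `criterion`, or when it is recollected.
    """
    fa_rate = _PHI(-d_prime / 2 - criterion)
    familiarity_hit = _PHI(d_prime / 2 - criterion)
    # Recollection is all-or-none; familiarity covers the remaining trials.
    hit_rate = recollection + (1 - recollection) * familiarity_hit
    return fa_rate, hit_rate


def dpsd_curve(recollection: float, d_prime: float,
               criteria: List[float]) -> List[Tuple[float, float]]:
    """ROC points for a list of criteria (strict to lenient)."""
    return [dpsd_point(recollection, d_prime, c) for c in criteria]


criteria = [1.5, 1.0, 0.5, 0.0, -0.5]  # hypothetical bias levels
controls = dpsd_curve(0.33, 1.02, criteria)     # R and d' from Figure 28.1
hippocampus = dpsd_curve(0.0, 1.03, criteria)   # R and d' from Figure 28.1

for (fa_c, hit_c), (fa_h, hit_h) in zip(controls, hippocampus):
    print(f"controls: ({fa_c:.2f}, {hit_c:.2f})  "
          f"hippocampus: ({fa_h:.2f}, {hit_h:.2f})")
```

Under these assumptions, setting R to 0 leaves a curve that is symmetrical about the minor diagonal, whereas a positive R lifts the Y-intercept to roughly R.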
The ROC of control rats continued to reflect both recollection-like and familiarity components, whereas the ROC of animals with selective hippocampal lesions was fully symmetrical and curvilinear (Figure 28.1E), characteristic of familiarity-based recognition in humans (Figure 28.1C). To describe these patterns quantitatively, we calculated indices of recollection and familiarity (Figure 28.1E inset). Whereas familiarity remains normal in rats with hippocampal lesions, recollection is severely impaired. The overall level of performance (averaged across biases) on the task is slightly worse in the hippocampal group (66%, compared to 73% in controls). Given that any performance deficit would be expected to result in an ROC closer to the diagonal (chance performance; dashed line in Figure 28.1E), it is possible that the alteration in their ROC pattern resulting from the hippocampal lesion reflects a generalized decline in memory. To compare the ROC of hippocampal rats with the pattern of forgetting in normal animals, we challenged the memory of control rats by increasing the memory delay to 75 minutes. This manipulation succeeded in reducing the overall level of performance of control animals to 64%, equivalent to that of the hippocampal rats. Yet, further testing of the controls showed that their ROC continued to have an asymmetrical threshold component, as indicated by an above-zero Y-intercept (Figure 28.1F). Notably, the controls’ ROC was distinctly more linear than that of both the hippocampal rats and the controls when tested at the shorter memory delay. This pattern of performance suggests that, in normal rats, familiarity fades more quickly than recollection, similar to observations on humans (Yonelinas, 2002). Moreover, comparison of the ROC curve in normal rats at the 75-minute delay versus that of rats with hippocampal damage at the 30-minute delay emphasizes the distinction between these two groups in their differential emphasis on recollection and familiarity, respectively, even when the overall levels of recognition success are equivalent.
These findings strongly suggest that rats exhibit two distinct processes in recognition, one that is marked by a threshold retrieval dynamic characteristic of episodic recollection in humans, and another that follows a symmetrical and curvilinear processing function characteristic of familiarity in humans. These observations suggest comparable dual retrieval mechanisms underlying recognition in animals and humans, and strongly support the notion that the hippocampus plays a critical role only in the recollective processes that contribute to recognition.
Memory for Where and When Events Occurred
Several investigators have argued that animals are indeed capable of remembering the spatial and temporal context in which they experienced specific stimuli (Clayton, Bussey, & Dickinson, 2003; Day, Langston, & Morris, 2003).
To further explore these aspects of episodic memory, we developed a task that assesses memory for events from a series of events that each involve the combination of an odor (“what”), the place in which it was experienced (“where”), and the order in which the presentations occurred (“when”; Ergorul & Eichenbaum, 2004). On each of a series of events, rats sampled an odor in a unique place along the periphery of a large open field (Figure 28.2A). Then, memory for when those events occurred was tested by presenting a choice between an arbitrarily selected pair of the odor cups in their original locations. We identified both the stimulus initially approached and the final choice in which the rat dug for food. Over a series of shaping phases, rats were trained to select the earlier presented odor of a pair randomly selected from the series. Rats performed well above chance (76.2% correct) in their choices on the test phase, indicating that they can remember the order of unique sequences of odors and places
8/18/09 6:26:17 PM
Memory
Test phase C B⫹
Standard
100 90 80 70 60 50 40 30
C⫹ B⫹ C
A⫹
B⫹
Odor Probe
D⫹
⫹
Spatial Probe
Figure 28.2 A: An example (B versus C) trial for a whatwhere-when test and odor and spatial probes. B: Comparison of performance (mean ⫾ SEM) versus percentage of correct first approaches on the what-where-when probe tests. C: Postsurgery performance of sham-control (n ⫽ 7) and hippocampal lesion (n ⫽ 7) groups. Note: (A) In the sample phase of every trial, rats were presented with four odors in series (A⫹ → B⫹ → C⫹ → D⫹), each at a different location
(Figure 28.2B). In addition, we also found that rats first approached the correct stimulus at well above chance level, indicating they remembered the sequence of places where the cups were presented prior to perceiving information about the odor at that location; importantly, separate tests showed that rats cannot accurately identify the odor in a cup until they arrive within at the edge of the cup. However, performance was not as accurate in the first approach as it was in the final choice, suggesting that rats begin by guessing the location of the earlier experienced cup, then confirm this choice using the smell of the cup. This hypothesis was confirmed in a control condition in which we presented the test cups without odors. In this condition, performance of intact animals fell to chance, indicating that when the selected location is not confirmed by the associated odor, performance is disrupted (Figure 28.2B). This pattern of results strongly suggests rats normally use a combination of “where” and “what” information to judge “when” the events occurred. To examine the role of the hippocampus, animals were subsequently separated into matched groups, one of which received selective hippocampal lesions. Subsequently, intact rats continued to choose well on the standard “what-wherewhen” trials (Figure 28.2C). By contrast, the performance of animals with hippocampal lesions was no better than chance. In addition, whereas intact rats continued to perform well on the initial approach, rats with hippocampal lesions approached the correct choice less often than expected by chance. Contrary to the strategy of normal rats and the reinforcement contingency of the test phase, rats with hippocampal damage were inclined to visit the more recently
[Figure 28.2 graphic: (A) sample phase and test layouts; (B) and (C) bar graphs of percent correct (%) for the Standard, First approach, Odor, and Spatial conditions, with Controls and Hippocampus group labels.]
on a platform. In the following test phase, odors B and C were presented in their sample locations in the what-where-when choice test, or next to each other in the odor probe, or two nonodorous stimuli were presented in the sample locations of B and C in the spatial probe. + = Reinforced stimulus; arrow on the platform: position of the rat at the starting-point (arrowhead corresponds to the rat’s head); star symbol: the experimenter’s fixed position throughout testing. (B) Presurgery performance of normal rats (n = 14). (C) Dashed line = Chance level. * p < .05.
presented and rewarded place rather than the earlier visited locus. This observation indicates an intact spatial memory in rats with hippocampal damage, and this memory was employed despite its maladaptive consequences. Rats with hippocampal damage could detect the earlier presented odor when it was presented without spatial cues (odor probe test; Figure 28.2A, data in Figure 28.2C), and the observation of a preference for the most recently presented odor indicates that hippocampal-lesioned animals could identify and remember the odors and places in some way. These findings, combined with the failure of rats with hippocampal damage in the standard condition, indicate that the hippocampus is critical for effectively combining the “what,” “when,” and “where” qualities of each experience to compose the retrieved memory. Normal rats initially employ their memory of the places of presented cups and approach the location of the earlier experience. Then they confirm the presence of the correct odor in that location. Animals with hippocampal damage fail on both aspects of this task and, instead, their behavior is guided by another form of memory that leads to the incorrect first approach. That they can initially approach the most recently rewarded location indicates their spatial memory is intact. However, it appears they are driven to approach the last rewarded cup rather than combine the what-where-when cues to select the earlier event.

Events Are Represented as Items in Context

Additional insights about the fundamental properties of memory representation can be gained through the analysis
Role of the Hippocampus in Recollection and Familiarity
of neural activity patterns associated with the critical stimuli and behavioral events that occur in animals performing memory tasks. These studies can confirm the evidence from tests of brain damage by providing evidence of normal coding of features of memory that are lost following selective damage of the same brain areas. These studies can provide insights about where and how particular types of information are encoded within the circuitry of the hippocampus and associated brain structures. A wealth of studies have shown that hippocampal neurons fire associated with the ongoing behavior and the context of events as well as the animal’s location (Eichenbaum, 2004). The combination of spatial and nonspatial features of events captured by hippocampal neuronal activity is consistent with the view that the hippocampus encodes many features of events and the places where they occur. Two studies provide examples that highlight the rapid associative coding of events and places by hippocampal neurons. In one study rats were trained on an auditory fear conditioning task in which a tone was paired with shock to produce conditioned freezing to subsequent tone presentations (Moita, Rosis, Zhou, LeDoux, & Blair, 2003). Prior to fear conditioning, few hippocampal cells were activated by the auditory stimulus. Following pairings of tone presentations and shocks, many cells fired briskly to the tone and did so only when the animal was in a particular place where the cell had fired above baseline prior to conditioning. Another study examined the firing properties of hippocampal neurons in monkeys performing a task where they rapidly learned new scene-location associations (Wirth et al., 2003). Just as the monkeys acquired a new response to a location in the scene, neurons in the hippocampus changed their firing patterns to become selective to particular scenes. 
Additional studies have directly examined the extent to which hippocampal neurons encode specific stimuli and places where they occur by training subjects to perform the same memory judgments at many locations in the environment. In one study, rats performed a task in which they had to recognize any of nine olfactory cues placed in any of nine locations (Wood, Dudchenko, & Eichenbaum, 1999). On each trial, a reward was available when the rat responded to a cue that differed from (was a nonmatch to) the immediately preceding stimulus. Because the location of the discriminative stimuli was varied systematically, cellular activity related to the stimuli and behavior could be dissociated from that related to the animal’s location. Some hippocampal cells encoded particular odor stimuli, others were activated when the rat sampled any odor at a particular place, and yet others fired associated with whether the odor matched or differed from the previous cue (Figure 28.3). However, the largest subset of hippocampal neurons
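[Figure 28.3 graphic: bar graph of number of cells (y-axis) by coding category, with labels Odor, Place, Match, and Odors in context.]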
Figure 28.3 Incidence of hippocampal neurons that encode odors, places where odors were sampled, the match/nonmatch status of the odor, or combinations of odor, place, and match/nonmatch status. Note: From “The Global Record of Memory in Hippocampal Neuronal Activity,” by E. Wood, P. A. Dudchenko, and H. Eichenbaum, 1999, Nature, 397, pp. 613–616. Reprinted with permission.
fired only when associated with a particular combination of the odor, the place where it was sampled, and the match-nonmatch status of the odor. In a similar task created for humans, Ekstrom et al. (2003) recorded the activity of hippocampal neurons as people played a taxi driver game, searching for passengers picked up and dropped off at various locations in a virtual reality town. Some cells encoded particular cues or fired as the subject traversed specific locations. Also, many of these cells fired selectively when the subject viewed a particular scene from a particular place or passed a location while pursuing a particular goal. Hippocampal cells that represent specific salient objects in the context of a particular environment have also been observed in studies of rats engaged in foraging (Gothard, Skaggs, Moore, & McNaughton, 1996; Rivard et al., 2004) and place learning (Hollup, Molden, Donnett, Moser, & Moser, 2001) in open fields. Furthermore, parallel evidence from functional imaging has shown that the human hippocampus is selectively activated during association of an item and the context in which it was experienced (e.g., Davachi, Mitchell, & Wagner, 2003; Ranganath et al., 2003; for reviews, see Davachi, 2006; Eichenbaum, Yonelinas, & Ranganath, 2007). Thus, in rats, monkeys, and humans, a prevalent property of hippocampal firing patterns involves the representation of unique associations of stimuli and their significance, specific behaviors, and the places where these events occurred.

Memory for the Order of Events within a Unique Experience

In addition to memory for the spatial and temporal context of distinct events, a vivid recollection often involves recalling the flow of events within a single experience.
[Figure 28.4 graphic, panel A: sequence presentation of odors A through E at 2.5-min intervals (approach, dig for reward, wait 2.5 minutes), followed by one of two probes. Order probe: select the odor cup appearing earlier in the sequence (e.g., B+ vs. D−, C+ vs. E−). Item probe: select the odor cup not presented in the sequence (e.g., A− vs. H+, B− vs. U+, C− vs. T+).]
Figure 28.4 Memory for order and item in rats. Note: A: On each trial the animal was presented with a series of 5 odors (e.g., odors A through E), the animal was then either probed for its memory of the order of elements in the series (top) or its memory of the individual items presented (bottom). + = Rewarded odor; − = Nonrewarded odor. B: Hippocampal animals were impaired on all sequential order probes. Performances on different probes are grouped according to the lag (number
To investigate the memory for the order of events in a unique experience, we developed a behavioral protocol that assesses memory for episodes composed of a unique sequence of olfactory stimuli presented to the animal as it remains in its cage (Fortin, Agster, & Eichenbaum, 2002; see also Kesner, Gilbert, & Barua, 2002). In addition, our design allowed us to directly compare memory for the sequential order of odor events with recognition of the odors in the list independent of memory for their order. On each trial, rats were presented with a series of five odors, selected randomly from a large pool of common household scents. Memory for each series was subsequently probed by a choice test where the animal was reinforced for selecting the earlier of two of the odors that had appeared in the series. For example, the rat might be initially presented with odors A then B then C then D then E. Following the delay, two nonadjacent odors, for example B and D, were presented and the animal would be rewarded for selecting the odor that appeared earlier (in this case, B). On each trial, any pair of nonadjacent odors might be presented as the probe, so the animal had to remember the entire sequence in order to perform well throughout the testing session. After training over many days, rats performed sequential order judgments well above chance levels (Figure 28.4), indicating they can remember the order of a sequence of events in unique experiences. To examine the role of the
[Figure 28.4 graphic, panels B and C: percent correct (%) for Controls and Hippocampus groups. (B) Order performance by probe: lag 1 (A vs C, B vs D, C vs E), lag 2 (A vs D, B vs E), lag 3 (A vs E). (C) Item performance by probe: A vs X, B vs X, C vs X, D vs X, E vs X.]
of intervening elements). C: Hippocampal animals performed as well as controls on the recognition probes. X = Randomly selected odor that was not presented in the series and used as the alternative choice. From “Critical Role of the Hippocampus in Memory for Sequences of Events,” by N. J. Fortin, K. L. Agster, and H. Eichenbaum, 2002, Nature Neuroscience, 5, pp. 459 & 460. Reprinted with permission. * p < .05.
hippocampus in memory for the order of events in unique experiences, these subjects were divided into two groups matched for performance, then animals in one group were given selective hippocampal lesions whereas those in the other group received sham operations. After recovery, all animals were tested again on memory for the order of odors in unique odor sequences. Intact rats continued to perform well whereas rats with hippocampal lesions were severely impaired, performing no better than chance except when the judgment was easiest (when the odors were first and last in the series). The same rats were then also tested on their ability to recognize odors that were presented in the series. On each trial, a series of five odors was presented in a format identical to that used in the previous testing. Then recognition was probed using a choice test in which the animal was presented with one of the odors from the series and another odor from the pool that was not in the series, and food was buried in the odor not presented in the series. For example, the rat might instead be presented with the series A through E then, following a delay, an odor selected randomly from those initially sampled and an odor not presented in the sequence, for example, A and X, were presented. The rat would be rewarded for choosing X. Both intact rats and rats with selective hippocampal damage acquired the task rapidly and there was no overall performance difference
between the groups in acquisition rate or final level of recognition performance (Figure 28.4). Furthermore, in both groups, recognition scores were consistently superior on probes involving odors that appeared later in the series, suggesting some forgetting of items that had to be remembered for a longer period and through more intervening items. A potential confound in any study that employs time as a critical dimension in episodic memory is that memories obtained at different times are likely to differ in the strength of their memory traces, due to the inherent decremental nature of memory traces. To what extent could normal animals be using differences in the relative strengths of memory traces for the odors to judge their sequential order? The observation of a temporal gradient in recognition performance by normal animals suggests that memories were in fact stronger for the more recently presented items in each sequence. These differences in trace strength potentially provide sufficient signals for the animals to judge the order of their presentation. However, the observation of the same temporal gradient of recognition performance in rats with hippocampal damage indicated that they had normal access to the differences in trace strengths for the odors. Yet these intact trace-strength differences were not sufficient to support above chance performance in the order probes. These considerations strongly suggest that normal rats also could not utilize the relative strengths of memories for the recently experienced odors, and instead based their sequential order judgments directly on remembering the odor sequence. The findings indicate that animals have the capacity to recollect the flow of events in unique experiences and that the hippocampus plays a critical role in this capacity. 
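The structure of the order probes described above can be made concrete with a short sketch. It assumes a five-odor series labeled A through E and defines lag as the number of odors that intervened between the two probe odors, following the task description; the function name `order_probes` and the labels are illustrative choices, not the authors' code.

```python
from itertools import combinations

def order_probes(series):
    """Return (earlier, later, lag) for every nonadjacent pair in the series.

    lag = number of odors that intervened between the two probe odors.
    The earlier odor is the rewarded choice in the order probe.
    """
    probes = []
    for i, j in combinations(range(len(series)), 2):
        lag = j - i - 1
        if lag >= 1:  # adjacent odors (lag 0) are excluded, as in the task
            probes.append((series[i], series[j], lag))
    return probes

# Illustrative series; in the task each trial used a new random set of odors.
series = ["A", "B", "C", "D", "E"]
for earlier, later, lag in order_probes(series):
    print(f"{earlier} vs {later}: lag {lag}, reward {earlier}")
```

A five-odor series yields six possible order probes, and only one of them (A versus E) pairs the first and last odors, the case in which rats with hippocampal lesions performed above chance.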
Episodes Are Represented as Sequences of Events

Another common observation across species and many different behavioral protocols is that different hippocampal neurons fire during each successive event that composes task performance. Some cells are active during simple behaviors such as foraging for food (e.g., Muller, Kubie, & Ranck, 1987) and learned behaviors directed at relevant stimuli that have to be remembered (e.g., Hampson, Heyser, & Deadwyler, 1993), and the firing patterns have been observed across a broad range of learning protocols, from classical conditioning, discrimination learning, and nonmatching or matching to sample tasks to a variety of spatial learning and memory tasks (for review, see Eichenbaum, 2004). In each of these paradigms, a substantial proportion of hippocampal neurons show time-locked activations associated with each sequential event. Many of these cells show striking specificities corresponding to particular combinations of stimuli, behaviors, and the spatial location of the event.
These sequential firing patterns can be envisioned to represent a series of events and their places that compose a meaningful episode, and the information contained in these representations both distinguishes and links related episodes. Consider, for example, a study in which rats were trained on the classic spatial alternation task in a modified T-maze (Wood, Dudchenko, Robitsek, & Eichenbaum, 2000). Performance on this task requires that the animal distinguish left-turn and right-turn episodes and that it remember the immediately preceding episode to guide the choice on the current trial, and in that way, the task is similar in demands to those of episodic memory (Figure 28.5). We found that hippocampal neurons encode each sequential behavioral event and its locus within one type of episode, with most cells firing only when the rat is performing within either the left-turn or the right-turn type of episode. This was particularly evident for cells that fired when the rat was on the “stem” of the maze, that is, when it traversed the same locations on both types of trials (Figure 28.5). Indeed, virtually all cells that fired when the rat was on the maze stem fired differentially on left-turn versus right-turn trials. The majority of cells showed strong selectivity, some firing almost exclusively as the rat performed one of the trial types, suggesting they were part of the representations of only one kind of episode. Other cells fired substantially on both trial types, potentially providing a link between left-turn and right-turn representations by the common places traversed on both trial types. These findings indicated that separate ensembles of neurons encoded the sequences of events that composed left-turn and right-turn trials. Notably there were also some cells that fired similarly on both trial types; these might serve to link the two types of episodes. 
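The memory demand of the alternation task described above, in which each choice depends on the immediately preceding episode, can be illustrated with a minimal scoring sketch. The function name `score_alternation` and the example session are hypothetical and serve only to show the rule; they are not taken from the original study.

```python
def score_alternation(turns):
    """Score each trial after the first: correct if the turn differs from the previous one."""
    return [current != previous for previous, current in zip(turns, turns[1:])]

# Hypothetical session: "L" = left turn, "R" = right turn.
session = ["L", "R", "L", "L", "R"]
scores = score_alternation(session)
percent_correct = 100 * sum(scores) / len(scores)
print(scores, percent_correct)
```

Because the stem of the maze is traversed on every trial, the correct turn cannot be read from the animal's current location alone; the scoring rule needs the previous trial as input, which is the sense in which the task resembles episodic memory.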
Functional imaging studies in humans have also revealed hippocampal involvement in both spatial and nonspatial sequence representation. Several studies have shown that the hippocampus is active when people recall routes between specific start points and goals, but not when subjects merely follow a set of cues through space (Hartley, Maguire, Spiers, & Burgess, 2003). Evidence from functional imaging studies also indicates that the specific role of the hippocampus is to represent sequences of places traversed in a route, rather than a mapping of the environment. In a study examining navigation from a route perspective (person centered) and survey perspective (looking down from above; Shelton & Gabrieli, 2002), the hippocampus was more activated in the route condition, where navigation requires the association and continuous updating of different views and movements throughout the environment. In postscanning tests, subjects were asked to draw a map that described how they navigated the virtual environment in the route and survey conditions. All of the subjects
[Figure 28.5 graphic: (A) T-maze with inter-trial interval (ITI) locations; (B) adjusted mean firing rate (Hz) by stem sector for example cells.]
Figure 28.5 Activity of hippocampal neurons in the T-maze spatial alternation task. Note: A: T-maze alternation task. Rats performed a continuous alternation task in which they traversed the central stem of the apparatus on each trial and then alternated between left and right turns at the T junction. ITI, Inter-trial interval. B: Examples of hippocampal neurons distinguishing left and right-turn episodes in the central stem. The stem was divided into four sectors for data analyses. In each example, the paths taken by the animals on the central stem are plotted in the left panel (light gray, left-turn trial; dark gray, right-turn trial). In the middle panels,
used a sequential strategy in their drawings of the route, but not the survey, task. As discussed by Shelton and Gabrieli (2002), route building required that subjects link together the sequences and views experienced while navigating the environment, engaging the hippocampus as a result of its purported role in mediating a memory space in both humans and animals (Eichenbaum, Dudchenko, Wood, Shapiro, & Tanila, 1999). In addition, the hippocampus is selectively activated when people learn sequences of pictures (Kumaran & Maguire, 2006). Even greater hippocampal activation is observed when subjects must disambiguate picture sequences that overlap, parallel to our findings on hippocampal cells that disambiguate spatial sequences (Wood et al., 2000).

Networking Memories

A third defining quality of recollection is our capacity to bring to mind multiple related memories, that is, memories that have common elements, and to make inferences from the information contained in those memories. To examine the extent to which animals can link memories
the location of the rat when individual spikes occurred is indicated separately for left-turn trials (black dots) and right-turn trials (black dashes). In the right panel, the mean firing rate of the cell for each sector, adjusted for variations in firing associated with covariates, is shown separately for left-turn trials (light gray) and right-turn trials (dark gray). From “Hippocampal Neurons Encode Information about Different Types of Memory Episodes Occurring in the Same Location,” by E. Wood, P. Dudchenko, R. J. Robitsek, and H. Eichenbaum, 2000, Neuron, 27, p. 626. Reprinted with permission. ** p < .01; *** p < .001.
that share common elements, we studied whether rats can learn a set of odor problems that share elements, and then tested whether they had integrated the memories into networks that support inferential judgments. One experiment compared normal rats and rats with selective damage to the hippocampus on their ability to learn a set of paired associate problems that contained common elements, and to interleave the representations of these problems in support of novel inferential judgments (Bunsey & Eichenbaum, 1996). Animals were initially trained on two sets of overlapping odor paired associates (e.g., A goes with B, B goes with C). Then the rats were given probe tests to determine if they could infer the relationships between items that were only indirectly associated through the common elements (A goes with C?). Normal rats learned the paired associates and showed strong transitivity in the probe tests (Figure 28.6). Rats with selective hippocampal lesions also learned the pairs over several trials but were severely impaired in the probes, showing no evidence of transitivity. In another experiment, rats learned a hierarchical series of premises that involved odor choice judgments between overlapping elements (e.g., A > B, B > C, C > D, D > E),
[Figure 28.6 graphic: premise pairs (sample A or X with choices B and Y; sample B or Y with choices C and Z) and associative inference probes (sample A or X with choices C and Z); bar graphs of errors to criterion and preference index for the Control and Hippo groups.]
Figure 28.6 Performance on the associative transitivity task. Note: Rats with hippocampal lesions acquire the premise problems as readily as intact rats. Intact rats demonstrate transitive inference by a preference in the appropriate indirectly related stimulus. In contrast rats
then were probed on the relationship between indirectly related items (B > D?; Figure 28.6). Normal rats learned the series and showed robust transitive inference on the probe tests. Rats with hippocampal damage also learned each of the initial premises but failed to show transitivity (Dusek & Eichenbaum, 1997). The combined findings from these studies show that rats with hippocampal damage can learn even complex associations, such as those embodied in the odor paired associates and conditional discriminations. However, without a hippocampus, they do not interleave the distinct experiences by their common elements to form a relational network that supports inferential memory expression. Importantly, according to the present view, the hippocampus does not compute or directly mediate transitive judgments. Rather, the hippocampus mediates only the encoding and retrieval of information about previous experiences on which cortical areas might accomplish the critical judgment. One neocortical association area that receives hippocampal outputs and is likely critical to inferential judgments is prefrontal cortex (Waltz et al., 1999).

Hippocampus Encodes Events That Can Link Related Memories

In virtually all the studies described, some hippocampal neurons encode features that are common among different experiences; these representations could provide links between distinct memories. For example, in Moita and colleagues’ (2003) study of auditory fear conditioning, whereas some cells only fired to a tone when the animal was in a particular place, others fired associated with the
with hippocampal lesions do not show transitivity, indicating they have not represented the indirect relations. Hippo = Hippocampal lesion. Data source: Bunsey and Eichenbaum (1996).
tone wherever it was presented across trials. In the Wood et al. (1999) study on odor recognition memory, whereas some cells showed striking associative coding of odors, their match/nonmatch status, and places, other cells fired associated with one of those features across different trials. Some cells fired during a particular phase of the approach toward any stimulus cup. Others fired differentially as the rat sampled a particular odor, regardless of its location or match-nonmatch status. Other cells fired only when the rat sampled the odor at a particular place, regardless of the odor or its status. Yet other cells fired differentially associated with the match and nonmatch status of the odor, regardless of the odor or where it was sampled. Similarly, in Ekstrom and colleagues’ (2003) study on humans performing a virtual navigation task, whereas some hippocampal neurons fired associated with combinations of views, goals, and places, other cells fired when subjects viewed particular scenes, occupied particular locations, or had particular goals in finding passengers or locations for drop off. Also, Rivard et al. (2004) studied rats exploring objects in open fields, finding that whereas some cells fired selectively associated with an object in one environment, others fired associated with the same object across environments. The notion that these cells might reflect the linking of important features across experiences and the abstraction of common information was highlighted in more recent studies on monkeys and humans. Hampson, Pons, Stanford, and Deadwyler (2004) trained monkeys on matching to sample problems, then probed the nature of the representation of
stimuli by recording from hippocampal cells when the animals were shown novel stimuli that shared features with the trained cues. They found many hippocampal neurons that encoded meaningful categories of stimulus features and appeared to employ these representations to recognize the same features across many situations. Kreiman, Koch, and Fried (2000) characterized hippocampal firing patterns in humans during presentations of a variety of visual stimuli. They reported a substantial number of hippocampal neurons that fired when the subject viewed specific categories of material, for example, faces, famous people, animals, scenes, houses, across many exemplars of each. A subsequent study showed that some hippocampal neurons are activated when a subject views any of a variety of different images of a particular person, suggesting these cells could link the recollection of many specific memories related to that person (Quiroga, Reddy, Kreiman, Koch, & Fried, 2005). This combination of findings across species provides compelling evidence for the notion that some hippocampal cells represent common features among the various episodes that could serve to link memories obtained in separate experiences. Furthermore, recent functional imaging studies have associated activation of the hippocampus in humans with the performance of transitive inference tasks similar to those described above as dependent on the hippocampus in animals. In one study, subjects learned overlapping paired associations between faces and houses or direct face-face associations (Preston, Shrager, Dudukovic, & Gabrieli, 2004). The hippocampus was selectively activated when people identified the indirect associations between faces that were paired with the same house as compared with direct face-face associations. 
In another study, subjects were trained on a task that involves a hierarchical series of judgments (A > B, B > C, C > D, D > E) or a series of nonoverlapping judgments (K > L, M > N, O > P, Q > R; Heckers, Zalesak, Weiss, Ditman, & Titone, 2004). The hippocampus was activated when subjects performed transitive judgments as compared to novel judgments between items taken from the nonoverlapping pairs. Under some circumstances, it may be possible to indirectly relate items without a memory network (O’Reilly & Rudy, 2001; Van Elzakker, O’Reilly, & Rudy, 2003), but the previously described results provide compelling evidence that the hippocampus is indeed involved in binding related memories and in using these memories to make novel inferential judgments.
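The logical difference between the overlapping and nonoverlapping premise sets can be shown with a short sketch. It treats each premise as an ordered pair and computes which new relations follow by chaining premises through shared items. This illustrates the structure of the task only; it is not a model of how the hippocampus or cortex performs the judgment, and the function name `transitive_closure` is an illustrative choice.

```python
def transitive_closure(premises):
    """Return all ordered pairs (x, y), meaning x > y, implied by the premises."""
    known = set(premises)
    changed = True
    while changed:
        changed = False
        for a, b in list(known):
            for c, d in list(known):
                # Chain a > b and b > d through the shared item b.
                if b == c and (a, d) not in known:
                    known.add((a, d))
                    changed = True
    return known

overlapping = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
nonoverlapping = [("K", "L"), ("M", "N"), ("O", "P"), ("Q", "R")]

inferred = transitive_closure(overlapping) - set(overlapping)
print(sorted(inferred))
print(transitive_closure(nonoverlapping) - set(nonoverlapping))
```

For the overlapping series the inferred set includes B > D, the probe used in the rat experiment, whereas the nonoverlapping pairs share no items and therefore support no inferences. This is why novel pairings drawn from the nonoverlapping set serve as a control for novelty without transitivity.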
EPISODIC MEMORY AND THE HIPPOCAMPAL MEMORY SYSTEM

A consideration of the anatomical organization of the major circuitry involving the hippocampus and neocortex
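[Figure 28.7 graphic: neocortical areas supply “what” and “where” inputs to the parahippocampal region, where PRC-LEA carries item information and PHC-MEA carries context information; both converge on the hippocampus (item-in-context).]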
Figure 28.7 Functional organization of the hippocampal system.
provides further insights into basic mechanisms that underlie recollection across diverse species. In primates, the hippocampus receives an enormous variety of information from virtually every cortical association area, and this information is funneled into the hippocampus via the parahippocampal region, which is subdivided into the perirhinal cortex, parahippocampal cortex, and entorhinal cortex (Figure 28.7). The cortical outputs of hippocampal processing involve feedback connections from the hippocampus successively back to the entorhinal cortex, then the perirhinal and parahippocampal cortex, and finally, the neocortical areas from which the inputs to the hippocampus originated (Amaral & Witter, 1995). To what extent is the organization of this system similar in mammalian species? The internal circuitry of the hippocampus itself is largely conserved across mammalian species (Manns & Eichenbaum, 2006). The subdivisions of the hippocampus are connected by a serial, unidirectional path, starting with the dentate gyrus, and continuing through CA3, then CA1, and then the subiculum. Furthermore, anatomical details involving several topographical and parallel organizations are highly similar in species including rats, cats, and monkeys, as well as other species (see Amaral & Witter, 1995; Witter, Wouterlood, Naber, & Van Haeften, 2000, for reviews). There is also considerable conservation of the areas of the parahippocampal region. The perirhinal, parahippocampal (called postrhinal cortex in rats), and entorhinal subdivisions of the parahippocampal region are similar in cytoarchitecture in rats, mice, and monkeys, and the connectivity among these areas is also remarkably
similar (Burwell, Witter, & Amaral, 1995). In contrast to the conservation of hippocampal and parahippocampal circuitry, the neocortical regions that are the ultimate origin of hippocampal inputs differ substantially from species to species. For example, there are numerous dissimilarities in the neocortex that reflect general differences between small-brained and big-brained mammals, such as cortical size, laminar stratification, and number of polymodal association areas (Krubitzer & Kaas, 2005; Manns & Eichenbaum, 2006). Further, the extent of cortical areas devoted to a particular sensory modality also varies substantially between species. Despite major species differences in the neocortex, the organization of cortical inputs to the hippocampus is remarkably similar in rodents and primates. Across species, most of the neocortical input to the perirhinal cortex comes from association areas that process unimodal sensory information about qualities of objects (i.e., “what” information), whereas most of the neocortical input to the parahippocampal cortex comes from areas that process polymodal spatial (“where”) information (Burwell et al., 1995; Suzuki & Amaral, 1994). There are connections between the perirhinal cortex and parahippocampal cortex, but the “what” and “where” streams of processing remain largely segregated as the perirhinal cortex projects primarily to the lateral entorhinal area whereas the parahippocampal cortex projects mainly to the medial entorhinal area. Similarly, there are some connections between the entorhinal areas, but the “what” and “where” information streams mainly converge within the hippocampus. These anatomical considerations suggest a functional organization of the flow of information into and out of the hippocampus. Substantial evidence indicates that neurons in the perirhinal cortex and lateral entorhinal cortex are involved in the representation of individual perceptual stimuli. 
Electrophysiological studies on monkeys and rats performing simple recognition tasks have shown that many cells in the perirhinal cortex exhibit enhanced or suppressed responses to stimuli when they reappear in a recognition test (Suzuki & Eichenbaum, 2000). Similarly, in humans, among all areas within the medial temporal lobe, the perirhinal area selectively shows suppressed responses to familiar stimuli (Henson, Cansino, Herron, Robb, & Rugg, 2003). Complementary studies in animals with damage to the perirhinal cortex indicate that this area may be critical to memory for individual stimuli in the delayed nonmatching to sample task in rats (Otto & Eichenbaum, 1992) and monkeys (Suzuki, Zola-Morgan, Squire, & Amaral, 1993). These and other data have led several investigators to the view that the perirhinal cortex is specialized for identifying the memory strength of individual stimuli (e.g., Brown & Aggleton, 2001; Henson et al., 2003).
By contrast, the parahippocampal cortex and medial entorhinal area may be specialized for processing spatial context. Whereas perirhinal and lateral entorhinal neurons have poor spatial coding properties, parahippocampal and medial entorhinal neurons show strong spatial coding (Hargreaves, Rao, Lee, & Knierim, 2005). In addition, whereas object recognition is impaired following perirhinal damage, object-location recognition is deficient following parahippocampal cortex damage in rats (Gaffan, Healey, & Eacott, 2004) and monkeys (Alvarado & Bachevalier, 2005). Similarly, perirhinal cortex damage results in greater impairment in memory for object pairings, whereas parahippocampal cortex lesions result in greater impairment in memory for the context in which an object was presented (Norman & Eacott, 2005). Parallel findings from functional imaging studies in humans have dissociated object processing in the perirhinal cortex from spatial processing in the parahippocampal cortex (Pihlajamaki et al., 2004). Furthermore, whereas the perirhinal cortex is activated in association with the memory strength of specific stimuli (Henson et al., 2003), the parahippocampal cortex is activated during recall of spatial and nonspatial context (Bar & Aminoff, 2003). Compelling support for differentiation of functions associated with episodic recollection comes from within-study dissociations, which reveal that activation of the perirhinal cortex is selectively associated with familiarity, whereas activity in the hippocampus and the parahippocampal cortex is selectively associated with recollection (Eichenbaum et al., 2007). These and many other results summarized in this review suggest a functional dissociation between the perirhinal cortex, where activation changes are consistently associated with familiarity, and the hippocampus and parahippocampal cortex, where activation changes are consistently associated with recollection.
An outstanding question in these studies is whether the parahippocampal cortex and hippocampus play different roles in recollection. The findings on parahippocampal activation associated with viewing spatial scenes suggest the possibility that this area is activated during recollection because recall involves retrieval of spatial contextual information (Bar & Aminoff, 2003). By contrast, the hippocampus may be activated in association with the combination of item and context information. These findings are consistent with the anatomically guided hypothesis about the functional organization of the hippocampal system presented in Figure 28.6 and suggest mechanisms by which the anatomical components of this system interact in support of the phenomenology of recollection. Following experience with a stimulus, the perirhinal and lateral entorhinal areas may match a memory cue to a stored template of the stimulus, reflected in suppressed
activation that signals familiarity. Outputs from perirhinal and lateral entorhinal areas back to neocortical areas may be sufficient to generate the sense of familiarity without participation of the hippocampus. In addition, during the initial experience, information about the to-be-remembered stimulus, processed by the perirhinal and lateral entorhinal areas, and information about the spatial and possibly nonspatial context of the stimulus, processed by the parahippocampal and medial entorhinal areas, converge in the hippocampus. During subsequent retrieval, presentation of the cue may drive the recovery of object-context representations in the hippocampus, which, via back projections, regenerate a representation of the contextual associations in parahippocampal and medial entorhinal areas; these areas in turn cascade that information back to the neocortical areas that originally processed the item and contextual information. This processing pathway may constitute a principal mechanism for recollection of unique events across species (Eichenbaum et al., 2007).
SUMMARY
The findings reviewed in this chapter indicate that humans and animals possess the same objectively observable features of episodic memory, and that the hippocampus plays a critical role in each. Animals as well as humans can remember where and when events occurred, and their retrieval dynamics indicate that this remembering is similar to human recollection. Animals and humans can remember the order of events in unique experiences, and they can disambiguate overlapping experiences. Furthermore, each of these capacities is as dependent on the hippocampus in animals as it is in humans. Indeed, the cortical-hippocampal system that mediates each of these features of recollection is remarkably conserved in mammalian species, including humans. In an effort to focus on cross-species comparisons, the current review omitted consideration of Tulving’s (2002) requirement for subjectively experienced features of recollection, specifically the inclusion of one’s awareness of participation in a remembered episode. As stated at the outset, autonoetic awareness in episodic memory is beyond investigation in animals, and this would seem to preclude a definitive conclusion about whether animals have the full set of features of episodic memory. However, functional imaging studies have identified a network of cortical areas that is engaged in autobiographical memory (Cabeza & St. Jacques, 2007) and the sense of self (Northoff & Bermpohl, 2004). Combining these anatomical findings with the proposal by Moscovitch (1995) that self-awareness in episodic memory is constituted as the encoding and retrieval of information about one’s personal participation in the episode suggests a reconciliation of the findings on animals and humans. The objectively observable features of episodic recollection are supported by interactions between the cortex and hippocampus similarly in all mammalian species. Where species differ most is in the elaboration of the cerebral cortex, including those areas implicated in representation of the self. Therefore, consistent with Moscovitch’s proposal, the information processing and neural system that supports episodic recollection appears to be conserved across species, but the contents of episodic memories may differ among species, including the nature and extent of self-awareness as a part of the information that is encoded and retrieved in an episodic memory. Therefore, future investigations on animals about how the cortical-hippocampal system supports episodic memory are entirely valid, whereas investigations on self-awareness in memory can be considered a distinct question to be pursued independently in analyses of the relevant cortical networks.
REFERENCES
Aggleton, J. P., & Brown, M. W. (2006). Interleaving brain systems for episodic and recognition memory. Trends in Cognitive Sciences, 10, 455–463.
Aggleton, J. P., Vann, S. D., Denby, C., Dix, S., Mayes, A. R., et al. (2005). Sparing of the familiarity component of recognition memory in a patient with hippocampal pathology. Neuropsychologia, 43, 1810–1823.
Alvarado, M. C., & Bachevalier, J. (2005). Comparison of the effects of damage to the perirhinal and parahippocampal cortex on transverse patterning and location memory in rhesus macaques. Journal of Neuroscience, 25, 1599–1609.
Amaral, D. G., & Witter, M. P. (1995). Hippocampal formation. In G. Paxinos (Ed.), The rat nervous system (2nd ed., pp. 443–493). San Diego, CA: Academic Press.
Bar, M., & Aminoff, E. (2003). Cortical analysis of visual context. Neuron, 38, 347–358.
Brown, M. W., & Aggleton, J. P. (2001). Recognition memory: What are the roles of the perirhinal cortex and hippocampus? Nature Reviews Neuroscience, 2, 51–61.
Bunsey, M., & Eichenbaum, H. (1996, January 18). Conservation of hippocampal memory function in rats and humans. Nature, 379, 255–257.
Burwell, R. D., Witter, M. P., & Amaral, D. G. (1995). Perirhinal and postrhinal cortices of the rat: A review of the neuroanatomical literature and comparison with findings from the monkey brain. Hippocampus, 5, 390–408.
Cabeza, R., & St. Jacques, P. (2007). Functional neuroimaging of autobiographical memory. Trends in Cognitive Sciences, 11, 219–227.
Clayton, N. S., Bussey, T. J., & Dickinson, A. (2003). Can animals recall the past and plan for the future? Nature Reviews Neuroscience, 4, 685–691.
Corkin, S. (1984). Lasting consequences of bilateral medial temporal lobectomy: Clinical course and experimental findings in HM. Seminars in Neurology, 4, 249–259.
Davachi, L. (2006). Item, context and relational episodic encoding in humans. Current Opinion in Neurobiology, 16, 693–700.
Davachi, L., Mitchell, J. P., & Wagner, A. D. (2003). Multiple routes to memory: Distinct medial temporal lobe processes build item and source memories. Proceedings of the National Academy of Sciences, USA, 100, 2157–2162.
Day, M., Langston, R., & Morris, R. G. M. (2003, July 10). Glutamate-receptor-mediated encoding and retrieval of paired-associate learning. Nature, 424, 205–209.
Dusek, J. A., & Eichenbaum, H. (1997). The hippocampus and memory for orderly stimulus relations. Proceedings of the National Academy of Sciences, USA, 94, 7109–7114.
Eichenbaum, H. (2000). A cortical-hippocampal system for declarative memory. Nature Reviews Neuroscience, 1, 41–50.
Eichenbaum, H. (2004). Hippocampus: Cognitive processes and neural representations that underlie declarative memory. Neuron, 44, 109–120.
Eichenbaum, H., & Cohen, N. J. (2001). From conditioning to conscious recollection: Memory systems of the brain. New York: Oxford University Press.
Eichenbaum, H., Dudchenko, P., Wood, E., Shapiro, M., & Tanila, H. (1999). The hippocampus, memory, and place cells: Is it spatial memory or a memory space? Neuron, 23, 209–226.
Eichenbaum, H., Yonelinas, A. P., & Ranganath, C. (2007). The medial temporal lobe and recognition memory. Annual Review of Neuroscience, 30, 123–152.
Ekstrom, A. D., Kahana, M. J., Caplan, J. B., Fields, T. A., Isham, E. A., Newman, E. L., et al. (2003, September 11). Cellular networks underlying human spatial navigation. Nature, 425, 184–187.
Ergorul, C., & Eichenbaum, H. (2004). The hippocampus and memory for “What,” “Where,” and “When.” Learning and Memory, 11, 397–405.
Fortin, N. J., Agster, K. L., & Eichenbaum, H. (2002). Critical role of the hippocampus in memory for sequences of events. Nature Neuroscience, 5, 458–462.
Fortin, N. J., Wright, S. P., & Eichenbaum, H. (2004, September 9). Recollection-like memory retrieval in rats is dependent on the hippocampus. Nature, 431, 188–191.
Gaffan, E. A., Healey, A. N., & Eacott, M. J. (2004). Objects and positions in visual scenes: Effects of perirhinal and postrhinal cortex lesions in the rat. Behavioral Neuroscience, 118, 992–1010.
Gothard, K. M., Skaggs, W. E., Moore, K. M., & McNaughton, B. L. (1996). Binding of hippocampal CA1 neural activity to multiple reference frames in a landmark-based navigation task. Journal of Neuroscience, 16, 823–835.
Hampson, R. E., Heyser, C. J., & Deadwyler, S. A. (1993). Hippocampal cell firing correlates of delayed-match-to-sample performance in the rat. Behavioral Neuroscience, 107, 715–739.
Hampson, R. E., Pons, T. P., Stanford, T. R., & Deadwyler, S. A. (2004). Categorization in the monkey hippocampus: A possible mechanism for encoding information into memory. Proceedings of the National Academy of Sciences, USA, 101, 3184–3189.
Hargreaves, E. L., Rao, G., Lee, I., & Knierim, J. J. (2005, June 17). Major dissociation between medial and lateral entorhinal input to dorsal hippocampus. Science, 308(5729), 1792–1794.
Hartley, T., Maguire, E. A., Spiers, H. J., & Burgess, N. (2003). The well-worn route and the path less traveled: Distinct neural bases of route following and wayfinding in humans. Neuron, 37, 877–888.
Heckers, S., Zalezak, M., Weiss, A. P., Ditman, T., & Titone, D. (2004). Hippocampal activation during transitive inference in humans. Hippocampus, 14, 153–162.
Henson, R. N., Cansino, S., Herron, J. E., Robb, W. G., & Rugg, M. D. (2003). A familiarity signal in human anterior medial temporal cortex? Hippocampus, 13, 301–304.
Henson, R. N., Rugg, M. D., Shallice, T., Josephs, O., & Dolan, R. J. (1999). Recollection and familiarity in recognition memory: An event-related functional magnetic resonance imaging study. Journal of Neuroscience, 19, 3962–3972.
Hollup, S. A., Molden, S., Donnett, J. G., Moser, M. B., & Moser, E. I. (2001). Accumulation of hippocampal place fields at the goal location in an annular watermaze task. Journal of Neuroscience, 21, 1635–1644.
Kesner, R. P., Gilbert, P. E., & Barua, L. A. (2002). The role of the hippocampus in memory for the temporal order of a sequence of odors. Behavioral Neuroscience, 116, 286–290.
Kreiman, G., Koch, C., & Fried, I. (2000). Category specific visual responses of single neurons in the human medial temporal lobe. Nature Neuroscience, 3, 946–953.
Krubitzer, L., & Kaas, J. (2005). The evolution of the neocortex in mammals: How is phenotypic diversity generated? Current Opinion in Neurobiology, 15, 444–453.
Kumaran, D., & Maguire, E. A. (2006). The dynamics of hippocampal activation during encoding of overlapping sequences. Neuron, 49, 617–629.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user’s guide. New York: Cambridge University Press.
Manns, J. R., & Eichenbaum, H. (2006). Evolution of the hippocampus. In J. H. Kaas (Ed.), Evolution of nervous systems (Vol. 3, pp. 465–490). New York: Academic Press.
Moita, M. A. P., Rosis, S., Zhou, Y., LeDoux, J. E., & Blair, H. T. (2003). Hippocampal place cells acquire location specific responses to the conditioned stimulus during auditory fear conditioning. Neuron, 37, 485–497.
Moscovitch, M. (1995). Recovered consciousness: A hypothesis concerning modularity and episodic memory. Journal of Clinical and Experimental Neuropsychology, 17, 276–290.
Muller, R. U., Kubie, J. L., & Ranck, J. B., Jr. (1987). Spatial firing patterns of hippocampal complex spike cells in a fixed environment. Journal of Neuroscience, 7, 1935–1950.
Norman, G., & Eacott, M. J. (2005). Dissociable effects of lesions to the perirhinal cortex and the postrhinal cortex on memory for context and objects in rats. Behavioral Neuroscience, 119, 557–566.
Northoff, G., & Bermpohl, F. (2004). Cortical midline structures and the self. Trends in Cognitive Sciences, 8, 102–107.
O’Reilly, R. C., & Rudy, J. W. (2001). Conjunctive representations in learning and memory: Principles of cortical and hippocampal function. Psychological Review, 108, 311–345.
Otto, T., & Eichenbaum, H. (1992). Complementary roles of orbital prefrontal cortex and the perirhinal-entorhinal cortices in an odor-guided delayed non-matching to sample task. Behavioral Neuroscience, 106, 763–776.
Parks, C. M., & Yonelinas, A. P. (2007). Moving beyond signal detection models: Comment on Wixted. Psychological Review, 114, 188–202.
Pihlajamaki, M., Tanila, H., Kononen, M., Hanninen, A., Soininen, H., & Aronen, H. J. (2004). Visual presentation of novel objects and new spatial arrangements of objects differentially activates the medial temporal lobe areas in humans. European Journal of Neuroscience, 19, 1939–1949.
Preston, A., Shrager, Y., Dudukovic, N. M., & Gabrieli, J. D. E. (2004). Hippocampal contribution to the novel use of relational information in declarative memory. Hippocampus, 14, 148–152.
Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., & Fried, I. (2005, June 23). Invariant visual representation by single neurons in the human brain. Nature, 435(7045), 1102–1107.
Ranganath, C., Yonelinas, A. P., Cohen, M. X., Dy, C. J., Tom, S. M., & D’Esposito, M. (2003). Dissociable correlates of recollection and familiarity within the medial temporal lobes. Neuropsychologia, 42, 2–13.
Rivard, B., Li, Y., Lenck-Santini, P.-P., Poucet, B., & Muller, R. U. (2004). Representation of objects in space by two classes of hippocampal pyramidal cells. Journal of General Physiology, 124, 9–25.
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery, and Psychiatry, 20, 11–21.
Shelton, A. L., & Gabrieli, J. D. E. (2002). Neural correlates of encoding space from route and survey perspectives. Journal of Neuroscience, 22, 2711–2717.
Squire, L. R., Stark, C. E. L., & Clark, R. E. (2004). The medial temporal lobe. Annual Review of Neuroscience, 27, 279–306.
Suzuki, W. A., & Amaral, D. G. (1994). Perirhinal and parahippocampal cortices of the macaque monkey: Cortical afferents. Journal of Comparative Neurology, 350, 497–533.
Suzuki, W. A., & Eichenbaum, H. (2000). The neurophysiology of memory. Annals of the New York Academy of Sciences, 911, 175–191.
Suzuki, W. A., Zola-Morgan, S., Squire, L. R., & Amaral, D. G. (1993). Lesions of the perirhinal and parahippocampal cortices in the monkey produce long-lasting memory impairment in the visual and tactual modalities. Journal of Neuroscience, 13, 2430–2451.
Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1–25.
Uncapher, M. R., Otten, L. J., & Rugg, M. D. (2006). Episodic encoding is more than the sum of its parts: An fMRI investigation of multifeatural contextual encoding. Neuron, 52, 547–556.
Van Elzakker, M., O’Reilly, R. C., & Rudy, J. W. (2003). Transitivity, flexibility, conjunctive representations, and the hippocampus: Pt. I. An empirical analysis. Hippocampus, 13, 334–340.
Wais, P. E., Wixted, J. T., Hopkins, R. O., & Squire, L. R. (2006). The hippocampus supports both the recollection and the familiarity components of recognition memory. Neuron, 49, 459–466.
Waltz, J. A., Knowlton, B. J., Holyoak, K. J., Boone, K. B., Mishkin, F. S., Santos, M. M., et al. (1999). A system for relational reasoning in human prefrontal cortex. Psychological Science, 10, 119–125.
Wirth, S., Yanike, M., Frank, L. M., Smith, A. C., Brown, E. N., & Suzuki, W. A. (2003, June 6). Single neurons in the monkey hippocampus and learning of new associations. Science, 300, 1578–1581.
Witter, M. P., Wouterlood, F. G., Naber, P. A., & Van Haeften, T. (2000). Anatomical organization of the parahippocampal-hippocampal network. Annals of the New York Academy of Sciences, 911, 1–24.
Wixted, J. T. (2007). Dual process theory and signal detection theory of recognition memory. Psychological Review, 114, 152–176.
Wixted, J. T., & Stretch, V. (2004). In defense of the signal detection interpretation of remember/know judgments. Psychonomic Bulletin and Review, 11, 616–641.
Wood, E., Dudchenko, P. A., & Eichenbaum, H. (1999, February 18). The global record of memory in hippocampal neuronal activity. Nature, 397, 613–616.
Wood, E., Dudchenko, P. A., Robitsek, R. J., & Eichenbaum, H. (2000). Hippocampal neurons encode information about different types of memory episodes occurring in the same location. Neuron, 27, 623–633.
Yonelinas, A. P. (2001). Components of episodic memory: The contribution of recollection and familiarity. Philosophical Transactions of the Royal Society of London: Biological Sciences, 356, 1363–1374.
Yonelinas, A. P. (2002). The nature of recollection and familiarity: A review of 30 years of research. Journal of Memory and Language, 46, 441–517.
Yonelinas, A. P., Kroll, N. E., Quamme, J. R., Lazzara, M. M., Sauve, M. J., Widaman, K. F., et al. (2002). Effects of extensive temporal lobe damage or mild hypoxia on recollection and familiarity. Nature Neuroscience, 5, 1236–1241.
Yonelinas, A. P., Otten, L. J., Shaw, K. N., & Rugg, M. D. (2005). Separating the brain regions involved in recollection and familiarity in recognition memory. Journal of Neuroscience, 25, 3002–3008.
Chapter 29
Psychological and Neural Mechanisms of Short-Term Memory
CINDY LUSTIG, MARC G. BERMAN, DEREK EVAN NEE, RICHARD L. LEWIS, KATHERINE SLEDGE MOORE, AND JOHN JONIDES
To comprehend this sentence, you must hold the beginning phrase in mind while reading and processing the rest of the words. The ability to remember and process information over a short time is essential to almost any activity, making short-term memory essential for cognition. In this chapter, we integrate psychological constructs of short-term memory that are drawn from behavioral data with their likely neural bases, especially as revealed by studies of patients and studies that use neuroimaging. Our discussion is organized around three questions that any account of short-term memory must address:
1. What is the structure of short-term memory? A proper theory must describe an architecture that implements the short-term storage of representations. The dominant answer to this question has long been a model consisting of short-term storage buffers that are coordinated by a central executive and that are dissociable from long-term storage. Recently, there has been a shift toward models that do not distinguish short- and long-term stores. Instead, these models posit that short-term memory consists of a focus of attention that operates on perceptual and long-term memory representations. Our review focuses on these recent models and their likely neural underpinnings.
2. What processes operate on the stored information? A proper theory must articulate the processes that create and operate on representations, and how these processes can be implemented within the structure described. These processes may include encoding and maintenance operations, shifts of the attentional focus, and retrieval of items into the focus of attention. Although rehearsal is often colloquially associated with short-term memory, we argue that it represents a strategic use of retrieval rather than a primary process.
3. What causes forgetting? A complete theory of short-term memory must describe how information learned only seconds ago can be forgotten. We consider the behavioral and neurophysiological evidence for the two dominant accounts of forgetting (interference and decay) and we suggest a possible mechanism for short-term forgetting that may underlie both proposed accounts.
After addressing these questions, we sketch out a model that illustrates the links between psychological constructs and neural structures as an item moves through the stages of short-term memory, from initial perception, through maintenance over time and in the face of interfering information, to ultimate retrieval back into the focus of attention. To presage that model, we argue that short-term memory exhibits the following properties:
• Short-term memory consists of the temporary activation of long-term memory or perceptual representations.
• This temporary activation is severely limited to at most four representations.
• There are elementary processes that operate on these representations to encode, maintain, and retrieve them.
• Forgetting is largely accounted for by interference among competing representations.
STRUCTURE OF SHORT-TERM MEMORY
Classic Model: Short-Term and Long-Term Memory as Separate Stores
Any discussion of short-term memory (STM) must begin with the highly influential model developed by Baddeley and colleagues (e.g., Baddeley, 1986, 1992; Baddeley & Hitch,
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
1974; Repovš & Baddeley, 2006). This model is the prototypical example of multistore models of short-term memory. The defining feature of these models is that they describe STM as separate from long-term memory (LTM). That is, information held in mind over the course of a few seconds is stored separately from information held over the course of long periods of time. Other common features include the separation of STM into different buffers based on information modality, and the separation between storage buffers and the executive control processes that coordinate the buffers and operate on the material within them. Figure 29.1 illustrates Baddeley’s working memory model (Baddeley, 2000; Baddeley & Hitch, 1974) and the brain structures that have been linked to each component (Smith & Jonides, 1999). Fundamental components of the model include the short-term storage buffers, which differ by type of information and are distinct from long-term storage. The phonological loop is assumed to hold information that can be rehearsed verbally (e.g., letters, digits). A visuospatial sketchpad is assumed to maintain visual information and can be further fractionated into visual/object and spatial stores (Repovš & Baddeley, 2006; Smith et al., 1995). An episodic buffer that draws on the other buffers and LTM has been added to account for the retention of multimodal information (Baddeley, 2000). A separate central executive is responsible for working memory processes that require operations on the items stored in the buffers. This central executive is also thought to be responsible for coordinating the interplay among the various buffers and their interactions with LTM. The earliest evidence for buffers that vary by modality came from studies showing that secondary verbal tasks interfered with verbal STM but not visual STM, and vice versa (e.g., Brooks, 1968; den Heyer & Barrett, 1971). This double dissociation implied uniquely verbal processes
for verbal STM, and uniquely visual processes for visual STM, arguing for separate stores. More recent neuroimaging research has further investigated the neural correlates of the reputed independence of STM buffers. Verbal STM has been shown to rely primarily on left inferior frontal and left parietal cortices, spatial STM on right posterior dorsal frontal and right parietal cortices, and object/visual STM on left inferior frontal, left parietal, and left inferior temporal cortices (e.g., Awh et al., 1996; Jonides et al., 1993; Smith & Jonides, 1997; see review by Wager & Smith, 2003). Verbal STM shows a marked left-hemisphere preference, whereas spatial and object STM can be distinguished mainly by a dorsal versus ventral separation in posterior cortices (consistent with Ungerleider & Haxby, 1994; see Baddeley, 2003, for an account of the function of these regions in the service of STM). These neural dissociations provide further evidence for separable short-term stores, and they are illustrated as well in Figure 29.2, from Fuster (2001), who laid out an argument for the separability of storage by modality and the interaction of these storage systems with frontal mechanisms. The idea of separate storage by modality is still well accepted, especially with regard to the posterior regions (e.g., left parietal for verbal, right parietal for spatial). However, the rest of Baddeley’s model, which argues for separable STM and LTM systems, is less well supported. Initially, the most compelling data to motivate a separation of STM from LTM came from brain-injured patients who seemed to show a double dissociation between the two systems. Patients with parietal and temporal lobe damage showed impaired short-term phonological capabilities but
Figure 29.1 Baddeley’s working memory model. Components and associated brain regions: central executive (dorsal frontal); verbal buffer (left inferior frontal, left parietal); spatial buffer (right posterior dorsal frontal, right parietal cortices); object/visual buffer (left inferior frontal, left parietal, left inferior temporal cortices).
Figure 29.2 Cortical interactions in working memory. Panel labels: Visual, Auditory, Tactile, Spatial; trial phases: Sample or Cue, Delay, Match or Choice. Note: Fuster emphasized the reciprocal and reentrant connections between the prefrontal cortex and sensory regions in working memory; note that similar connectivity applies in long-term memory. From “The Prefrontal Cortex: An Update: Time Is of the Essence,” by J. M. Fuster, 2001, Neuron, 30, p. 329. Reprinted with permission.
intact long-term memory (Shallice & Warrington, 1970; Vallar & Papagno, 2002). Conversely, patients with medial temporal lobe (MTL) damage were often claimed to demonstrate impaired long-term memory but preserved short-term memory (e.g., Baddeley & Warrington, 1970; Scoville & Milner, 1957). However, some research suggests that MTL patients’ real problem may be forming new associations and bindings, a process that is preferentially tapped by long-term episodic memory tests but that can also be required by STM tests. For example, if the task requires associating an item with a particular spatial location, these patients show profound deficits even after very short delays (Olson, Page, Moore, Chatterjee, & Verfaellie, 2006). On the other side of the STM/LTM dissociation, patients with left perisylvian damage that results in STM deficits also have deficits in phonological processing in general, suggesting a deficit that extends beyond STM per se (e.g., Martin, 1993). Finally, functional neuroimaging data from healthy adults also suggest that STM and LTM have more commonalities than differences (e.g., Braver et al., 2001; Cabeza, Dolcos, Graham, & Nyberg, 2002; Ranganath & Blumenfeld, 2005). Thus, the view that STM and LTM are separable based on studies of patients is open to reinterpretation. Another argument for separate STM and LTM systems arose from early work using single-unit recordings that appeared to identify cortical regions specialized for STM, consistent with the assumption that the two types of memory are stored in separate areas of the brain. This work showed single-unit activity in dorsolateral prefrontal cortical regions (principal sulcus, inferior convexity) that was selectively responsive to memoranda during the delay (retention interval) of STM tasks. 
This delay activity was interpreted as evidence that these regions were the storage sites for STM (e.g., Funahashi, Bruce, & Goldman-Rakic, 1989; Fuster, 1973; Wilson, O’Scalaidhe, & Goldman-Rakic, 1993; see Jacobsen, 1936, for preceding lesion work). However, the sustained activation of frontal cortex during the delay period does not necessarily mean that this region is a site of STM storage. As we review in the following section, many other regions of neocortex also show activation that outlasts the physical presence of a stimulus and provides a possible neural basis for STM storage. The alternative view that we promote is that prefrontal cortical involvement in STM reflects the operation of processes that guide the use (encoding, maintenance, and retrieval) of information that is primarily perceived and stored via posterior regions (the same areas of purported LTM storage). This view receives support from studies showing that prefrontal cortical involvement may not be necessary for STM except in the face of distraction (Malmo, 1942; Postle & D’Esposito, 1999). By contrast, patients with left temporo-parietal damage show deficits in
phonological storage, regardless of the effects of interference (Vallar & Baddeley, 1984; Vallar & Papagno, 2002). One item on which multistore and unitary-store models agree is that central executive control processes are primarily implemented by prefrontal cortex (see, e.g., Figure 29.3). A meta-analysis of 60 functional neuroimaging studies indicated that increased demand for executive processing recruits the dorsolateral frontal cortex and posterior parietal cortex (Wager & Smith, 2003). By contrast, storage processes recruit predominantly posterior areas in the primary and secondary association cortex. These results corroborate the evidence from lesion studies and support the distinction between storage and executive processing.
Unitary-Store Models: Shared Representations in Perception, STM, and LTM
Figure 29.4 illustrates several unitary-store models. The shared assumption of these models is that STM consists of a temporary activation of the same representations used for initial perception and LTM. As shown in Figure 29.4, the types of activation associated with STM may include both a privileged status in the focus of attention (most likely
Figure 29.3 (Figure C.29 in color section) Executive functions are largely localized in prefrontal cortex. Note: Each color indicates data from studies of a different type of executive process. Green = response conflict; pink = task novelty; yellow = working memory load; red = working memory delay; blue = perceptual difficulty. Labels in the figure: IFS, SF, CC. From “Common Regions of the Human Frontal Lobe Recruited by Diverse Cognitive Demands,” by J. Duncan and A. M. Owen, 2000, Trends in Neurosciences, 23, pp. 475–483. Adapted with permission.
[Figure 29.4 panels. (A) McElree 2001: LTM with varying levels of activation; focus of 1 item. (B) Cowan 2000: non-activated portion of LTM; activated portion of LTM; focus of 4 items. (C) Oberauer 2002: non-activated portion of LTM; activated portion of LTM; region of direct access, 3 items; focus of 1 item.]
Note: A: From “Working Memory and Focal Attention,” by B. McElree, 2001, Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, pp. 817–835. Adapted with permission. B: From “The Magical Number 4 in Short-Term Memory: A Reconsideration of Mental Storage Capacity,” by N. Cowan, 2000, Behavioral and Brain Sciences, 24, pp. 87–185. Adapted with permission. C: From “Access to Information in Working Memory: Exploring the Focus of Attention,” by K. Oberauer, 2002, Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, pp. 411–421. Adapted with permission.
implemented by active firing of the neurons involved in the representation) and out-of-focus but highly activated representations within LTM (perhaps implemented by short-term plasticity and synchronization of spontaneous activity in the neurons composing the representation). As we elaborate in the following section, the major distinction among models within the unitary-store family concerns the size or capacity of the attentional focus. Early versions of unitary-store models (e.g., Anderson, 1983; Atkinson & Shiffrin, 1971; Hebb, 1949) fell out of favor during the predominance of the multistore account. Recent developments have called some of the assumptions of the multistore model into doubt (see Jonides et al., 2008; Postle, 2006, for a more detailed discussion of the issues). At the same time, unitary-store models have been revived and elaborated (Anderson et al., 2004; Cowan, 1988, 1995, 2000; McElree, 2001; Oberauer, 2002; Verhaeghen, Cerella, & Basak, 2004, and others). When an item is initially perceived, it activates a distributed set of neurons throughout the brain regions involved in processing its different feature components: For a visually presented item, these would include some neurons in V4 that code color information about the object, some in the inferotemporal cortex that code shape information, and so on. Medial temporal lobe structures are involved in processing contextual information, including those aspects later needed for episodic memory. Depending on consolidation processes, the coordinated pattern of activation among the different feature components ultimately results in synaptic changes and long-term storage. Where does STM fit into this picture? Our take on this question is most like Oberauer’s (2002; see Figure 29.4C). One representation is in the focus of attention, either as the
result of recent perception or retrieval from LTM. The network of neurons involved in this representation is actively firing in conjunction with frontal and parietal networks involved in attention to that representation. This representation is immediately available and accessible; functionally, this means that it may be used to guide immediate action. A limited set of other recently perceived or retrieved representations are not in the focus, but maintain a relatively high level of activation and availability, perhaps implemented by short-term plasticity mechanisms such as increased coordination of spontaneous activity (Destexhe & Contreras, 2006; Sussillo, Toyoizumi, & Maass, 2007). This is the state that Oberauer (2002) terms “the region of direct access” (region here does not refer to a brain region but to a functional state), where roughly three items are in a heightened state of activation and can be accessed faster than those in the activated portions of LTM, but more slowly than the one in the focus of attention. In short, unitary-store models posit that perception, STM, and LTM use the same underlying representations, but the state of those representations (active firing, short-term plasticity, long-term plasticity) may differ depending on which of these functions is involved. This contrasts with multistore models, which posit STM buffers that are separate from LTM storage. Unitary- and multistore models agree that posterior regions are clearly differentiated by information type (e.g., auditory, visual, spatial). However, they differ in their view of frontal activations: Unitary-store models view these as primarily related to processes operating on the representations, especially those involved in selecting a representation for the focus of attention and keeping it there. From the multistore perspective, frontal activations are coding the content of the representation itself; they are the site of the STM buffers. The current weight of evidence, from
Figure 29.4 Unitary STM models.
[Figure 29.5, left panel: left (L) and right (R) IPS regions tracking VSTM load up to four items; t scale from 4.35 to 5.97.]
Figure 29.5 (Figure C.30 in color section) In a visual short-term memory (VSTM) task, the intraparietal sulcus (IPS) tracks capacity up to four items. Note: Pictured on the left are IPS regions (left and right) activated during the memory task. Pictured on the right is a timecourse of IPS activity
neuroimaging studies, lesion data, and modeling work (see reviews by Damasio, 1989; McClelland, McNaughton, & O’Reilly, 1995; Reuter-Lorenz & Jonides, 2007), favors the unitary-store models.
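The three activation states described above (focus of attention, region of direct access, activated LTM) can be pictured with a small toy model. The sketch below is only an illustration of the bookkeeping in an Oberauer-style account, not a model from the literature: the class name `UnitaryStore`, its method names, and the rule that the oldest direct-access item drops back to activated LTM are all assumptions made for the example. The capacities (one item in the focus, about three in direct access) are the ones quoted in the text.

```python
from collections import deque


class UnitaryStore:
    """Toy bookkeeping for one pool of representations in three states.

    Hypothetical illustration only: names and the displacement rule
    are assumptions, not part of any published model.
    """

    def __init__(self, direct_access_capacity=3):
        self.focus = None  # at most one representation in the focus
        self.direct_access = deque(maxlen=direct_access_capacity)
        self.activated_ltm = []  # recently used, but less accessible

    def attend(self, item):
        """Bring an item into the focus by perception or retrieval."""
        if item == self.focus:
            return
        # The item leaves whatever state it currently occupies.
        if item in self.direct_access:
            self.direct_access.remove(item)
        if item in self.activated_ltm:
            self.activated_ltm.remove(item)
        # The previous focus is displaced into the region of direct access;
        # if that region is full, its oldest item falls back to activated LTM.
        if self.focus is not None:
            if len(self.direct_access) == self.direct_access.maxlen:
                self.activated_ltm.append(self.direct_access.popleft())
            self.direct_access.append(self.focus)
        self.focus = item

    def state_of(self, item):
        """Report the functional state of a representation."""
        if item == self.focus:
            return "focus"
        if item in self.direct_access:
            return "direct access"
        if item in self.activated_ltm:
            return "activated LTM"
        return "non-activated LTM"


store = UnitaryStore()
for letter in "ABCDE":
    store.attend(letter)
```

Note that nothing is copied between stores here: an item is always the same object, and only its state label changes, which is the core claim of the unitary-store family.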
Controversies over Capacity
Regardless of whether one subscribes to multi- or unitary-store models, an important question is how much information can be held within a storage buffer (multistore models) or the focus of attention (unitary-store models). Multistore models describe capacity limits as dependent on the individual buffers, in particular the speed with which information can be rehearsed in that buffer versus the speed with which information is forgotten (Baddeley, 1986, 1992; Repovš & Baddeley, 2006). In the verbal domain, for example, it has been shown that approximately two seconds’ worth of verbal information can be recirculated successfully (e.g., Baddeley, Thomson, & Buchanan, 1975). In unitary-store models, there may be some constraints imposed by the material-specific aspects of the representation, but the critical questions surround the capacity of the focus of attention. Even Miller’s (1956) classic paper acknowledged that the traditional estimate of “seven plus or minus two” is too large because it is based on studies that allowed participants to engage in processes of rehearsal and chunking, and therefore reflects contributions of the focus of attention, selectively activated representations, and LTM (see also Cowan, 2000; Waugh & Norman, 1965). Current models differ on whether the capacity is about four items (Cowan, 2000) or only one (Garavan, 1998; McElree, 2001; Verhaeghen & Basak, in press). Figure 29.4 shows the slight variations among these unitary STM models; we next review some of the evidence and issues surrounding this capacity debate.
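The two-second rehearsal limit mentioned above implies a simple piece of arithmetic: predicted verbal span is roughly the number of items that can be articulated in about two seconds. The sketch below makes that explicit. The function name and the example articulation rates are hypothetical; only the two-second window comes from the text.

```python
REHEARSAL_WINDOW_S = 2.0  # approximate window reported in the text


def predicted_verbal_span(items_per_second, window_s=REHEARSAL_WINDOW_S):
    """Items that fit in the rehearsal window at a given articulation rate."""
    return items_per_second * window_s


# Hypothetical articulation rates, chosen only to show the direction of the effect:
# material that is spoken faster yields a larger predicted span.
short_word_span = predicted_verbal_span(3.0)
long_word_span = predicted_verbal_span(1.5)
```

On this account the limit is stated in time, not in items, so it contrasts with the item-based capacity estimates discussed in the rest of this section.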
[Figure 29.5, right panel: signal change (%) plotted against time (s), with one line per set size of 1, 2, 3, 4, 6, or 8 items.]
following stimulus presentation. Different patterned lines indicate the number of items to be remembered (1, 2, 3, 4, 6, or 8) on each trial. From “Capacity Limit of Visual Short-Term Memory in Human Posterior Parietal Cortex,” by J. J. Todd and R. Marois, 2004, Nature, 428, pp. 751–754. Adapted with permission.
Behavioral and Neural Evidence for the Magic Number 4
Cowan (2000) reviewed an impressive body of evidence leading to his conclusion that the capacity limit is four items, plus or minus one (e.g., Sperling, 1960; see Cowan’s table 1). An important line of evidence comes from change-detection and other tasks that do not require the serial recall of individual items; serial recall may produce interference in output and therefore underestimate capacity. For example, Luck and Vogel (1997) presented subjects with 1 to 12 colored squares in an array. After a blank interval, another array of squares was presented, in which one square may have changed color. Subjects were to respond whether the arrays were identical. In these experiments and others (e.g., Pashler, 1988), there are sharp drop-offs in performance after approximately four items. Electrophysiological and neuroimaging studies also support the idea of a four-item capacity limit. The first such report was by Vogel and Machizawa (2004), who recorded event-related potentials (ERPs) from subjects as they performed a visual change-detection task. ERPs recorded shortly after the onset of the retention interval in this task indicated a negative-going wave over parietal and occipital sites that persisted for the duration of the retention interval and was sensitive to the number of items held in memory. Importantly, this signal plateaued when the array size reached between three and four items. The amplitude of this activity was strongly correlated with estimates of each subject’s memory capacity and was less pronounced on incorrect than correct trials, indicating that it was strongly related to performance. Subsequent functional magnetic resonance imaging (fMRI) studies have observed similar load- and accuracy-dependent activations, especially in intraparietal and intraoccipital sulci (see Figure 29.5:
Todd & Marois, 2004, 2005). These particular regions have been implicated by others (e.g., Yantis & Serences, 2003) in the control of attentional allocation, supporting the idea that one rate-limiting step in STM capacity has to do with the allocation of attention (Cowan, 2000; McElree, 1998, 2001; Oberauer, 2002).
Evidence for More Severe Limits on Focus Capacity
Others argue for a much more limited capacity of just one item (e.g., Garavan, 1998; McElree, 2001; Verhaeghen & Basak, in press). This estimate is based on studies using a combination of response time and accuracy as measures of performance. For example, Garavan (1998) required subjects to keep two running counts in STM (one for triangles and one for squares) as shape stimuli appeared one after another in random order. This task can be seen in Figure 29.6. Subjects controlled their own presentation rate, and Garavan measured the time spent processing each figure before moving on to the next. Responses to a figure of one category (e.g., a triangle) that followed a figure from the other category (e.g., a square) were fully 500 ms longer than responses to the second of two figures from the same category (e.g., a triangle followed by another triangle). These findings suggested that attention could be focused on only one internal counter in STM at a time. Switching attention from one counter to another incurred a substantial cost in time. Other evidence comes from McElree (1998), who found that the last item in a list was retrieved substantially faster than other items, suggesting that it was still in the focus. Other items were retrieved at a rate that was substantially slower than the last item. Those other items, however, were retrieved at similar rates (which McElree deems a measure of accessibility, the speed of retrieval) despite differing in accuracy (which McElree describes as a measure of availability, the probability of successful retrieval).
Oberauer (2002) suggested a compromise solution to the “one versus four” debate. In his model, up to four items can be highly accessible, but only one of these items can be in the focus of attention. This model is similar to that of Cowan (2000), but adds the assumption that an important method of accessing short-term memories is to focus attention on one item, depending on task demands. Thus, in tasks that serially demand attention to several items (such as those of Garavan, 1998, or McElree, 2001), the mechanism that accomplishes this involves changes in the focus of attention among temporarily activated representations in LTM.
Alternatives to Capacity Limits Based on Number of Items
Some researchers disagree with fixed item-based limits, in part because these limits seem mutable. For example, practice may improve subjects’ ability to use processes such as chunking to allow greater capacities by tying together individual items into a single unit (McElree, 1998; Verhaeghen et al., 2004; but see Oberauer, 2006). However, proponents of the fixed-capacity view might retort that practice alters the amount of information that can be coded into a single representation, not the total number of representations that can be held in STM (Miller, 1956). Another attack on fixed-capacity views comes from questioning the assumption that items are the appropriate unit for expressing capacity limits. Wilken and Ma (2004) demonstrated that a signal-detection account of STM, in
[Figure 29.6 content, in time order. Trial 1: square counter 1, triangle counter 0. Trial 2: square counter 1, triangle counter 1. Trial 3: square counter 1, triangle counter 2. Trial 4: square counter 2, triangle counter 2. Trial 5: square counter 3, triangle counter 2. A 500 ms counter switch cost is marked at the two category changes, before Trial 2 and before Trial 4.]
Figure 29.6 The counter-updating task.
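The logic of the counter-updating task can be made concrete with a short simulation. This is a minimal sketch, assuming a single-item focus that holds one counter at a time and a fixed 500 ms penalty whenever the incoming shape requires the other counter; the function name, the `base_ms` parameter, and the choice to charge no switch cost on the first trial are assumptions made for illustration. The 500 ms figure is the one reported for Garavan (1998) in the text.

```python
SWITCH_COST_MS = 500  # switch cost reported in the text for Garavan (1998)


def run_counter_task(shapes, base_ms=0):
    """Update one running count per shape category and return
    (final counters, predicted time per trial in ms).

    Assumes only the most recently updated counter is in the focus,
    so updating the other counter incurs the switch cost.
    """
    counters = {}
    focused = None  # category whose counter is currently in the focus
    trial_times = []
    for shape in shapes:
        time_ms = base_ms
        if focused is not None and shape != focused:
            time_ms += SWITCH_COST_MS  # attention moves to the other counter
        focused = shape
        counters[shape] = counters.get(shape, 0) + 1
        trial_times.append(time_ms)
    return counters, trial_times


# A five-trial sequence that yields the counter values listed for Figure 29.6.
sequence = ["square", "triangle", "triangle", "square", "square"]
final_counts, times = run_counter_task(sequence)
```

With `base_ms` left at zero, the returned times isolate the switch cost itself, so the pattern of extra time across trials can be compared directly with the repeat-versus-switch contrast described for the task.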
which STM capacity is primarily constrained by noise, fit behavioral data better than an item-based fixed-capacity model. Recent data from change-detection tasks suggest that object complexity (Eng, Chen, & Jiang, 2005) and similarity (Awh, Barton, & Vogel, 2007) play an important role in determining capacity. Xu and Chun (2006) offer neuroimaging evidence that may reconcile the item-based and complexity accounts: In a change-detection task, they found that activation of inferior intraparietal sulcus tracked a capacity limit of four, but nearby regions were sensitive to the complexity of the memoranda, as were the behavioral results. Building on these findings, we suggest a new view of capacity. The fundamental idea that attention can be allocated to one piece of information in memory is correct, but the definition of what that one piece is needs to be clarified. It cannot be that just one physical item is in the focus of attention at any one time, because if that were so, hardly any computation would be possible. How could one add 3 + 4, for example, if attention could be allocated only to the “3” or the “4” or the “+” operation? We propose that what attention focuses on is what is bound together into a single functional context, whether that context is defined by time, space, or some other stimulus characteristic such as similarity or task relevance. By this account, attention can be placed on the whole problem “3 + 4,” allowing relevant computations to be made. Put another way, the critical unit is at the level of representation as perceived by the subject. This is not necessarily the same as the physical “item” presented by an experimenter. Chunking is one special case of a single representation holding multiple items. One can also think of more everyday examples: In considering this chapter, the level of representation could be the entire chapter, the current page, a single word, or a single letter.
Letters are bound together by the functional context of a word, and so on. Complexity comes into play by limiting the number of subcomponents that can be bound into one functional context. This approach has the advantage of permitting novel relations to be established among familiar items to form new representations. This addresses one of the criticisms of the purest form of the unitary models: If STM is strictly limited to an activated portion of LTM, then the system can never entertain new thoughts.
Summary
What is the structure of STM? We favor the unitary-store model, in which the representational bases for perception, STM, and LTM are identical. That is, the same neocortical representations that are the repository of semantic knowledge are activated when a piece of information is maintained for the short term, whether that activation is due to perceiving that information or retrieving it from LTM (Wheeler, Peterson, & Buckner, 2000). Different regions of neocortex represent different types of information (e.g., verbal, spatial), and it is therefore to be expected that STM is also organized by information type. Empirically, STM often cashes out as the four or so items whose representations can be temporarily activated and processed simultaneously. However, this item-based limit is flexible and dependent on factors such as complexity and experience. The critical feature of unitary-store models is the severely limited focus of attention. Although the capacity of that focus is still under debate, we believe it is one representation, although this representation may consist of several items bound together into one functional context. For the relatively simple stimuli used in laboratory experiments, this limit also appears to be around four items, suggesting that it may be related to the factors that place a similar four-item limit on subitizing, another attention-demanding process.
THREE CORE PROCESSES OF STM: ENCODING, MAINTENANCE, AND RETRIEVAL
How does this structure work; that is, what are the processes of STM? Many have been suggested, including rehearsal, attention shifts, updating, and interference resolution. However, we argue that these complex processes represent combinations or special cases of three basic types, which govern the transition of memory representations into and out of the focus of attention: Encoding processes select sensory information and transform it into the representation that occupies the focus, maintenance processes keep the representation in the focus and protect it from interference or decay, and retrieval processes bring information from the past back into the focus.
Encoding Items into the Focus
Although detailed accounts of encoding processes are usually left to theories of perception, most accounts of STM make several assumptions about how encoding occurs. First, perceptual information is assumed to have immediate but capacity-limited access to the focus of attention. Perceptual information can serve as the object of the focus just as information from the past does. Several of the experiments cited by Cowan (2000) as evidence for a capacity of four involved representations of objects presented currently or less than a second ago. These include visual tracking experiments (Pylyshyn et al., 1994), enumeration
(Trick & Pylyshyn, 1993), and whole-report of spatial arrays and spatiotemporal arrays (Darwin, Turvey, & Crowder, 1972; Sperling, 1960). Similarly, in McElree’s (2006) and Garavan’s (1998) experiments, each incoming item in the stream of material (words or letters or objects) is assumed to be momentarily represented in the focus. Second, current theories assume that encoding new representations into the focus results in the displacement of other representations. For example, in McElree’s single-item focus model, each incoming item has its turn in the focus and replaces the previous item. The work reviewed earlier showing performance discontinuities after the putative limit of STM capacity has been reached appears to support the idea of whole-item displacement (e.g., Cowan, 2000; Garavan, 1998; McElree, 2001; Oberauer, 2002). However, it is not clear how simple item-based displacement accounts for the effects of similarity or complexity on capacity estimates. One possibility is that these factors influence how items compete with each other for access to the focus. Another possibility is that complexity and similarity influence the set of featural components needed to represent an item, and items compete with each other for this limited feature-based representational resource. In other words, the more overlap there is between the patterns of activation that represent two items (in V4, inferotemporal cortex, fusiform gyrus, etc.), the more likely those items are to interfere with each other. We expand further on these ideas in the section on forgetting. Third, all current accounts assume that perceptual information does not have automatic, obligatory access to the focused state. Instead, given the severe limits on capacity, attentional control is required to ensure that task-relevant items are included in the focus, and task-irrelevant items are excluded.
Postle (2005) found that while subjects maintained information in STM, increased activity in the dorsolateral prefrontal cortex during the presentation of distracting material was accompanied by a selective decrease in inferior temporal regions involved in object representation. This pattern suggests that prefrontal regions selectively modulated posterior perceptual areas to prevent incoming sensory input from disrupting the representation of task-relevant memoranda. Just as Postle (2005) found evidence to suggest that prefrontal activations prevent distracting sensory information from being encoded, we suggest that frontal and parietal areas are responsible for selective attention toward relevant inputs. This involves biasing posterior sensory regions toward important target stimuli. As these items are encoded, the medial temporal lobe binds each item to a functional context (e.g., a temporal and/or spatial context). Simultaneously, short-term synaptic plasticity works across cortical areas to maintain the representation
(including contextual information) even once it is no longer active in the attentional focus. Zucker and Regehr (2002) identified at least three distinct plasticity mechanisms which begin to operate on this time scale (tens of milliseconds), and which together are sufficient to produce memories lasting several seconds. (For the use of this mechanism in a prominent neural network model of STM, see Burgess & Hitch, 1999, 2005, 2006.)
Maintaining Items in the Focus
Once an item is in the focus of attention, what keeps it there? If the item is in the perceptual present, the answer is clear: attentionally modulated perceptual encoding. The more pressing question is: What keeps something in the cognitive focus when it is not currently perceived? For many neuroscientists, this is the central question of STM: How is information held in mind for the purpose of future action after the perceptual input is gone? Extensive evidence from both humans and nonhuman primates supports the idea that prefrontal-posterior circuits underlie active maintenance. The role of posterior perceptual regions is relatively clear; activity in these regions likely recapitulates the initial perceptual encoding of the representation, but what of the activation in prefrontal circuits? For example, consider the classical evidence, introduced earlier, that some neurons fire selectively during the delay period in delayed-match-to-sample tasks (e.g., Funahashi et al., 1989; Fuster, 1973). Does this activation (and its counterpart in human neuroimaging studies; e.g., Jha & McCarthy, 2000) suggest that a representation of the information is also held in the prefrontal cortex, or does it reflect some other process? Early interpretations of these frontal activations linked them directly to STM representations (Goldman-Rakic, 1987).
By contrast, more recent theories suggest that they subserve attentional control processes that maintain representations in posterior areas (Ranganath, 2006; Ruchkin, Grafman, Cameron, & Berndt, 2003). For example, maintenance operations may modulate perceptual encoding to prevent incoming perceptual stimuli from disrupting the focused representation in posterior cortex (Postle, 2005). Mechanistic descriptions of how maintenance might occur are found in computational neural-network models hypothesizing that prefrontal cortical circuits support attractors, which are self-sustaining activation patterns observed in certain classes of recurrent networks (Hopfield, 1982; Polk, Simen, & Lewis, 2002). A major challenge is to develop computational models that are able to engage in active maintenance of representations in posterior cortex while simultaneously processing, to some degree, incoming perceptual material (see Renart, Parga, & Rolls, 1999, for one example).
We thus adopt the following view of maintenance operations: prefrontal and parietal regions perform attentional control processes by signaling posterior sensory regions to continue a high level of activation for the representation that is currently in the focus of attention even after it is no longer physically presented. These control regions may differ by type of material (e.g., Smith & Jonides, 1997). The attentional focus on this representation protects it from interference and decay and keeps it in an immediately accessible state. Other representations that are maintained in STM, but not in the focus, are supported by short-term plasticity mechanisms that increase the coordination or connection weights between the features of these representations. As we describe next, stochastic variability in the neurons that make up these representations may eventually lead them to decay, and they are also vulnerable to competitive interference from each other, other items in memory, or incoming stimuli (Zucker & Regehr, 2002).
[Figure 29.7 plot: d′ plotted against total processing time (sec), from 0.0 to 2.5 s, with one curve per serial position, SP-1 through SP-5.]
Figure 29.7 Results from McElree and Dosher (1989). Note: Observed average d′ values as a function of total processing time for serial positions one (SP1) through five (SP5). (Smooth functions are derived from the estimated parameters of exponential model fits.) From “Serial Position and Set Size in Short-Term Memory: Time Course of Recognition,” by B. McElree and B. A. Dosher, 1989, Journal of Experimental Psychology: General, 118, p. 357. Reprinted with permission.
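The “exponential model fits” mentioned in the note to Figure 29.7 can be illustrated with a shifted exponential approach to a limit, a form commonly used for speed-accuracy tradeoff curves: d′(t) = λ(1 − e^(−β(t − δ))) for t > δ, and 0 otherwise. The sketch below assumes that form; the caption does not give the equation or any fitted values, so the function name and every parameter value here are hypothetical. In this parameterization the asymptote λ corresponds to availability, and the rate β and intercept δ together describe accessibility.

```python
import math


def sat_dprime(t, asymptote, rate, intercept):
    """Shifted exponential approach to a limit.

    t          total processing time in seconds
    asymptote  limiting d' at long processing times (availability)
    rate       speed of rise toward the asymptote (accessibility)
    intercept  time at which performance first departs from chance
    """
    if t <= intercept:
        return 0.0
    return asymptote * (1.0 - math.exp(-rate * (t - intercept)))


# Hypothetical parameters, chosen only to mimic the qualitative pattern
# described in the text: the most recent item rises faster, while earlier
# items share one slower rate but differ in asymptote.
most_recent = {"asymptote": 4.0, "rate": 8.0, "intercept": 0.3}
earlier_a = {"asymptote": 3.0, "rate": 4.0, "intercept": 0.3}
earlier_b = {"asymptote": 2.0, "rate": 4.0, "intercept": 0.3}

curve_times = [0.25, 0.5, 1.0, 2.0]
recent_curve = [sat_dprime(t, **most_recent) for t in curve_times]
earlier_curve = [sat_dprime(t, **earlier_a) for t in curve_times]
```

Separating rate from asymptote is what allows retrieval speed and retrieval accuracy to be estimated independently from the same response-deadline data.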
Retrieval into the Focus
Most major STM theories do not include detailed treatments of retrieval, although the limited-focus models assume that there is some way of bringing information from LTM into the focus. This process can be labeled “retrieval” (cf. McElree, 2006; Sternberg, 1966), but that label does not imply the spatial metaphor of moving items from one store to another. Instead, it is important to keep in mind our assumption that the same underlying neural representations subserve both STM and LTM, and that the question is whether that representation is currently in the highly activated state that constitutes the focus. There is now considerable evidence, mostly from mathematical models of behavioral data, that STM retrieval of item information is a rapid, parallel, content-addressable process. Early models of STM retrieval (e.g., Sternberg, 1966) postulated a serial search process. However, current models favor a parallel search process because it can better account for reaction time data such as those shown in Figure 29.7. McElree and Dosher (1989) manipulated the response deadline in a standard item-recognition task, in which participants are presented with a rapid sequence of to-be-remembered verbal items (e.g., letters or digits), followed by a probe item. The task was to identify whether the probe was a member of the memory set. The speed at which an item was retrieved was thought to measure its accessibility (exhibited in Figure 29.7 by the rise to asymptote), whereas accuracy measured availability (exhibited in Figure 29.7 by the asymptote). As described previously, the last item (which is the most recent) is accessed more quickly than the others, suggesting that it is already in the focus. The other items are accessed at a uniform rate,
suggesting a parallel search among them to bring them into the focus (see also Hockley, 1984; see review by McElree, 2006). What are the neural underpinnings of STM retrieval, and does it differ—if at all—from LTM retrieval? As described earlier, similar activations in posterior perceptual regions support the idea that STM and LTM both operate on the same neural representations—that is, that they are similar in structure with regard to how representations are stored. STM and LTM may also be similar in process, at least when it comes to retrieval: The retrieval processes described for STM are isomorphic with those posited for LTM (e.g., Anderson et al., 2004; Gillund & Shiffrin, 1984; Murdock, 1982; Plaut, 1997). This isomorphism logically follows from the idea that out-of-focus STM representations are simply LTM representations in a special state of activation, which may include short-term plasticity. This unification is one of the principal theoretical virtues of the recent STM retrieval models such as those championed by McElree (2006). Extensive studies have delineated a network of medial temporal lobe (MTL) regions, lateral prefrontal regions, and anterior prefrontal regions active in long-term retrieval tasks (e.g., Buckner, Koutstaal, Schacter, Wagner, & Rosen, 1998; Cabeza & Nyberg, 2000; Fletcher & Henson, 2001). As described earlier, although MTL structures were originally thought not to play a role in STM, recent work has shown that they come into play when the task demands remembering novel information or making associations, regardless of the time scale.
Like the MTL, the frontal cortex is used similarly in retrieval for STM and LTM, as evidenced in numerous neuroimaging studies (see Figure 29.8). For example, event-related studies of a standard STM probe-recognition task find activations in lateral prefrontal regions (e.g., D’Esposito, Postle, Jonides, & Smith, 1999; D’Esposito & Postle, 2000) and anterior prefrontal regions (Badre & Wagner, 2005) often implicated in LTM retrieval. Some of these studies used retention intervals that were somewhat longer than those of the typical behavioral STM task, making them vulnerable to the criticism that the activations in fact represented LTM retrieval. However, a meta-analysis of studies that involved bringing very recently presented items into the focus of attention likewise found specific involvement of the lateral and anterior prefrontal cortex (Johnson et al., 2005). Therefore, these regions appear to be involved in retrieval, regardless of time scale.

Even stronger evidence derives from recent imaging studies that directly compare short- versus long-term retrieval tasks using within-subjects designs. The two types of tasks activate highly overlapping regions in the dorsolateral, ventrolateral, and anterior prefrontal cortex (Cabeza, Dolcos, Graham, & Nyberg, 2002; Ranganath, Johnson, & D’Esposito, 2003; Talmi, Grady, Goshen-Gottstein, & Moscovitch, 2005). In some cases, STM and LTM tasks involve the same regions but differ in the relative amount of activation shown within these regions. For example, Cabeza et al. (2002) reported similar engagement of medial temporal regions in both types of task, but greater anterior and ventrolateral activation in the long-term episodic tasks. However, Talmi et al. (2005) reported greater activation in both medial temporal and lateral frontal cortices for recognition probes of the earliest items in a 12-item list (where LTM would be more prominent) versus the last or second-to-last items (where STM would be more prominent). This discrepancy might be explained if items at the end of the list were still in the focus of attention, and thus did not require cue-based retrieval processes. Notably, the end-of-list items preceded the probe by less than 2 seconds, within the time span classically suggested for verbal STM (e.g., Baddeley et al., 1975).

Figure 29.8 (Figure C.31 in color section) Overlap between regions involved in short-term memory (red) and long-term memory (blue). Labeled regions: BA 47/45, Visual Ctx, BA 45, HC/pHC. Note: From “Similarities and Differences in the Neural Correlates of Episodic Memory Retrieval and Working Memory,” by R. Cabeza, F. Dolcos, R. Graham, and L. Nyberg, 2002, NeuroImage, 16, pp. 317–330. Adapted with permission.

Summary
The bulk of the neuroimaging evidence points to the conclusion that the recruitment of frontal and medial temporal regions depends on whether the information is currently in or out of focus, not whether the task nominally tests short or long time spans (see Sakai, 2003, for a more extensive review), because these regions are not involved in memory storage per se. Thus, MTL regions will increase activation for items in a typical LTM or STM task if these items have fallen out of the focus of attention and their functional context must be retrieved. Likewise, frontal regions will increase activation when any item (from an LTM or STM task) is being retrieved into the focus of attention during rehearsal or in preparation to make a response. Frontal regions are thought to perform several operations during retrieval, including initiating retrieval, accessing stored representations, and selecting among competing representations (Badre & Wagner, 2007; Sakai, 2003).

Relationship of STM Processes to Rehearsal
Rehearsal intuitively seems like the prototypical STM process. However, many formal and computational theories of STM exclude rehearsal from their list of core processes (e.g., Anderson & Matessa, 1997; Burgess & Hitch, 2006; Meyer & Kieras, 1997). Cowan (2000) describes evidence that first-grade children do not use verbal rehearsal strategies, but nevertheless have measurable STM capacities. In fact, Cowan (2000) uses young children’s failure to use rehearsal to argue that their performance is indicative of the fundamental capacity limits of STM.
We take the view that rehearsal is simply a controlled sequence of retrievals and re-encodings of items into the focus of attention (cf. Baddeley, 1986; Cowan, 1995). The theoretical force of this idea becomes apparent when it is coupled with our other assumptions about the structures and processes of the underlying STM architecture. We briefly sketch here two interesting sets of empirical predictions that follow from this view. When coupled with the idea of a single-item focus, the assumption that rehearsal is a sequence of retrievals into the focus makes a clear prediction: A just-rehearsed item should display the same retrieval dynamics as a just-perceived item. This prediction was directly tested by
McElree (2006) using a version of his response-deadline recognition task. A retention interval occurred between the list presentation and the probe, and subjects were trained to rehearse the list at a particular rate during that interval. Knowing the rate at which subjects rehearsed made it possible to know when each item was rehearsed, and thus when it was hypothetically re-established in the focus. The results were compelling: A just-rehearsed item showed the same fast retrieval dynamics that typify a just-perceived item in experiments without a retention interval (see previous section). In other words, the difference in speed-accuracy tradeoff functions for in-focus versus out-of-focus items was apparent regardless of whether the dichotomy was established by internally controlled rehearsal or externally controlled perception.

The assumption that rehearsal is a controlled strategy also yields interesting predictions. If rehearsal is the controlled composition of more primitive STM processes, then rehearsal should activate the same brain circuits as the primitive processes, along with additional (frontal) circuits associated with their control. In other words, there should be overlap of rehearsal with brain areas subserving retrieval and initial perceptual encoding. Likewise, there should be control areas distinct from those of the primitive processes. Both predictions receive support from neuroimaging studies. The first prediction is broadly confirmed: There is now considerable evidence for the re-activation of areas associated with initial perceptual encoding in tasks that require rehearsal (see Jonides, Lacey, & Nee, 2005, for a recent review; note also that there is evidence for reactivation in LTM retrieval: Wheeler et al., 2000, 2006). The second prediction, that rehearsal engages additional control areas beyond those participating in maintenance, encoding, and retrieval, receives support from two effects. One is that verbal rehearsal engages a set of frontal structures associated with articulation and its planning: supplementary motor, premotor, inferior frontal, and posterior parietal areas (e.g., Chein & Fiez, 2001; Jonides, Smith, Marshuetz, Koeppe, & Reuter-Lorenz, 1998; Smith & Jonides, 1999). The other is that spatial rehearsal engages attentionally mediated occipital regions, suggesting rehearsal processes that include retrieval of spatial information (Awh, Jonides, & Reuter-Lorenz, 1998; Awh & Jonides, 2001).

Summary
There is substantial evidence supporting the idea that rehearsal is a process composed of more fundamental STM processes, namely retrieval and encoding. In addition, a just-perceived item is functionally equivalent to a just-rehearsed item, showing that the focus of attention has similar properties in these two cases.
WHY DO WE FORGET?
Forgetting in STM is a vexing problem: What accounts for failures to retrieve something encoded just seconds ago? There are two major explanations for forgetting, often placed in opposition: time-based decay and similarity-based interference. Next, we describe some of the major findings in the literature related to each of these explanations, and we suggest that they may ultimately result from the same underlying principles.

Decay Theories: Intuitive but Problematic
The central claim of decay theory is that as time passes, information in memory erodes, and so it is less available for later retrieval. This explanation has strong intuitive appeal. However, theories of STM that rely on decay to explain forgetting face two strong criticisms. First, experiments attempting to demonstrate decay can seldom eliminate confounds and alternative explanations. Second, most psychological theories that posit decay do not include a mechanistic explanation of how it might occur. Without such an explanation, it is difficult to see decay theories as any more than a restatement of the phenomenon. Next we review the debates on this issue and ultimately suggest a possible mechanism for decay.

Retention-Interval Confounds: Controlling for Rehearsal and Interference
The classic Brown-Peterson procedure (J. Brown, 1958; Peterson & Peterson, 1959) illustrates many of the difficulties in providing evidence for decay. In this procedure, participants were asked to learn consonant trigrams (e.g., DPW). Each trigram was followed by a retention interval during which participants counted backward to prevent rehearsal, followed by their attempt to recall the trigram. Performance decreased as retention interval increased, apparently providing good evidence for time-based decay. However, Keppel and Underwood (1962) showed that almost no forgetting occurs for the earliest trials, regardless of the retention interval. The effects of the retention interval became apparent only after several trials had passed, suggesting that proactive interference from previous memoranda was the major mechanism of forgetting, and that it was this influence that increased over time.

A major problem in testing decay theories is controlling for what occurs during the retention interval, especially with human subjects. One common method is to attempt to prevent rehearsal by requiring subjects to perform another attention-demanding task during the interval, for example, requiring them to count backward during the Brown-Peterson task. However, the difficulty of the
retention-interval task does not appear to influence the amount of forgetting that occurs, raising the possibility that the retention-interval task relies on a different resource pool than does the primary memory task, and thus may not ultimately be effective in preventing rehearsal (Roediger, Knight, & Kantowitz, 1977). Another problem is that most tasks that fill the retention interval require subjects to use STM. This could lead to active displacement of items from the focus of attention (e.g., McElree, 2001). Thus, the problem with retention-interval tasks is that their effectiveness in preventing rehearsal of the to-be-remembered information is questionable, and they also introduce new, distracting information that may engage STM.

Several attempts have been made to escape the rehearsal conundrum by using stimuli that are not easily converted to verbal codes (e.g., pure tones; Harris, 1952) or by varying the retention interval during implicit memory procedures, where participants do not know that their memory is being tested, and so they would have no reason to rehearse (McKone, 1995). These experiments provide some of the best behavioral evidence for decay, although they are still somewhat vulnerable to Keppel and Underwood’s (1962) criticism about prior trials. Another potential problem is that even if they are not deliberately rehearsing, participants’ brains and minds are not inactive during the retention interval (Raichle et al., 2001). There is increasing evidence that the processes ongoing during nominal “resting states” are related to memory, including STM (Hampson, Driesen, Skudlarski, Gore, & Constable, 2006). Spontaneous retrieval of other memories during the retention interval could interfere with memory for the experimental items. So, although experiments that reduce the influence of rehearsal provide some of the best evidence of decay, they are not definitive.

What Happens Neurally during the Delay?
Stimulus-associated neural activity usually declines during a retention interval. This decline seems like a prime candidate for a mechanism of decay. However, it has been more difficult than expected to show a relation between reduced neural activity and reduced memory. Single-cell results like those of Fuster (1973, 1995) are often cited as evidence for decay. In monkeys performing a delayed-response task, delay-period activity in inferotemporal cortex steadily declined over 18 seconds (see also, Pasternak & Greenlee, 2005). At a molar level, human neuroimaging studies often show delay-period activity in prefrontal and posterior regions, and this activity is often thought to support maintenance or storage (see review by Smith & Jonides, 1999). As reviewed earlier, it is likely that the posterior regions support storage, and that
frontal regions support processes related to interference-resolution, control, attention, response preparation, motivation, and reward. Consistent with the primate data, Jha and McCarthy (2000) found a general decline in activation in posterior regions over a delay period, which suggests some neural evidence for decay. However, this decline in activation was not obviously related to performance, which suggests two (not mutually exclusive) possibilities: (1) the decline in activation was not representative of decay, so it did not correlate with performance; or (2) these regions might not have been storage regions (but see Todd & Marois, 2004; Xu & Chun, 2006, for evidence more supportive of load sensitivity in posterior regions). The idea that neural activity decays also faces a serious challenge from the classic results of Malmo (1942), who found that a monkey with frontal lesions was able to perform a delayed-response task extremely well (97% correct) if visual stimulation and motor movement (and therefore associated interference) were restricted during a 10-second delay. By contrast, in unrestricted conditions, performance was as low as 25% correct (see also, D’Esposito & Postle, 1999; Postle & D’Esposito, 1999). In summary, evidence for time-based declines in neural activity that would naturally be thought to be part of a decay process is mixed.

Is There a Mechanism for Decay?
At least two key empirical results (Harris, 1952; McKone, 1998) do seem to implicate some kind of time-dependent decay. If one assumes that decay happens, how might it occur? One possibility, perhaps most compatible with results like those of Malmo (1942), is that what changes over time is not the integrity of the representation itself, but the likelihood that attention will be attracted away from it. This explanation is also compatible with the focus-of-attention view of STM. By this explanation, the representation within the focus does not decay.
However, as more time passes, there is a greater likelihood that attention is attracted away from this representation and toward external stimuli or other memories. For recently presented items outside of the focus, decay may occur because of stochastic variability in the activity of the neurons that make up an item’s representation. The temporal synchronization of neuronal activity is an important part of the representation (e.g., Deiber et al., 2007; Jensen, 2006; Lisman & Idiart, 1995), and it is possible that being in the focus helps to maintain this synchrony. As time out of the focus increases, variability in the firing rates of individual neurons may cause them to fall increasingly out of synchrony, unless they are reset by rehearsal. By this hypothesis, as the neurons fall out of synchrony,
the pattern that forms the representation becomes increasingly difficult to discriminate from surrounding neural noise. See Lustig, Matell, and Meck (2005) for an example that integrates neural findings with computational (Frank, Loughry, & O’Reilly, 2001) and behaviorally based (G. D. A. Brown, Preece, & Hulme, 2000) models of STM.

Interference Theories: Comprehensive but Complex
Interference effects play several roles in memory theory: First, they are the dominant explanation of forgetting. Second, some have suggested that STM capacity and its variation among individuals are largely determined by the ability to overcome interference (e.g., Hasher & Zacks, 1988; Unsworth & Engle, 2007). Finally, differential interference effects in short- and long-term memory have been used to justify the idea that they are separate systems, and common interference effects have been used to justify the idea that they are a unitary system. Interference theory has a problem opposite that of decay: It is comprehensive but complex (Crowder, 1976). The basic principles are straightforward. Items in memory compete, with the amount of interference determined by the similarity, number, and strength of the competitors. The complexity stems from the fact that interference may occur at multiple stages (encoding, retrieval, and possibly storage) and at multiple levels (the representation itself, or its association with a cue or a response). Interference from the past (proactive interference, PI) may affect both the encoding and the retrieval of new items, and it often increases over time. By contrast, interference from new items onto older memories (retroactive interference, RI) frequently decreases over time, and may not be as reliant on similarity (see discussion by Wixted, 2004).

Retrieval Interference
It can be difficult to select between items that are similar to each other.
For example, if participants learn and recall four lists from the same category (e.g., flowers), recall performance shows typical PI effects: decreasing performance across the lists. However, if the category of the fourth list is changed, even subtly (e.g., wildflowers), memory for this list can be nearly as high as on the very first trial (Wickens, 1970). Importantly, this “release from PI” occurs even if the subject is only made aware of the category shift after the list has been learned (Gardiner, Craik, & Birtwistle, 1972). This suggests that the effects of category change occur largely at retrieval, by helping participants differentiate and thus select recent-list items from others in memory. Selection and retrieval processes remain an important topic in interference research. Functional neuroimaging
studies consistently identify a region in left inferior frontal gyrus (LIFG) as active during interference-resolution, at least for verbal materials (see the review by Jonides & Nee, 2006). This region appears to be generally important for selection among competing alternatives, for example, in semantic memory as well as in STM (Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997). In STM, LIFG is most prominent during the test phase of interference trials, and its activation during this phase often correlates with behavioral measures of interference-resolution (D’Esposito et al., 1999; Jonides et al., 1998; Reuter-Lorenz et al., 2000; Thompson-Schill et al., 2002). These findings attest to the importance of processes for resolving retrieval interference. The commonality of the neural substrate for interference-resolution across short-term and long-term tasks provides yet further support for the hypothesis of shared retrieval processes for the two types of memory.

Interference effects occur at multiple levels, and it is important to distinguish between interference at the level of representations and interference at the level of responses. The LIFG effects described previously appear to be familiarity-based and to occur at the level of representations. Items on a current trial must be distinguished and selected from among items on previous trials that are familiar because of prior exposure, but are currently incorrect. A separate contribution occurs at the level of responses: An item associated with a positive response on a prior trial may now be associated with a negative response, or vice versa. This response-based conflict can be separated from the familiarity-based conflict, and its resolution appears to relate more to activity in the anterior cingulate (Nelson, Reuter-Lorenz, Sylvester, Jonides, & Smith, 2003; see Figure 29.9).
Other Mechanisms for Interference Effects
Many studies examining encoding in STM have focused on retroactive interference (RI): how new information disrupts previous memories. Early theorists described this disruption in terms of displacement of entire items from STM, perhaps by disrupting consolidation (e.g., Waugh & Norman, 1965). However, rapid serial visual presentation (RSVP) studies suggest that this type of consolidation is complete within a very short time: less than 500 ms, and in some situations as short as 50 ms (Vogel, Woodman, & Luck, 2006). What about interference effects beyond this time window? As reviewed previously, most current focus-based models implicitly assume something like whole-item displacement is at work. It is not clear how these models account for similarity-based interference. Two recent models (Nairne, 2002; Oberauer, 2006) suggest a possible modification.
[Figure 29.9. Panel A: brain slices (labeled z 10 and z 30 in the extracted text) marking the IFG and mPFC/ACC regions of interest. Panel B: bar chart of average t-score in IFG and mPFC for the familiarity-conflict and response-conflict conditions.]
Figure 29.9 Dissociations between stimulus (familiarity) and response-based conflict. Note: A: mPFC/ACC region of interest is associated with response-conflict, and IFG is associated with familiarity-based conflict. B: Average t-scores in the IFG and mPFC for familiarity- and response-conflict conditions. From “Dissociable Neural Mechanisms Underlying Response-Based and Familiarity-Based Conflict in Working Memory,” by J. K. Nelson, P. A. Reuter-Lorenz, C. Y. C. Sylvester, J. Jonides, and E. E. Smith, 2003, Proceedings of the National Academy of Sciences, 100, pp. 11171–11175. Reprinted with permission.
Rather than a competition at the item level for a single-focus resource, these models posit a lower-level similarity-based competition for “feature units.” By this idea, representations are composed of bundles of features (e.g., color, shape, spatial location, temporal location), which are in turn distributed over multiple units. The more two items overlap, the more they compete for these feature units, resulting in greater interference. This proposed mechanism fits well with the idea that perception, STM, and LTM rely on representations that are distributed throughout sensory, semantic, and motor cortex (Postle, 2006). As we describe next, it is also congruent with the stochastic mechanism we suggested earlier for decay.

Interference-Based Decay
The mechanism we earlier proposed for decay is based on the idea that stochastic variability causes the neurons making up a representation to fall out of synchrony. Using the terminology of Nairne (2002) and Oberauer (2006), the feature units become less tightly bound. Feature units that are not part of a representation also show some random activity due to their own stochastic variability, creating a noise distribution. Over time, there is an increasing likelihood that the feature units making up the to-be-remembered item’s representation will overlap with those of the noise distribution, making them increasingly difficult to distinguish. This increasing overlap with the noise distribution and loss of feature binding could lead to the smooth forgetting functions often interpreted as evidence for decay. Such a mechanism could also account for strength-of-learning effects and similarity-based interference. Poorly learned items might have fewer differentiating features and be less tightly bound, thus making their representations
more difficult to discriminate from the noise distribution to begin with, and faster to lose their integrity by falling out of synchrony. McKone (1998) found that nonwords decayed faster than words, and were also more susceptible to interference. Similarity-based interference could occur because of competition between representations for control over shared feature units, increasing the rate at which any given representation would lose integrity. In summary, although it is still speculative, this model of neural representations and how they change over time due to intrinsic variability in neuronal activity supplies a unified mechanism for interference and decay.
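The proposed mechanism can be made concrete with a small simulation. What follows is a minimal sketch, not a model taken from the literature cited here. It assumes that each neuron in a representation can be reduced to a single phase value, that intrinsic variability is Gaussian phase noise added at each time step, and that rehearsal resets all phases to a common value. The function names, the parameter values, and the use of mean resultant length as the synchrony measure are all illustrative assumptions.

```python
import cmath
import random


def synchrony(phases):
    """Mean resultant length of a set of phases.

    1.0 means all units are perfectly in phase; values near 0 mean the
    pattern is no longer distinguishable from unsynchronized noise.
    """
    total = sum(cmath.exp(1j * p) for p in phases)
    return abs(total) / len(phases)


def simulate_drift(n_neurons=50, n_steps=200, noise_sd=0.3,
                   rehearse_every=None, seed=0):
    """Track synchrony of one representation while it is out of the focus.

    All parameters are hypothetical. If rehearse_every is given, every
    rehearse_every-th step resets the phases, standing in for retrieval
    of the item back into the focus of attention.
    """
    rng = random.Random(seed)
    phases = [0.0] * n_neurons           # item starts in the focus: in synchrony
    trace = [synchrony(phases)]
    for step in range(1, n_steps + 1):
        if rehearse_every and step % rehearse_every == 0:
            phases = [0.0] * n_neurons   # rehearsal re-establishes synchrony
        else:
            # each unit drifts independently (stochastic variability)
            phases = [p + rng.gauss(0.0, noise_sd) for p in phases]
        trace.append(synchrony(phases))
    return trace


unrehearsed = simulate_drift()
rehearsed = simulate_drift(rehearse_every=10)
```

Under these assumptions, synchrony starts at 1.0 and falls toward the level of unsynchronized noise unless rehearsal intervenes, which is the qualitative pattern the account requires. Similarity-based interference is not modeled in this sketch.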
SKETCH OF SHORT-TERM MEMORY AT WORK
In our review thus far, we brought together the literature on behavioral and neuroscience data concerned with short-term memory. Here, we sketch out how this integration might work on a moment-to-moment basis throughout a typical STM task: an N-back probe recognition task. Figure 29.10 illustrates the task events in terms of the stimulus display and the subject’s response. The stages of the task are displayed in a more abstract form in Figure 29.11, with the task events at the bottom of the figure and the putative cognitive events at the top. In the task, the participant sees a letter presented for 700 ms and must respond “yes” if the letter matches a letter seen 4th-back, and respond “no” if the letter does not match the letter 4th-back. Therefore, in this task, the participant must actively maintain four items to match the current probe against the 4th-back item. The participant must also keep track of the other items for future trials.
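The response rule of the task can be stated compactly. The sketch below assumes a hypothetical helper, `n_back_responses`, and uses the letter sequence from the example in Figure 29.10 (B, D, M, J, B) with N = 4. It describes only which responses are correct, not the cognitive processes that produce them.

```python
def n_back_responses(stimuli, n):
    """Return the correct "yes"/"no" response for each stimulus.

    A stimulus warrants "yes" only if an item was shown n positions
    earlier and that item is identical to the current one.
    """
    responses = []
    for i, letter in enumerate(stimuli):
        is_match = i >= n and stimuli[i - n] == letter
        responses.append("yes" if is_match else "no")
    return responses


# Sequence from the worked example: the fifth letter matches the first.
example = n_back_responses(["B", "D", "M", "J", "B"], n=4)
# example == ["no", "no", "no", "no", "yes"]
```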
[Figure 29.10 display: the letters B, D, M, J, B are presented in sequence; the participant responds “No” to each of the first four letters and “Yes” to the final B, which matches the 4th-back item.]
Figure 29.10 Sample STM task: The N-Back task (where N = 4).

[Figure 29.11 panels:]
1. Encode 1st item ‘B’ in the focus.
2. Encode 2nd item ‘D’ in the focus. Item ‘B’ moves into the region of Direct Access, but is brought back into the focus periodically with rehearsal.
3. Encode 3rd item ‘M’ in the focus. Item ‘D’ moves into the region of Direct Access and has many overlapping features with item ‘B’.
4. Encode 4th item ‘J’ in the focus. Item ‘M’ moves into the region of Direct Access, but does not share as many features with items ‘B’ and ‘D’.
5. Encode the 5th item ‘B’ into the focus. This item is the cue and warrants a ‘yes’ response, unlike all the other stimuli that necessitated ‘no’ responses. Item ‘J’ moves into the region of Direct Access, but does not share many features with the other items.
6. The cue ‘B’ (i.e., the 5th item) matches item ‘B’ (i.e., the 1st item), but does not match any of the other items. Therefore, this item is the item selected from the cue-based parallel retrieval, which brings item 1 back into the focus to form one functional context and warrants a yes response. In addition, item ‘D’ may be retrieved incorrectly from time to time because it is similar to the first item ‘B’ both in visual features and in temporal features (both occurred around the same time).
Note: The rehearsal shown in panel 2 occurs for all the items, but is illustrated only for item 1 (i.e., ‘B’).
Figure 29.11 Model of STM performing the N-Back task.

We adopt the STM architecture of Oberauer (2002) along with our elaboration of the processes involved in STM and forgetting to explain how this task would be accomplished. To reiterate, the focus of attention consists of frontal and parietal regions biasing posterior cortical areas that are involved in perception and storage of LTM representations. Items outside the focus are either in a highly accessible state, or are more dormant in the posterior representational cortices.

In Figure 29.10, the participant first encodes the letter B, activating posterior perceptual areas, presumably in left inferior temporal cortex (Polk et al., 2002). This item moves into the focus of attention, and the MTL initiates its contextual binding. B does not match the 4th-back item, so the subject responds “No.” Next, the letter D is shown, which displaces the letter B from the focus of attention, and D is encoded using the same process as described for B. The participant must still keep the letter B active, even though it is not the focus of attention, and thus it remains in a highly active state outside the focus of attention. MTL activation persists for B, maintaining the item’s context; however, this activation is greatly decreased due to stochastic drift as outlined previously, possibly leading to decay of the representation. This process continues for the rest of the items. The participant responds with “No” to the first four items (B, D, M, and J) because they do not match the letter 4th-back. Moreover, each item displaces the previous item from the focus of attention. In addition, throughout the task, the participant rehearses the items, which periodically brings the items that are out of the focus back into the focus as illustrated in panel 2 of Figure 29.11. Frontal and parietal areas increase biasing in the posterior regions to retrieve these items back into the focus during rehearsal. Rehearsal itself is mediated by premotor and inferior frontal gyrus regions. Finally, the participant gets the letter B, which does match the item 4th-back.

Before we move on to the retrieval process, notice the following depicted in Figure 29.11. First, the representations of items 1 and 2 (B and D) overlap due to their featural and contextual similarities (shape, phonology, and temporal context). Second, items 3 and 4 (M and J) are much farther away from items 1 and 2. This is because these items do not share many features with items 1 and 2. In addition, item 3 is no longer within the activated/highly available state. This item has suffered from proactive interference from items 1 and 2 and retroactive interference from item 4.
In addition to this interference, this item has also lost representational fidelity due to stochastic decline in neural firing when it was not in focus. When the cue letter B is presented, the participant performs a cue-based retrieval of that item. The cue best matches item 1, but it also may be subject to some similarity-based interference from item 2, which could induce an incorrect response or delay the correct response, “Yes.” There is also similarity-based interference from items 3 and 4, but this interference is much weaker. Item 1 is then brought back into the focus, replacing the cue, and the participant responds affirmatively.
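The cue-based retrieval step described here can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions: each item is reduced to a small set of feature labels, and match strength is the Jaccard overlap between the cue's features and each stored item's features. The feature labels, the overlap measure, and the names `overlap` and `rank_candidates` are hypothetical choices made for illustration and are not drawn from the models cited in this chapter.

```python
def overlap(a, b):
    """Jaccard overlap between two feature sets (0.0 disjoint, 1.0 identical)."""
    return len(a & b) / len(a | b)


def rank_candidates(cue, items):
    """Return (name, overlap) pairs sorted with the best match to the cue first."""
    scores = [(name, overlap(cue, feats)) for name, feats in items.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature bundles for the four stored items. B and D are given
# shared shape, sound, and temporal features; M and J share few with them.
ITEMS = {
    "item 1 (B)": {"letter:B", "curved", "vertical-stroke", "ee-sound", "early-in-list"},
    "item 2 (D)": {"letter:D", "curved", "vertical-stroke", "ee-sound", "early-in-list"},
    "item 3 (M)": {"letter:M", "angular", "vertical-stroke", "late-in-list"},
    "item 4 (J)": {"letter:J", "curved", "hook", "late-in-list"},
}

# The cue is the 5th letter, B, perceived in the current temporal context.
CUE = {"letter:B", "curved", "vertical-stroke", "ee-sound", "current"}

ranking = rank_candidates(CUE, ITEMS)
```

With these made-up features, item 1 is the best match, item 2 is a clear runner-up, and items 3 and 4 trail well behind. That ordering mirrors the description above of strong interference from item 2 and much weaker interference from items 3 and 4.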
SUMMARY
Let us step away from this particular example and take stock of what we now know about short-term memory, both the psychological and the neural mechanisms. Our review of the structure, processes, and forgetting mechanisms of STM leads us to the following synthesis of the facts of the matter. It is this synthesis with which we close.
Psychological Mechanisms of Short-Term Memory
• The core of short-term memory is a focus of attention containing a single functional context and the items bound within it.
• The representations that the focus of attention operates on in STM are isomorphic with those that form the basis of initial perception and storage in LTM.
• These focused representations consist of bundles of features for stored information. Those features can include those that tie an item to its functional context (for example, serial order, time, or location) and novel relations among familiar items.
• Representations enter the focus of attention via perceptual encoding or via cue-based retrieval from LTM.
• Controlled, active maintenance processes are required to keep a representation in the focus, especially in the face of other distracting material.
• Rehearsal is not a core STM maintenance process, in that it does not keep a representation consistently within the focus. Instead, it consists of controlled but sequential retrieval of highly activated but out-of-focus LTM representations into the focus.
• Forgetting occurs when the fidelity of a representation declines over time due to stochastic processes (“pure” decay), or because of similarity-based competition between representations for features (interference-based decay). Similarity also influences competition between representations for the focus of attention (retrieval or selection-based interference).

Neural Mechanisms of Short-Term Memory
• Frontal and parietal systems mediate the control of the focus of attention by their connections with and modulations of activity in posterior regions that represent the features of representations within the focus.
• The (largely posterior) systems that represent item features for perception, action, or LTM storage also represent those features for STM. Items within the focus of attention are represented by patterns of heightened, synchronized firing of neurons in these (verbal, spatial, motor, etc.) regions.
• Medial temporal structures are important for binding items to their context (including information about time and spatial location), and for retrieving items whose context is no longer in the focus of attention (an STM function) or fully consolidated in neocortex (an episodic LTM function).
• The inherent variability of neuronal activity may contribute to the loss of integrity of neural representations, and thus lead to forgetting.

REFERENCES
Anderson, J. R. (1983). Retrieval of information from long-term memory. Science, 200, 25–30. Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of mind. Psychological Review, 111, 1036–1060. Anderson, J. R., & Matessa, M. (1997). A production system theory of serial memory. Psychological Review, 104, 728–748. Atkinson, R. C., & Shiffrin, R. M. (1971). The control of short-term memory. Scientific American, 224, 82–90. Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed number of items regardless of complexity. Psychological Science, 18, 622–628. Awh, E., & Jonides, J. (2001). Overlapping mechanisms of attention and spatial working memory. Trends in Cognitive Science, 5, 119–126. Awh, E., Jonides, J., & Reuter-Lorenz, P. A. (1998). Rehearsal in spatial working memory. Journal of Experimental Psychology: Human Perception and Performance, 24, 780–790. Awh, E., Jonides, J., Smith, E. E., Buxton, R. B., Frank, L. R., Love, T., et al. (1999). Rehearsal in spatial working memory: Evidence from neuroimaging. Psychological Science, 10, 433–437. Awh, E., Jonides, J., Smith, E. E., Schumacher, E. H., Koeppe, R. A., & Katz, S. (1996). Dissociation of storage and rehearsal in verbal working memory: Evidence from PET. Psychological Science, 7, 25–31. Baddeley, A. D. (1986). Working memory. Oxford, England: Clarendon Press. Baddeley, A. D. (1992). Working memory. Science, 225, 556–559. Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4, 417–423. Baddeley, A. D. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience, 4, 829–839. Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. A. Bower (Ed.), Recent advances in learning and motivation (Vol. 8, pp. 47–90). New York: Academic Press.
Baddeley, A. D., Thomson, N., & Buchanan, M. (1975). Word length and structure of short-term memory. Journal of Verbal Learning and Verbal Behavior, 14, 575–589.
den Heyer, K., & Barrett, B. (1971). Selective loss of visual and verbal information in STM by means of visual and verbal interpolated tasks. Psychonomic Science, 25, 100–102.
Baddeley, A. D., & Warrington, E. K. (1970). Amnesia and the distinction between long- and short-term memory. Journal of Verbal Learning and Verbal Behavior, 9, 176–189.
D’Esposito, M., & Postle, B. R. (1999). The dependence of span and delayed-response performance on the prefrontal cortex. Neuropsychologia, 37, 1303–1315.
Badre, D., & Wagner, A. D. (2005). Frontal lobe mechanisms that resolve proactive interference. Cerebral Cortex, 15, 2003–2012.
D’Esposito, M., & Postle, B. R. (2000). Neural correlates of processes contributing to working-memory function: Evidence from neuropsychological and pharmacological studies. In S. Monsell & J. Driver (Eds.), Control of cognitive processes (pp. 580–602). Cambridge, MA: MIT Press.
Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45, 2883–2901. Braver, T. S., Barch, D. M., Kelley, W. M., Buckner, R. L., Cohen, N. J., Miezin, F. M., et al. (2001). Direct comparison of prefrontal cortex regions engaged by working and long-term memory tasks. NeuroImage, 14, 48–59. Brooks, L. R. (1968). Spatial and verbal components of the act of recall. Canadian Journal of Psychology, 22, 349–368. Brown, G. D. A., Preece, T., & Hulme, C. (2000). Oscillator-based memory for serial order. Psychological Review, 107, 127–181. Brown, J. (1958). Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology, 10, 12–21. Buckner, R. L., Koutstaal, W., Schacter, D. L., Wagner, A. D., & Rosen, B. R. (1998). Functional-anatomic study of episodic retrieval using fMRI: Pt. I. Retrieval effort versus retrieval success. NeuroImage, 7, 151–162. Burgess, N., & Hitch, G. J. (1999). Memory for order: A network model of the phonological loop and its timing. Psychological Review, 106, 551–581. Burgess, N., & Hitch, G. J. (2005). Computational models of working memory: Putting long-term memory into context. Trends in Cognitive Sciences, 9, 535–541. Burgess, N., & Hitch, G. J. (2006). A revised model of short-term memory and long-term learning of verbal sequences. Journal of Memory and Language, 55, 627–652. Cabeza, R., Dolcos, F., Graham, R., & Nyberg, L. (2002). Similarities and differences in the neural correlates of episodic memory retrieval and working memory. NeuroImage, 16, 317–330. Cabeza, R., & Nyberg, L. (2000). Imaging cognition: Pt. II. An empirical review of 275 PET and fMRI studies. Journal of Cognitive Neuroscience, 9, 254–265. Chein, J. M., & Fiez, J. A. (2001). Dissociation of verbal working memory systems components using a delayed serial recall task. Cerebral Cortex, 11, 1003–1014. Cowan, N. (1988). 
Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information processing system. Psychological Bulletin, 104, 163–191. Cowan, N. (1995). Attention and memory: An integrated framework. New York: Oxford University Press. Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87–185. Crowder, R. (1976). Principles of learning and memory. Hillsdale, NJ: Erlbaum.
Destexhe, A., & Contreras, D. (2006). Neuronal computation with stochastic network states. Science, 314, 85–90. Duncan, J., & Owen, A. M. (2000). Common regions of the human frontal lobe recruited by diverse cognitive demands. Trends in Neuroscience, 23, 475–483. Eng, H. Y., Chen, D. Y., & Jiang, Y. H. (2005). Visual working memory for simple and complex visual stimuli. Psychonomic Bulletin and Review, 12, 1127–1133. Fletcher, P. C., & Henson, R. N. A. (2001). Frontal lobes and human memory: Insights from functional neuroimaging. Brain, 124, 849–881. Frank, M. J., Loughry, B., & O’Reilly, R. C. (2001). Interactions between the frontal cortex and basal ganglia in working memory: A computational model. Cognitive, Affective, and Behavioral Neuroscience, 1, 137–160. Funahashi, S., Bruce, C. J., & Goldman-Rakic, P. S. (1989). Mnemonic coding of visual space in the monkey’s dorsolateral prefrontal cortex. Journal of Neurophysiology, 61, 331–349. Fuster, J. M. (1973). Unit activity in prefrontal cortex during delayed response performance: Neuronal correlates of transient memory. Journal of Neurophysiology, 36, 61–78. Fuster, J. M. (1995). Memory in the cerebral cortex. Cambridge, MA: MIT Press. Fuster, J. M. (2001). The prefrontal cortex: An update. Time is of the essence. Neuron, 30, 319–333. Garavan, H. (1998). Serial attention within working memory. Memory and Cognition, 26, 263–276. Gardiner, J. M., Craik, F. I. M., & Birtwistle, J. (1972). Retrieval cues and release from proactive inhibition. Journal of Verbal Learning and Verbal Behavior, 11, 778–783. Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1–67. Goldman-Rakic, P. S. (1987). Circuitry of primate pre-frontal cortex and regulation of behavior by representational memory. In F. Plum (Ed.), Handbook of physiology: The nervous system (Vol. 5, pp. 373–417). Bethesda, MD: American Physiological Society. Hampson, M., Driesen, N. 
R., Skudlarski, P., Gore, J. C., & Constable, R. T. (2006). Brain connectivity related to working memory performance. Journal of Neuroscience, 26, 13338–13343.
Damasio, A. R. (1989). Time-locked multiregional retroactivation: A system-level proposal for the neuronal substrates of recall and recognition. Cognition, 33, 25–62.
Harris, J. D. (1952). The decline of pitch discrimination with time. Journal of Experimental Psychology, 43, 96–99.
Darwin, C. J., Turvey, M. T., & Crowder, R. G. (1972). Auditory analogue of Sperling partial report procedure: Evidence for brief auditory storage. Cognitive Psychology, 3, 255–267.
Hasher, L., & Zacks, R. T. (1988). Working memory, comprehension, and aging: A review and a new view. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 193–225). New York: Academic Press.
Deiber, M. P., Missonnier, P., Bertrand, O., Gold, G., Fazio-Costa, L., Ibanez, V., et al. (2007). Distinction between perceptual and attentional processing in working memory tasks: A study of phase-locked and induced oscillatory brain dynamics. Journal of Cognitive Neuroscience, 19, 158–172.
D’Esposito, M., Postle, B. R., Jonides, J., Smith, E. E., & Lease, J. (1999). The neural substrate and temporal dynamics of interference effects in working memory as revealed by event-related fMRI. Proceedings of the National Academy of Sciences, USA, 96, 7514–7519.
Hebb, D. O. (1949). The organization of behavior. New York: Wiley. Hockley, W. E. (1984). Analysis of response-time distributions in the study of cognitive processes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 598–615.
Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, USA, 70, 2554–2558. Jacobsen, C. F. (1936). The functions of the frontal association areas in monkeys. Computational Psychology Monograph, 13, 1–60. Jensen, O. (2006). Maintenance of multiple working memory items by temporal segmentation. Neuroscience, 139, 237–249. Jha, A. P., & McCarthy, G. (2000). The influence of memory load upon delay-interval activity in a working-memory task: An event-related functional MRI study. Journal of Cognitive Neuroscience, 12, 90–105. Johnson, M. K., Raye, C. L., Mitchell, K. J., Greene, E. J., Cunningham, W. A., & Sanislow, C. A. (2005). Using fMRI to investigate a component process of reflection: Prefrontal correlates of refreshing a just-activated representation. Cognitive, Affective, and Behavioral Neuroscience, 5, 339–361. Jonides, J., Lacey, S. C., & Nee, D. E. (2005). Processes of working memory in mind and brain. Current Directions in Psychological Science, 14, 2–5. Jonides, J., Lewis, R. L., Nee, D. E., Lustig, C. A., Berman, M. G., & Moore, K. S. (2008). The mind and brain of short-term memory. Annual Review of Psychology, 59, 193–224. Jonides, J., & Nee, D. E. (2006). Brain mechanisms of proactive interference in working memory. Neuroscience, 139, 181–193. Jonides, J., Smith, E. E., Koeppe, R. A., Awh, E., Minoshima, S., & Mintun, M. A. (1993, June 17). Spatial working memory in humans as revealed by PET. Nature, 363, 623–625. Jonides, J., Smith, E. E., Marshuetz, C., Koeppe, R. A., & Reuter-Lorenz, P. A. (1998). Inhibition in verbal working memory revealed by brain activation. Proceedings of the National Academy of Sciences, USA, 95, 8410–8413. Keppel, G., & Underwood, B. J. (1962). Proactive-inhibition in short-term retention of single items. Journal of Verbal Learning and Verbal Behavior, 1, 153–161.
McKone, E. (1998). The decay of short-term implicit memory: Unpacking lag. Memory and Cognition, 26, 1173–1186. Meyer, D. E., & Kieras, D. E. (1997). A computational theory of executive cognitive processes and multiple-task performance: Pt. 1. Basic mechanisms. Psychological Review, 104, 3–65. Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81–97. Murdock, B. B. (1982). A theory for the storage and retrieval of item and associative information. Psychological Review, 89, 609–626. Nairne, J. S. (2002). Remembering over the short-term: The case against the standard model. Annual Review of Psychology, 53, 53–81. Nelson, J. K., Reuter-Lorenz, P. A., Sylvester, C. Y. C., Jonides, J., & Smith, E. E. (2003). Dissociable neural mechanisms underlying response-based and familiarity-based conflict in working memory. Proceedings of the National Academy of Sciences, USA, 100, 11171–11175. Oberauer, K. (2002). Access to information in working memory: Exploring the focus of attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 411–421. Oberauer, K. (2006). Is the focus of attention in working memory expanded through practice? Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 197–214. Olson, I. R., Page, K., Moore, K. S., Chatterjee, A., & Verfaellie, M. (2006). Working memory for conjunctions relies on the medial temporal lobe. Journal of Neuroscience, 26, 4596–4601. Pashler, H. (1988). Familiarity and visual change detection. Perception and Psychophysics, 44, 369–378. Pasternak, T., & Greenlee, M. W. (2005). Working memory in primate sensory systems. Nature Reviews Neuroscience, 6, 97–107. Peterson, L. R., & Peterson, M. J. (1959). Short-term retention of individual verbal items. Journal of Experimental Psychology, 58, 193–198.
Lisman, J. E., & Idiart, M. A. P. (1995). Storage of 7 ± 2 short-term memories in oscillatory subcycles. Science, 267, 1512–1515.
Plaut, D. C. (1997). Structure and function in the lexical system: Insights from distributed models of word reading and lexical decision. Language and Cognitive Processes, 12, 765–805.
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279–281.
Polk, T. A., Simen, P., & Lewis, R. L. (2002). A computational approach to control in complex cognition. Cognitive Brain Research, 15, 71–83.
Lustig, C., Matell, M. S., & Meck, W. H. (2005). Not “just” a coincidence: Frontal-striatal interactions in working memory and interval timing. Memory, 13, 441–448.
Polk, T. A., Stallcup, M., Aguirre, G. K., Alsop, D. C., D’Esposito, M., Detre, J. A., et al. (2002). Neural specialization for letter recognition. Journal of Cognitive Neuroscience, 14, 145–159.
Malmo, R. B. (1942). Interference factors in delayed response in monkeys after removal of frontal lobes. Journal of Neurophysiology, 5, 295–308.
Postle, B. R. (2005). Delay-period activity in the prefrontal cortex: One function is sensory gating. Journal of Cognitive Neuroscience, 17, 1679–1690.
Martin, R. C. (1993). Short-term memory and sentence processing: Evidence from neuropsychology. Memory and Cognition, 21, 176–183. McClelland, J. L., McNaughton, B. L., & O’Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102, 419–457. McElree, B. (1998). Attended and non-attended states in working memory: Accessing categorized structures. Journal of Memory and Language, 38, 225–252. McElree, B. (2001). Working memory and focal attention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 817–835.
Postle, B. R. (2006). Working memory as an emergent property of the mind and brain. Neuroscience, 139, 23–38. Postle, B. R., & D’Esposito, M. (1999). “What” then “where” in visual working memory: An event-related, fMRI study. Journal of Cognitive Neuroscience, 11, 585–597. Pylyshyn, Z. W. (1994). Some primitive mechanisms of spatial attention. Cognition, 50, 363–384. Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proceedings of the National Academy of Sciences, USA, 98, 676–682.
McElree, B. (2006). Accessing recent events. Psychology of Learning and Motivation, 46, 155–200.
Ranganath, C. (2006). Working memory for visual objects: Complementary roles of inferior temporal, medial temporal, and prefrontal cortex. Neuroscience, 139, 277–289.
McElree, B., & Dosher, B. A. (1989). Serial position and set size in short-term memory: Time course of recognition. Journal of Experimental Psychology: General, 118, 346–373.
Ranganath, C., & Blumenfeld, R. S. (2005). Doubts about double dissociations between short- and long-term memory. Trends in Cognitive Sciences, 9, 374–380.
McKone, E. (1995). Short-term implicit memory for words and nonwords. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1108–1126.
Ranganath, C., Johnson, M. K., & D’Esposito, M. (2003). Prefrontal activity associated with working memory and episodic long-term memory. Neuropsychologia, 41, 378–379.
Renart, A., Parga, N., & Rolls, E. T. (1999). Backward projections in the cerebral cortex: Implications for memory storage. Neural Computation, 11, 1349–1388. Repovš, G., & Baddeley, A. D. (2006). The multi-component model of working memory: Explorations in experimental cognitive psychology. Neuroscience, 139, 5–21. Reuter-Lorenz, P. A., & Jonides, J. (2007). The executive is central to working memory: Insights from age, performance and task variations. In A. R. Conway, C. Jarrold, M. J. Kane, A. Miyake, & J. N. Towse (Eds.), Variations in working memory (pp. 250–270). New York: Oxford University Press. Reuter-Lorenz, P. A., Jonides, J., Smith, E. E., Hartley, A., Miller, A., Marshuetz, C., et al. (2000). Age differences in the frontal lateralization of verbal and spatial working memory revealed by PET. Journal of Cognitive Neuroscience, 12, 174–187. Roediger, H. L., Knight, J. L., & Kantowitz, B. H. (1977). Inferring decay in short-term-memory: The issue of capacity. Memory and Cognition, 5, 167–176. Ruchkin, D. S., Grafman, J., Cameron, K., & Berndt, R. S. (2003). Working memory retention systems: A state of activated long-term memory. Behavioral and Brain Sciences, 26, 709–777. Sakai, K. (2003). Reactivation of memory: Role of medial temporal lobe and prefrontal cortex. Reviews in Neurosciences, 14, 241–251. Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery, and Psychiatry, 20, 11–21. Shallice, T., & Warrington, E. K. (1970). Independent functioning of verbal memory stores: A neuropsychological study. Quarterly Journal of Experimental Psychology, 22, 261–273. Smith, E. E., & Jonides, J. (1997). Working memory: A view from neuroimaging. Cognitive Psychology, 33, 5–42. Smith, E. E., & Jonides, J. (1999). Neuroscience: Storage and executive processes in the frontal lobes. Science, 283, 1657–1661. Smith, E. E., Jonides, J., Koeppe, R. A., Awh, E., Schumacher, E. 
H., & Minoshima, S. (1995). Spatial versus object working-memory: PET investigations. Journal of Cognitive Neuroscience, 7, 337–356. Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs, 74, Whole No. 498. Sternberg, S. (1966). High speed scanning in human memory. Science, 153, 652–654. Sussillo, D., Toyoizumi, T., & Maass, W. (2007). Self-tuning of neural circuits through short-term synaptic plasticity. Journal of Neurophysiology, 97, 4079–4095. Talmi, D., Grady, C. L., Goshen-Gottstein, Y., & Moscovitch, M. (2005). Neuroimaging the serial position curve. Psychological Science, 16, 716–723. Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences, USA, 94, 14792–14797. Thompson-Schill, S. L., Jonides, J., Marshuetz, C., Smith, E. E., D’Esposito, M., Kan, I. P., et al. (2002). Effects of frontal lobe damage on interference effects in working memory. Journal of Cognitive, Affective and Behavioral Neuroscience, 2, 109–120. Todd, J. J., & Marois, R. (2004). Capacity limit of visual short-term memory in human posterior parietal cortex. Nature, 428, 751–754. Todd, J. J., & Marois, R. (2005). Posterior parietal cortex activity predicts individual differences in visual short-term memory capacity. Cognitive, Affective, and Behavioral Neuroscience, 5, 144–155.
Trick, L. M., & Pylyshyn, Z. W. (1993). What enumeration studies can show us about spatial attention: Evidence for limited capacity preattentive processing. Journal of Experimental Psychology: Human Perception and Performance, 19, 331–351. Ungerleider, L. G., & Haxby, J. V. (1994). “What” and “where” in the human brain. Current Opinion in Neurobiology, 4, 157–165. Unsworth, N., & Engle, R. W. (2007). The nature of individual differences in working memory capacity: Active maintenance in primary memory and controlled search from secondary memory. Psychological Review, 114, 104–132. Vallar, G., & Baddeley, A. D. (1984). Fractionation of working memory: Neuropsychological evidence for a phonological short-term store. Journal of Verbal Learning and Verbal Behavior, 23, 151–161. Vallar, G., & Papagno, C. (2002). Neuropsychological impairments of verbal short-term memory. In A. D. Baddeley, M. D. Kopelman, & B. A. Wilson (Eds.), The handbook of memory disorders (2nd ed., pp. 249–270). Chichester, West Sussex, England: Wiley. Verhaeghen, P., & Basak, C. (in press). Aging and switching of the focus of attention in working memory: Results from a modified N-Back task. Quarterly Journal of Experimental Psychology. Verhaeghen, P., Cerella, J., & Basak, C. (2004). A working memory workout: How to expand the focus of serial attention from one to four items in 10 hours or less. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1322–1337. Vogel, E. K., & Machizawa, M. G. (2004, December 18). Neural activity predicts individual differences in visual working memory capacity. Nature, 426, 748–751. Vogel, E. K., Woodman, G. F., & Luck, S. J. (2006). The time course of consolidation in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 32, 1436–1451. Wager, T. D., & Smith, E. E. (2003). Neuroimaging studies of working memory: A meta-analysis. NeuroImage, 3, 255–274. Waugh, N. C., & Norman, D. A. (1965). Primary memory. 
Psychological Review, 72, 89–104. Wheeler, M. E., Petersen, S. E., & Buckner, R. L. (2000). Memory’s echo: Vivid remembering reactivates sensory-specific cortex. Proceedings of the National Academy of Sciences, USA, 97, 11125–11129. Wheeler, M. E., Shulman, G. L., Buckner, R. L., Miezin, F. M., Velanova, K., & Petersen, S. E. (2006). Evidence for separate perceptual reactivation and search processes during remembering. Cerebral Cortex, 16, 949–959. Wickens, D. D. (1970). Encoding categories of words: Empirical approach to meaning. Psychological Review, 77, 1–15. Wilken, P., & Ma, W. J. (2004). A detection theory account of change detection. Journal of Vision, 4, 1120–1135. Wilson, F. A. W., O’Scalaidhe, S. P., & Goldman-Rakic, P. S. (1993). Dissociation of object and spatial processing domains in primate prefrontal cortex. Science, 260, 1955–1958. Wixted, J. T. (2004). The psychology and neuroscience of forgetting. Annual Review of Psychology, 55, 235–269. Xu, Y., & Chun, M. M. (2006). Dissociable neural mechanisms supporting visual short-term memory for objects. Nature, 440, 91–95. Yantis, S., & Serences, J. T. (2003). Cortical mechanisms of space-based and object-based attentional control. Current Opinion in Neurobiology, 13, 187–193. Zucker, R. S., & Regehr, W. G. (2002). Short-term synaptic plasticity. Annual Review of Physiology, 64, 355–405.
Chapter 30
Forgetting and Retrieval
BRICE A. KUHL AND ANTHONY D. WAGNER
Retrieval of episodic memories—conscious memories of past events—often provides critical information that can shape current thought and behavior. Although successful remembering is generally thought of as far more desirable than forgetting, it is likely that forgetting is also an important component of an adaptive memory system (M. C. Anderson, 2003; Bjork, 1989; Schacter, 1999). To fully understand the functioning of episodic memory, it is important to consider both the situations and mechanisms that lead to successful remembering and those that contribute to forgetting. Indeed, the phenomena of remembering and forgetting are intimately related—that is, we often forget precisely because we have remembered some other information. What we ultimately remember and forget is influenced both by our prior mnemonic experiences and by the functioning of neurobiological mechanisms that guide mnemonic retrieval. In particular, the frontal lobes—which are known to play an important role in goal-directed attention and behavior—are central to the ability to direct retrieval toward those memories that are relevant and away from those that are irrelevant.
We consider two broad classes of forgetting and their corresponding relations to frontal lobe function. First, we review evidence that our ability to remember is often complicated by interference from competing memories, and that these situations (a) increase the likelihood of forgetting and (b) increase demands on the prefrontal cortex (PFC). Second, we consider situations in which our mnemonic activities require selecting against, or avoiding, particular memories, describing evidence that such acts of selection (a) increase the likelihood of later forgetting selected-against memories and (b) are supported by the PFC. We conclude by situating the relationship between the PFC and forgetting within the broader context of the PFC and the control of cognition and behavior.
GROSS ANATOMY AND CONNECTIVITY OF THE PREFRONTAL CORTEX
In this chapter, we primarily focus on the role of the PFC in regulating episodic retrieval and forgetting. Thus, before considering specific classes of forgetting and their relation to the PFC, it is worth briefly describing the gross anatomy of the frontal lobes—namely, subregions within the PFC that putatively support distinct functional mechanisms. The prefrontal cortex is generally divided into ventrolateral, dorsolateral, frontopolar, and medial subregions (Figure 30.1). In the human, the ventrolateral PFC (VLPFC; Figure 30.1A) corresponds to the inferior frontal gyrus, which includes, from the caudal to rostral extent, inferior frontal pars opercularis (Brodmann Area [BA] 44), inferior frontal pars triangularis (BA 45), and inferior frontal pars orbitalis (an area Petrides & Pandya, 2002, term area 47/12). Although Petrides and Pandya (2002) refer to area 47/12 and BA 45 collectively as the mid-VLPFC, distinguishing these regions from caudally situated BA 44, in this review, we highlight functional dissociations between area 47/12 and area 45. Thus, we refer to inferior frontal pars orbitalis (area 47/12) as the anterior VLPFC, pars triangularis (BA 45) as the mid-VLPFC, and pars opercularis (BA 44) as the posterior VLPFC. The VLPFC is separated from the dorsolateral PFC (DLPFC) by the inferior frontal sulcus in humans (in monkeys, the principal sulcus marks this boundary). Although DLPFC has been used to refer to a broad range of lateral PFC regions, we use DLPFC to refer to the middle frontal gyrus. As we discuss, episodic retrieval and forgetting have been linked with activity in BA’s 46 and 9/46 (Figure 30.1A)—subregions of DLPFC that Petrides and Pandya (1999) refer to as mid-DLPFC. Rostral to DLPFC and VLPFC is the frontopolar cortex (FPC; BA 10; Figure 30.1A-B). 
A final area of interest for the present chapter is the anterior cingulate cortex (ACC; BA’s 24 and 32; Figure 30.1B), which is situated along the medial wall of the PFC, immediately superior to the corpus callosum.
Importantly, PFC subregions are both interconnected and connected with posterior cortical sites, suggesting that PFC subregions are well equipped to coordinate diverse cognitive operations. For example, the VLPFC is strongly connected with cortical areas in the lateral and medial temporal lobe, including (but not limited to) the inferotemporal cortex, superior temporal cortex, and, more medially, the perirhinal and parahippocampal cortex (Petrides & Pandya, 2002). The DLPFC has substantial reciprocal connections with posterior parietal cortex, superior temporal cortex, retrosplenial cortex, anterior and posterior cingulate cortex, as well as connectivity with VLPFC (Petrides & Pandya, 1999). Notably, through its connections with the retrosplenial cortex, DLPFC may interact with hippocampal and parahippocampal structures (Petrides, 2005). The FPC is connected both with the DLPFC and VLPFC as well as with the superior temporal cortex, cingulate cortex, and retrosplenial cortex, suggesting that the FPC may be particularly well suited to incorporate diverse sources of information (Petrides, 2005; Petrides & Pandya, 2007). The ACC is also widely connected with cortical sites, including multiple lateral PFC sites (DLPFC, in particular), the posterior parietal cortex, and the medial temporal lobe cortex (Pandya, Van Hoesen, & Mesulam, 1981). Thus, whereas DLPFC and VLPFC have fairly distinct patterns of connectivity with the posterior cortical sites, the FPC and ACC may each interact with both DLPFC and VLPFC structures in coordinating goal-directed behavior.
Figure 30.1 A: Lateral view of the PFC and corresponding cytoarchitectonic areas. B: Medial view of the PFC. Note. (A) DLPFC = Areas 46 and 9/46. VLPFC = Areas 47/12, 45, and 44. FPC = Area 10. (B) ACC = Areas 32 and 24. From “Dorsolateral Prefrontal Cortex: Comparative Cytoarchitectonic Analysis in the Human and the Macaque Brain and Corticocortical Connection Patterns,” by M. Petrides and D. N. Pandya, 1999, European Journal of Neuroscience, 11, pp. 1011–1036. Copyright 1999 by Blackwell Publishing. Reprinted with permission.
Supported by the National Institute of Mental Health (5R01MH080309 and 5R01-MH076932) and the Alfred P. Sloan Foundation. The authors thank Ben Levy for insightful discussion.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
INTERFERENCE AND MEMORY RETRIEVAL
Perhaps the most widely accepted and well-documented cause of forgetting is interference. Interference occurs whenever irrelevant memories compete with relevant memories (Mensink & Raaijmakers, 1988; for reviews, see M. C. Anderson & Spellman, 1995; Wixted, 2004). The extent to which interference contributes to forgetting is related both to the number of irrelevant memories that compete as well as the strength of these irrelevant memories. Most typically, interference is thought to occur during the act of retrieval, creating situations of retrieval competition. Retrieval competition has been particularly well studied in three classic behavioral paradigms. First, memories of past experiences often interfere with our ability to retrieve memories of more recent experiences—a situation termed proactive interference. Conversely, the ability to retrieve memories of past experiences is often subject to interference from more recent memories—retroactive interference. Finally, even when the order of learning is not relevant, the general principle that associates of a retrieval cue compete with each other during retrieval has been studied in the fan effect (J. R. Anderson, 1974). In the following sections, we briefly review first the classic behavioral evidence concerning these three situations of interference and then potential neurobiological mechanisms that serve to overcome interference. Classic Interference Phenomena Proactive interference (PI) and retroactive interference (RI) have been the subject of extensive behavioral research (for reviews, see M. C. Anderson & Spellman, 1995; Wixted, 2004), and have been best illustrated in classic A-B, A-C paradigms (Figure 30.2). In a standard A-B, A-C
8/18/09 6:27:43 PM
588
Forgetting and Retrieval
Time
Retroactive Interference A–B, A–C
A–B, Filler
Target list
Learn A–B
Learn A–B
Manipulation
Learn A–C
Filler
Worse
Better
Test for B: A⫺ ?
Time
Proactive Interference A–B, A–C
Filler, A–C
Manipulation
Learn A–B
Filler
Target list
Learn A–C
Learn A–C
Test for C: A⫺ ?
Worse
Better
Figure 30.2 Schematic of retroactive interference and proactive interference paradigms. Note. In both paradigms, the association between a cue (A term) and multiple associates (B & C terms) increases interference and, therefore, forgetting. In RI, the critical manipulation is whether an interfering second list is studied, whereas in PI the critical manipulation is whether an interfering prior list is studied.
paradigm, an initial list of A-B cue-associate pairs (e.g., SHOE-HOUSE) is studied, followed by a second list of A-C pairs in which some of the previously studied cues are paired with new associates (e.g., SHOE-ROPE). When memory for pairs from the second list is tested, the influence of proactive interference is reflected in poorer recall for those pairs that overlap with pairs from the first list, relative to pairs in the second list that are completely unrelated to pairs in the first list. Thus, learning of new information is impaired when previously learned information interferes. Retroactive interference, on the other hand, is evidenced by poorer recall of A-B pairs as a result of subsequently learning overlapping A-C pairs (e.g., memory for SHOE-HOUSE is impaired by learning SHOE-ROPE). Thus, in an A-B, A-C paradigm, either retroactive or proactive interference may occur, depending on which pairs (A-B or A-C) are tested. Several features of PI and RI are of note. First, the magnitude of interference-related forgetting observed depends on the extent to which retrieval cues reference items from both study lists—an observation made in the earliest RI studies (Müller & Pilzecker, 1900). For example, RI is greater when A-B study is followed by A-C study than C-D study (i.e., a new cue and new associate). Second, PI and RI are maximally observed at different points in time. Specifically, PI is maximal when the lag between A-C study and A-C recall is long, and may be negligible when the delay is very short. RI, on the other hand, is maximal when the delay between A-C study and A-B recall is short,
with the magnitude of RI decreasing as the delay increases (Postman, Stark, & Fraser, 1968). Importantly, both of these properties of PI/RI can be well explained in terms of retrieval competition that occurs during cued recall (McGeoch, 1942; Mensink & Raaijmakers, 1988). That is, an A-B, A-C paradigm elicits greater RI than an A-B, C-D paradigm because the A-B, A-C paradigm creates a situation in which a single retrieval cue (A) is linked to two associates—thereby enhancing retrieval competition and, therefore, the likelihood of forgetting. Similarly, changes in the relative magnitude of RI and PI at different delays can be explained in terms of changes in the relative salience of B and C terms and, therefore, changes in retrieval competition. For example, some models suggest differential decay rates for B and C terms following A-C study, meaning that the relative strengths of the associate terms change with time (J. R. Anderson, 1983b). Other models suggest that changes in the availability of contextual cues contribute to relative changes in the accessibility of B and C terms (Estes, 1955; Mensink & Raaijmakers, 1988). In either case, at very short delays following A-C study, A-C pairs are thought to be highly salient relative to A-B pairs, meaning that while RI will be high (if B terms are tested), PI will be low (if C terms are tested). At longer delays, the relative salience of A-C pairs decreases, thereby reducing RI, but increasing the potential for PI. Retrieval competition has also been studied in the context of the fan effect (J. R. Anderson, 1974). The classic finding in fan effect paradigms is that as the amount of related information stored in long-term memory grows, the time it takes to verify that one recognizes a particular piece of that information increases (similarly, increases in fan size may also decrease accuracy). 
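The retrieval-competition account above can be made concrete with a toy relative-strength (ratio) rule. The sketch below is only an illustration under stated assumptions: the ratio rule, the function names, and the strength and decay values are hypothetical choices, not parameters of the models cited in this section. It assumes that the more recently studied C term starts out stronger than the B term but loses strength faster.

```python
def recall_probability(target_strength, competitor_strengths):
    """Toy ratio rule: the target's share of the total strength linked to the cue."""
    total = target_strength + sum(competitor_strengths)
    return target_strength / total


def decayed_strength(initial, decay_rate, delay):
    """Hypothetical geometric decay of associative strength over a delay."""
    return initial * (1.0 - decay_rate) ** delay


def interference_at_delay(delay):
    """Return (P(recall B), P(recall C)) for cue A at a delay after A-C study."""
    # Assumed values: C (just studied) is stronger than B but decays faster.
    b = decayed_strength(1.0, 0.01, delay)
    c = decayed_strength(2.0, 0.10, delay)
    return recall_probability(b, [c]), recall_probability(c, [b])


if __name__ == "__main__":
    for delay in (0, 5, 20):
        p_b, p_c = interference_at_delay(delay)
        # Low P(B) corresponds to strong RI; low P(C) corresponds to strong PI.
        print(f"delay={delay:2d}  P(B)={p_b:.2f}  P(C)={p_c:.2f}")

    # Fan analogue: with equally strong associates, each share shrinks as the fan grows.
    for fan in (1, 2, 3):
        share = recall_probability(1.0, [1.0] * (fan - 1))
        print(f"fan={fan}  share={share:.2f}")
```

Under these assumed values, P(B) rises and P(C) falls as the delay grows, which mirrors the shift from RI at short delays to PI at longer delays. The fan loop shows the same rule producing a smaller share for each associate as more associates are linked to one cue.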
In a typical fan effect task, subjects study a series of propositions (e.g., “A doctor is in the bank,” “A fireman is in the park,” “A lawyer is in the park,” etc.). Importantly, individual elements may appear in multiple propositions (e.g., “park” is associated with both “lawyer” and “fireman”). When elements are associated with multiple propositions (a “fan”), the time it takes to recognize a proposition containing those elements (i.e., “high fan” propositions) increases (J. R. Anderson, 1983a). The fan effect has proven to be a highly consistent finding and has inspired influential models of human memory (J. R. Anderson, 1976). According to the standard account of the fan effect, during recognition memory a finite amount of activation is shared among all the elements in a fan. When the fan size is high, relevant elements receive correspondingly less activation, reflecting increased competition, and retrieval time is therefore slowed (J. R. Anderson, 1976, 1983a).

Thus far, we have considered a single mechanism, retrieval competition, to explain forgetting in RI and PI paradigms and to account for recognition memory slowing
in the fan paradigm. However, alternative accounts have been advanced for RI and PI (for reviews, see M. C. Anderson, 2003; Wixted, 2004). In an influential two-factor theory, Melton and Irwin (1940) argued that some factor in addition to response competition contributes to RI, noting that substantial RI is often observed even when there is little behavioral evidence that A-C pairs actually compete with retrieval of A-B pairs. They speculated that a second factor contributing to RI (in addition to response competition) is the unlearning, or direct weakening, of original (A-B) associations. While our focus on retrieval competition as the primary mechanism of interference-related forgetting reflects more recent arguments that classic interference phenomena can be fully accounted for by retrieval competition alone (Mensink & Raaijmakers, 1988), we later consider a mechanism of forgetting, inhibition, that bears many similarities to Melton and Irwin’s (1940) unlearning mechanism. Specifically, inhibition shares with unlearning the idea that irrelevant memories may be directly weakened.

Neurobiological Mechanisms of Interference Resolution

A hallmark of frontal lobe damage is increased distractibility or perseveration upon irrelevant information. Consistent with this general observation, frontal lobe patients suffer an exaggerated susceptibility to PI (e.g., Shimamura, Jurica, Mangels, Gershberg, & Knight, 1995; Smith, Leonard, Crane, & Milner, 1995). Specifically, whereas frontal lobe patients typically learn list 1 items (e.g., A-B pairs) as well as controls, after studying a second list, their recall for list 2 items (e.g., A-C pairs) is impaired, relative to controls. The selective impairment for A-C pairs indicates that frontal lobe patients are relatively unimpaired at encoding information when interference is not present, but are particularly impaired when prior learning interferes with memory for subsequently encountered information.
Indeed, during A-C cued recall, frontal lobe patients often show a greater tendency to generate B terms (intrusions), highlighting the sensitivity of frontal lobe patients to competition from prior learning (Shimamura et al., 1995). While exaggerated susceptibility to PI has frequently been associated with frontal lobe damage, frontal lobe patients vary widely in the location and extent of their damage. As a result, initial studies of frontal lobe patients yielded considerable variability in the subregions of PFC implicated in resolving PI. For example, whereas some reports suggested a greater sensitivity to PI in patients with left frontal damage (Moscovitch, 1982), others revealed greater PI in patients with right frontal damage (Turner, Cipolotti, Yousry, & Shallice, 2007). Similarly, there were reports in which left and right frontal patients show comparable
increases in sensitivity to PI relative to controls (Smith et al., 1995), but also reports of frontal patients displaying relatively normal sensitivity to PI despite impairments on other “frontal tests” (Janowsky, Shimamura, Kritchevsky, & Squire, 1989). Thus, while initial studies of frontal patients highlighted that interference resolution likely depends on the integrity of the frontal lobes, this work yielded ambiguity regarding the specific PFC subregions that are critical for overcoming mnemonic competition.

Progress on this important issue has greatly accelerated over the past decade, largely because the higher resolution of functional neuroimaging methods, namely positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), has enabled researchers to begin to examine whether interference resolution is differentially associated with functional responses in specific PFC subregions. As we next review, considerable neuroimaging evidence, accumulated over the past decade, now indicates that at least some forms of interference resolution are associated with activation in the left ventrolateral PFC (VLPFC; Figure 30.1A). Moreover, recent neuropsychological investigations of patients with damage that specifically includes the left VLPFC also highlight the necessity of this region for overcoming mnemonic competition.

In a classic PET study of proactive interference (Dolan & Fletcher, 1997), subjects learned an initial set of word pairs (e.g., DOG-BOXER) followed by a second list that contained completely new word pairs (e.g., CLOTH-VELVET), previously studied word pairs (e.g., DOG-BOXER), or word pairs that contained previously studied words paired with new associates (e.g., DOG-DALMATIAN or SPORTSMAN-BOXER).
When list 2 contained completely new associates, relative to conditions that contained at least one old word, enhanced activation was observed in the hippocampus and medial temporal lobe cortex, suggesting that the medial temporal lobes preferentially respond to the novelty of to-be-learned information (Figure 30.3). In contrast, when list 2 contained previously studied words paired with new associates (a situation of interference equivalent to the A-C condition described previously), activation was observed in a region of the left lateral PFC that encompassed the mid/posterior VLPFC and DLPFC. Importantly, this left lateral PFC response was driven not by the novelty of individual words, but by the extent to which list 2 learning was complicated by interference from memory for list 1 pairs.

[Figure 30.3 panels: (A) Left Lateral PFC; (B) Medial Temporal Lobes. Conditions on the horizontal axis of each panel: New-New, New-Old, Old-New, Old-Old.]
Figure 30.3 Activation in the left lateral PFC (A) and the medial temporal lobes (B) as a function of encoding condition. Note. “New-New” corresponds to encoding of a novel word pair (equivalent to an A-B pair in a PI design); “New-Old” and “Old-New” correspond to a word pair in which one member of the pair is novel and the other was previously studied with a different word (equivalent to an A-C pair); “Old-Old” corresponds to a word pair that is repeated, intact (equivalent to repeated exposure to an A-B term). The left lateral PFC is maximally engaged when the word pair being encoded partially overlaps with a previous pair (i.e., when interference is present). In contrast, medial temporal lobe activation is maximal when the word pair being encoded is completely novel. From “Dissociating Prefrontal and Hippocampal Function in Episodic Memory Encoding,” by R. J. Dolan and P. C. Fletcher, 1997, Nature, 388, pp. 582–585. Copyright 1997 by Macmillan Publishers. Adapted with permission.

Subsequent neuroimaging work provided additional evidence that the left VLPFC, in particular, plays a critical role in resolving PI. For example, in an fMRI study of PI during episodic encoding (Henson, Shallice, Josephs, & Dolan, 2002), activation in the left VLPFC decreased as a word pair (A-B) was repeatedly studied, but increased when one of the pair members changed (A-C). Similarly, in another study (Fletcher, Shallice, & Dolan, 2000), left VLPFC activation increased when previously studied word pairs were rearranged, relative to their initial study configuration. While the relationship between PI and the left VLPFC has primarily been evidenced during encoding of A-C pairs, Henson et al. (2002) also observed greater activation in the left VLPFC, along with the anterior cingulate cortex (ACC), when retrieval occurred in the face of PI (neuroimaging data from the fan effect paradigm further implicate the left VLPFC in competitive retrieval, as described next).

Studies of retrieval from semantic memory and working memory also implicate the left VLPFC in guiding interference-laden mnemonic processing. Specifically, neuroimaging studies of semantic retrieval have consistently found greater engagement of the left VLPFC, particularly the left mid-VLPFC, when retrieval involves selecting between competing alternatives (e.g., Badre, Poldrack, Pare-Blagoev, Insler, & Wagner, 2005; Thompson-Schill, D’Esposito, Aguirre, & Farah, 1997; for review, see Badre & Wagner, 2007). Moreover, when PFC damage includes the left mid-VLPFC, the ability to retrieve relevant semantic representations from among competitors is impaired (Martin & Cheng, 2006; Metzler, 2001; Thompson-Schill et al., 1998), establishing the necessity of this region for interference resolution during semantic retrieval. Similarly, within working memory, imaging studies have consistently implicated the left mid-VLPFC in overcoming PI that accumulates across trials (for review, see Jonides & Nee, 2006). Moreover, PFC lesions that include damage
to the left mid-VLPFC (Thompson-Schill et al., 2002) and focal transient disruption of the left mid-VLPFC with transcranial magnetic stimulation (Feredoes, Tononi, & Postle, 2006) impair working memory performance in the face of PI. Collectively, these convergent findings across episodic, semantic, and working memory contexts indicate that the left mid-VLPFC contributes to interference resolution. While the left mid-VLPFC appears to play a critical role in resolving interference, there remains the question of how, in mechanistic terms. A prominent hypothesis, derived primarily from neuroimaging and patient data, is that the left mid-VLPFC supports the selection of task-relevant representations when competition is present (Thompson-Schill et al., 1997, 1998). That is, when multiple semantic—or episodic—representations become simultaneously active, a left mid-VLPFC bias mechanism is posited to favor relevant representations over irrelevant representations (Badre & Wagner, 2007). Accordingly, when viewed through this light, many instances of forgetting may reflect failures of mnemonic selection, as opposed to retrieval, per se. Notably, within semantic and working memory paradigms, selection has typically been studied—and left mid-VLPFC activation has typically been observed—during retrieval (of either semantic information or working memory contents). Within episodic memory, however, left mid-VLPFC activation has most frequently been observed in PI paradigms during A-C encoding. Thus, it has been argued that A-C encoding engages the same selection mechanism that is observed during semantic and working memory retrieval (Henson et al., 2002). However, it should be noted that left mid-VLPFC activation during encoding
might also be recast in terms of retrieval-related activation. That is, A-C associations may become differentiable from A-B associations through an elaborative encoding process in which semantic properties unique to A-C associations are selectively favored during A-C study, a process that would amount to competitive semantic retrieval. In either case, competition from irrelevant associations drives left mid-VLPFC activation during A-C encoding.

During episodic retrieval, left mid-VLPFC engagement has also been observed when competition is present. In particular, the link between left VLPFC engagement and retrieval competition has been well established in studies of the fan effect. In fan paradigms, the increase in reaction time that is associated with “high fan” recognition is thought to directly correspond to prolonged engagement of mechanisms that guide retrieval in the face of competition (Sohn, Goode, Stenger, Carter, & Anderson, 2003). Consistent with this perspective, a pair of fMRI studies revealed that high fan, relative to low fan, recognition is associated with increased engagement of a region of the left lateral PFC, inclusive of the left mid-VLPFC (Sohn et al., 2003, 2005). This neural correlate of the fan effect provides a compelling link between recent neuroimaging work and classic interference theory, indicating that direct manipulations of retrieval competition increase the engagement of the left mid-VLPFC. Moreover, left mid-VLPFC engagement has been observed in other situations of competitive episodic retrieval, such as when the retrieval task requires recollection of specific (criterial) details of an encoding event (Dobbins, Foley, Schacter, & Wagner, 2002; Dobbins & Wagner, 2005; Kostopoulos & Petrides, 2003; Lundstrom, Ingvar, & Petersson, 2005).
Summary

Initial observations of increased sensitivity to interference following frontal lobe damage have now been complemented by substantial evidence that the left mid-VLPFC, in particular, plays a fundamental role in resolving interference. From a mechanistic perspective, the left mid-VLPFC is thought to resolve interference by selecting goal-relevant representations in the face of competition from irrelevant representations. This putative selection mechanism, and the many situations in which left mid-VLPFC-mediated selection has been observed, accords well with the perspective from classic interference theory that forgetting is well accounted for in terms of retrieval competition. In other words, retrieval competition powerfully influences the likelihood of forgetting, and it is in precisely these situations of enhanced retrieval competition that left mid-VLPFC selection resolves interference.
INHIBITION AS A CAUSE OF FORGETTING

In the previous section, we highlighted the potential for retrieval competition from irrelevant memories to obscure access to currently relevant memories and thereby produce retrieval failures, or forgetting. However, overcoming competition from irrelevant memories can also have consequences for what is remembered in the future. That is, when competing memories are selected against, there is a decreased likelihood that these memories will later be remembered (if they later become relevant). This form of forgetting is related to retrieval competition (it is a reaction to, and consequence of, competition from irrelevant memories), but the mechanism of forgetting is thought to reflect the direct weakening, or inhibition, of competing memories. We next consider two situations in which the relationship between forgetting and memory inhibition has been studied: (1) when the act of remembering a target memory requires selecting against closely related, but irrelevant, memories, and (2) when there is an explicit intention to forget or to keep out of mind individual memories or sets of memories. In each case, we consider the behavioral evidence supporting the occurrence of inhibition as well as the neurobiological mechanisms through which inhibition may occur.

Retrieval-Induced Forgetting

Competition that is present during the act of retrieval can compromise successful retrieval of target memories. Although we previously emphasized the demand to resolve competition such that successful retrieval, or selection, may occur, it has also been argued that retrieval competition is resolved through the inhibition of those memories that compete with the target memories (M. C. Anderson, Bjork, & Bjork, 1994; M. C. Anderson & Spellman, 1995; for reviews, see M. C. Anderson, 2003; Levy & Anderson, 2002).
Functionally, the inhibition of irrelevant, competing memories is thought to be adaptive in that it reduces competition during the retrieval of target memories (M. C. Anderson, 2003; Bjork, 1989). However, to the extent that previously irrelevant memories later become relevant, the inhibition that they suffered increases the likelihood that they will be forgotten (for review, see Levy & Anderson, 2002). The finding that the retrieval of target memories can produce forgetting of related memories has been termed retrieval-induced forgetting and has been demonstrated in a variety of situations (for review, see Levy & Anderson, 2002). In a standard retrieval-induced forgetting paradigm, participants study a series of cue-associate pairs with multiple associates studied with each cue (e.g., “FRUIT-banana,”
“FRUIT-apple,” “DRINK-whiskey,” “DRINK-scotch”). After study, participants engage in selective retrieval practice of some of the associates of some of the cues. For example, participants might receive “FRUIT-a_” as a probe to remember “apple.” Typically, half of the associates of half of the cues are practiced, three times each, in this manner. Finally, all associates, both practiced and unpracticed, are tested in a final cued recall phase in which cues are presented along with the first letters of individual associates. Not surprisingly, practiced associates (e.g., “apple”; referred to as RP+ items) are better remembered during the final test than unpracticed associates (Figure 30.4). However, some of the unpracticed associates are related to practiced associates (e.g., “banana”; RP− items), whereas other unpracticed associates are related to a cue for which none of the associates were practiced (e.g., “scotch” is related to “DRINK,” but none of the associates of “DRINK” receive practice; NRP items). Of critical interest, RP− items (the associates that are related to practiced items) are more poorly remembered than NRP (baseline) items (Figure 30.4). In other words, practice retrieving “apple” can make it more difficult to remember “banana,” which is evidence for retrieval-induced forgetting. This forgetting is thought to occur precisely because “banana” is related to “apple”; that is, during retrieval of “apple,” “banana” competes and is subject to inhibition as a means of reducing this competition. This inhibition is manifested, behaviorally, in an increased rate of forgetting.

That RP− items are more likely to be forgotten than NRP items does not, on its own, indicate that RP− items are necessarily inhibited. Instead, given that RP− items are tested using the same cues (e.g., “FRUIT-”) as RP+ items,
it is possible that the strengthening of RP+ items creates enhanced retrieval competition during RP− recall, thereby blocking or occluding access to RP− items (Mensink & Raaijmakers, 1988). Evidence of memory inhibition comes from the critical observation that retrieval-induced forgetting also occurs when RP− items are tested using novel cues (e.g., “MONKEY-b” for “banana”; Aslan, Bäuml, & Pastötter, 2007; Johnson & Anderson, 2004; Levy, McVeigh, Marful, & Anderson, 2007; MacLeod & Saunders, 2005; Saunders & MacLeod, 2006) or even when RP− items are tested in simple item recognition tests (Hicks & Starns, 2004). Importantly, both of these tests avoid the problem of retrieval competition between RP− and RP+ items, as the cue that they share in common is eliminated during the test procedure. This property of retrieval-induced forgetting is referred to as cue-independence.

Further evidence for memory inhibition comes from the finding that retrieval-induced forgetting is most likely to occur when mnemonic competition is high. Specifically, if competing memories are weak they are less likely to be forgotten (inhibited), whereas competing memories that are strong are more likely to be forgotten (M. C. Anderson et al., 1994; Bäuml, 1998). Similarly, if retrieval practice of RP+ items is replaced with noncompetitive extra study exposures, forgetting of “competitors” (i.e., RP− items) does not occur (M. C. Anderson, Bjork, & Bjork, 2000). Together, these data provide strong support for the argument that retrieval-induced forgetting is a response to retrieval competition, a property we refer to as competition-dependence. Thus, the observation that retrieval-induced forgetting is cue-independent provides important evidence that competing memories are actually inhibited, while the observation that retrieval-induced forgetting is competition-dependent provides a constraint on when inhibition should occur.
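The item conditions in the retrieval-practice paradigm can be sketched in a few lines of code. This is a minimal illustration, not the procedure of any cited study: the function names and the tiny study list are hypothetical, and the recall percentages (73, 38, and 50) are used only as illustrative values of the kind shown in Figure 30.4.

```python
def label_items(study_list, practiced):
    """Assign each studied associate to RP+, RP-, or NRP.

    study_list: dict mapping a cue to its list of associates.
    practiced:  set of (cue, associate) pairs given retrieval practice.
    """
    practiced_cues = {cue for cue, _ in practiced}
    labels = {}
    for cue, associates in study_list.items():
        for item in associates:
            if (cue, item) in practiced:
                labels[item] = "RP+"   # practiced item
            elif cue in practiced_cues:
                labels[item] = "RP-"   # unpracticed competitor of a practiced item
            else:
                labels[item] = "NRP"   # baseline: no associate of this cue was practiced
    return labels


def rif_effects(recall):
    """Compute practice facilitation and retrieval-induced forgetting from recall rates."""
    return {
        "facilitation": recall["RP+"] - recall["NRP"],
        "retrieval_induced_forgetting": recall["NRP"] - recall["RP-"],
    }


# Hypothetical miniature study list and practice set.
study_list = {"FRUIT": ["apple", "banana"], "DRINK": ["whiskey", "scotch"]}
practiced = {("FRUIT", "apple")}

if __name__ == "__main__":
    print(label_items(study_list, practiced))
    # Illustrative percentage recall per condition.
    print(rif_effects({"RP+": 73, "RP-": 38, "NRP": 50}))
```

The forgetting effect is defined against the NRP baseline rather than against RP+ items, so that the benefit of practice and the cost to competitors are measured separately.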
[Figure 30.4 schematic: cue FRUIT with Apple (Practiced, RP+; 73) and Banana (Competitor, RP−; 38); cue DRINK with Whiskey and Scotch (Unpracticed, NRP; 50 and 50).]
Figure 30.4 Schematic of retrieval-induced forgetting. Note. Practiced items (RP+) are typically better remembered than baseline (NRP) or competing (RP−) items (numbers reflect percentage recall). Critically, RP− items are typically more poorly recalled than NRP items. The recall impairment for RP− items, relative to NRP items, reflects the magnitude of retrieval-induced forgetting. From “Rethinking Interference Theory: Executive Control and the Mechanisms of Forgetting,” by M. C. Anderson, 2003, Journal of Memory and Language, 49, pp. 415–445. Copyright 2003 by Elsevier Press. Adapted with permission.

Neurobiology of Retrieval-Induced Forgetting

Previously, we discussed the role of PFC in guiding retrieval in the face of competition. With respect to retrieval-induced forgetting, there is an additional phenomenon to explain: the weakening or inhibition of competing memories. On the one hand, inhibition may be a by-product of PFC control mechanisms that guide attention toward task-relevant representations—a form of biased competition (Miller & Cohen, 2001). On the other hand, inhibition may be a distinct form of control, implemented by an independent PFC control mechanism that directly weakens competing representations (M. C. Anderson et al., 2004; Levy & Anderson, 2002). Although current evidence does not clearly favor one of these possibilities over the other, both perspectives emphasize that PFC influences what is retrieved and what
is inhibited; we therefore review general evidence that the PFC is engaged during retrieval in situations that ultimately result in inhibition. The key behavioral properties of retrieval-induced forgetting are well captured in a detailed neural network model developed by Norman, Newman, and Detre (2007). Central to the model is an algorithm in which oscillations in memory activation levels allow for the identification of target memories that are weak and competitors that are strong. Although the details of the model are beyond the scope of this chapter, it is of note that the model involves feedback mechanisms through which weak targets can be strengthened and strong competitors can be weakened. Of particular interest, the model does not contain a layer representing the contribution of the PFC. Rather, inhibition is explained in terms of local learning through feedback within memory-dedicated systems (i.e., within the medial temporal lobes). As long as competition exists, feedback mechanisms will punish competing memories. The model accounts for cue-independent forgetting in that individual items that compete for retrieval are directly weakened, and it accounts for competition-dependence in that competing items are only weakened if they become active (i.e., if they compete) during target retrieval. While the Norman et al. (2007) model does not contain a layer representing PFC—and therefore does not explain inhibition, itself, in terms of PFC cognitive control operations—the authors argue that the PFC nonetheless plays an important role in retrieval-induced forgetting. By their view, the critical role of the PFC is that it supports the selection of relevant memories—a function that is particularly needed when competition is high. More specifically, they suggest that when a retrieval cue leads to the co-activation of both relevant and irrelevant memories, competition occurs, which is detected by ACC. 
The ACC then triggers the engagement of other PFC mechanisms that selectively increase the activation of goal-relevant memories, or increase attention to goal-relevant features, thereby resolving competition. By detecting competition and guiding retrieval toward target representations, the PFC can select target memories to be strengthened and, as a consequence, the PFC influences which memories are weakened. Thus, the Norman et al. (2007) model explains inhibition as a by-product of PFC-mediated biased competition (Miller & Cohen, 2001).

The relationship between retrieval competition, inhibition, and the PFC was recently addressed in an fMRI study of retrieval-induced forgetting (Kuhl, Dudukovic, Kahn, & Wagner, 2007) that focused on the neural responses within the PFC during selective retrieval practice (i.e., across the three retrieval practice attempts of each RP+ item). Of critical interest was whether PFC engagement across retrieval practice is related to the inhibition of competing (RP−) memories, as revealed by behavioral performance on the
final test of all items. As Norman et al. (2007) suggest, the PFC should be differentially necessary when competition is high. Thus, the PFC should be maximally engaged during initial retrieval practice attempts (i.e., before targets are strengthened and competitors are weakened), with PFC engagement decreasing as targets are repeatedly practiced and competitors are suppressed. Consistent with this prediction, Kuhl et al. (2007) observed robust decreases in PFC engagement during repeated (third) relative to initial (first) retrieval practice trials. To directly test for a relationship between these decreases in PFC engagement and the phenomenon of competitor weakening (inhibition), the relative magnitude of competitor weakening was computed for each participant [(NRP accuracy − RP− accuracy) / NRP accuracy] and then regressed upon the magnitude of PFC disengagement that each participant displayed across retrieval practice trials. If the weakening of competing memories reduces the demands on PFC, then the decrease in PFC engagement across retrieval practice trials should be positively correlated with the amount of competitor weakening. Indeed, such a relationship was observed in two PFC foci: the ACC and right anterior VLPFC (Figure 30.5). The finding that ACC disengagement was related to the weakening of competing items is consistent with the hypothesis of Norman et al. (2007) that the ACC should serve to detect competition between target and competing memories, and is also consistent with a much broader literature implicating the ACC in detecting conflict between competing representations (Botvinick, Braver, Barch, Carter, & Cohen, 2001; Braver, Barch, Gray, Molfese, & Snyder, 2001; MacDonald, Cohen, Stenger, & Carter, 2000; van Veen & Carter, 2002).
In other words, the relative strength of target versus competing memories should increase as a function of retrieval practice repetition, meaning that with successive retrieval practice repetitions, to the extent that competitors are successfully weakened, there should be less retrieval competition and thus less ACC engagement. The right anterior VLPFC was also clearly sensitive to the weakening of competing memories, with this sensitivity potentially taking two forms. On the one hand, the right anterior VLPFC may serve to increase activation of target memories or features of target memories—a function Norman et al. (2007) ascribe to PFC—with this function maximally required when competition is highest. On the other hand—and in contrast to the role of the PFC suggested by Norman et al. (2007)—the right anterior VLPFC may serve to directly inhibit competing memories. While these possibilities are difficult to disambiguate, we later return to potential mechanistic contributions of the right anterior VLPFC.
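The normalized competitor-weakening measure described above, and its relationship to the drop in PFC engagement, can be sketched as follows. This is an illustration only: the participant values are hypothetical numbers invented for the example, the function names are assumptions, and a simple Pearson correlation stands in for the regression analysis reported in the study.

```python
import math


def suppression_score(nrp_accuracy, rp_minus_accuracy):
    """Relative competitor weakening: (NRP - RP-) / NRP."""
    return (nrp_accuracy - rp_minus_accuracy) / nrp_accuracy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical participants: (NRP accuracy, RP- accuracy, 1st minus 3rd practice signal).
participants = [
    (0.50, 0.38, 0.9),
    (0.60, 0.57, 0.2),
    (0.55, 0.44, 0.6),
    (0.40, 0.42, -0.1),
]

scores = [suppression_score(nrp, rp_minus) for nrp, rp_minus, _ in participants]
disengagement = [drop for _, _, drop in participants]
r = pearson_r(scores, disengagement)

if __name__ == "__main__":
    print([round(s, 2) for s in scores])
    print(round(r, 2))
```

Dividing by NRP accuracy expresses forgetting as a proportion of each participant's own baseline, so that participants with different overall recall levels can be compared. A negative score, as for the last hypothetical participant, means that competitors were recalled slightly better than baseline items.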
[Figure 30.5 panels: (A) ACC; (B) right anterior VLPFC. Recoverable axis labels: 1st − 3rd retrieval practice (integrated % signal change) versus suppression score (n = 19; r = 0.67* and r = 0.74*); integrated % signal change at 1st and 3rd retrieval by suppressor subgroup (High, Low).]
Figure 30.5 Neural activation reductions in the ACC (A) and the right anterior VLPFC (B) during retrieval practice correlated with a behavioral measure of retrieval-induced forgetting. Note. Subjects that showed the greatest magnitude of suppression (of RP− items) displayed the greatest reductions in anterior cingulate cortex and right anterior VLPFC activation during retrieval practice. From “Decreased Demands on Cognitive Control Reveal the Neural Processing Benefits of Forgetting,” by B. A. Kuhl, N. M. Dudukovic, I. Kahn, and A. D. Wagner, 2007, Nature Neuroscience, 10, p. 911. Reprinted with permission.

It remains ambiguous whether the right anterior VLPFC and ACC directly or indirectly contribute to the forgetting of competing memories; however, it might also be asked whether the forgetting observed by Kuhl et al. (2007) is best explained in terms of inhibition. As described previously, support for inhibition comes from evidence that retrieval-induced forgetting is cue-independent and competition-dependent. With respect to cue-independence, the key feature is that the forgetting of competing memories should not be accounted for in terms of strengthened, practiced memories interfering with RP− recall at test. Consistent with this prediction, the decreased engagement of the ACC and right anterior VLPFC across retrieval practice trials, which was correlated with RP− forgetting, was not correlated with RP+ strengthening, suggesting that it was, in fact, the weakening of competing memories, and not the strengthening of practiced memories, that reduced demands on these PFC subregions during retrieval practice. With respect to competition-dependence, it should be predicted that, if the ACC indexes competition, the initial engagement of the ACC should be a signal that competition is present, thereby triggering competitor inhibition. Indeed, those participants who showed the most retrieval-induced forgetting demonstrated greater initial engagement of the ACC during retrieval practice. In support of
the claim that the ACC was driven by mnemonic competition and that mnemonic competition triggered inhibition, it was also observed that initial hippocampal activation was positively correlated with both initial ACC activation and the magnitude of inhibition. Thus, engagement of the hippocampus likely reflected successful retrieval of both target and competing memories, with robust hippocampal activation perhaps signaling inefficient, competition-laden retrieval—a situation that triggers competitor inhibition. The competition-dependent role of the PFC in retrievalinduced forgetting is also supported by a recent event-related potential (ERP) study (Johansson, Aslan, Bäuml, Gabel, & Mecklinger, 2007). In this study, ERPs were compared during selective retrieval practice versus a control condition in which retrieval practice was replaced by extra study exposures. As noted earlier, behavioral data indicate that retrieval-induced forgetting is not observed when retrieval practice is replaced by extra study (M. C. Anderson et al., 2000), with the explanation being that extra study exposures are noncompetitive, or are at least much less competitive than retrieval practice. Replicating this dissociation, Johansson et al. (2007) observed that retrieval practice resulted in subsequently lower recall of competing memories than did extra study exposures. At the neural level, ERPs associated with retrieval practice were more positive-going than ERPs associated with extra study, with this
difference restricted to frontal electrode sites. Critically, the magnitude of this difference in frontal electrodes between retrieval practice and extra study was greater for participants who showed the most retrieval-induced forgetting, relative to those who showed the least retrieval-induced forgetting. These data are consistent with the theme that retrieval competition is associated with both the engagement of the PFC and the inhibition of competing memories. Moreover, these data, like those reported by Kuhl et al. (2007), demonstrate a coupling between the PFC’s response to competition and the inhibition of competing memories.

While these findings support the relationship between the PFC and retrieval-induced forgetting, several researchers have also attempted to establish whether intact PFC functioning is necessary for retrieval-induced forgetting to occur. For example, retrieval-induced forgetting has been probed both in patients with frontal lobe damage and in older adults (a population in which frontal lobe dysfunction is common, e.g., Moscovitch & Winocur, 1992; Raz et al., 1997). In one such study, Conway and Fthenaki (2003) compared patients with frontal lobe damage (either left or right lateral PFC damage) to control participants. Although the PFC has frequently been implicated in inhibitory control, the frontal patients displayed a normal pattern of retrieval-induced forgetting, relative to controls. From these data, Conway and Fthenaki argued that retrieval-induced forgetting reflects a form of unintentional inhibition and that intact PFC functioning is not necessary for producing such inhibition. Paralleling these findings, Aslan and colleagues (2007) observed normal retrieval-induced forgetting in older adults, compared with younger adults. Although these observations perhaps suggest that normal PFC functioning is not necessary for retrieval-induced forgetting to occur, there are several issues complicating this conclusion.
As discussed earlier, in a standard retrieval-induced forgetting study, inhibition is not the only potential cause of forgetting. That is, if the test procedure allows the strengthened, practiced memories (RP+) to interfere with retrieval of competing memories (RP−), then retrieval-induced forgetting can be explained simply in terms of retrieval competition that arises at test. Moreover, as Norman et al. (2007) note, the PFC may be particularly important for resolving competition during the test phase of a retrieval-induced forgetting study, given that retrieval cues are linked to multiple associates and damage to the PFC is known to increase sensitivity to retrieval competition. While the potential contamination of retrieval competition during the test phase can be eliminated if an independent-probe test procedure is used (M. C. Anderson & Spellman, 1995), Conway and Fthenaki (2003) did not use such a procedure. Thus, their observation of normal retrieval-induced forgetting
in frontal patients may simply reflect robust sensitivity to retrieval competition at test, as opposed to the actual inhibition of competing memories. Aslan and colleagues (2007), however, used both a standard test procedure and an independent probe test procedure, with older adults showing normal retrieval-induced forgetting in both cases. Although this result is intriguing, and, at first pass, may seem consistent with the hypothesis from Norman et al. (2007) that the PFC does not directly support inhibition, Aslan and colleagues did not explicitly address frontal lobe integrity among the older adults who were tested. Thus, it is unclear that these older adults suffered from any PFC dysfunction. Indeed, it is noteworthy that the older and younger adults tested by Aslan and colleagues demonstrated equivalent retrieval practice success. Given that the PFC is known to make necessary contributions to selective retrieval (Badre & Wagner, 2007; Dobbins & Wagner, 2005), this finding raises the possibility that the older adults tested may have had little, if any, PFC dysfunction.

Summary

Recent neurobiological evidence supports the claim that the PFC plays an important role in overcoming competition during selective retrieval and influencing what ultimately becomes inhibited. Moreover, the PFC is engaged in response to the presence of competition and the PFC directly benefits from the inhibition of competing memories. Thus, neurobiological evidence supports both behavioral evidence concerning when inhibition should occur (M. C. Anderson & Spellman, 1995; Levy & Anderson, 2002) and theoretical explanations of why inhibition is adaptive (M. C. Anderson, 2003; Bjork, 1989). Although progress in understanding the functional neurobiology of forgetting has been made, a fundamental ambiguity that awaits further clarification concerns the precise mechanism through which inhibition occurs.
As noted earlier, inhibition may occur because: (a) the PFC biases competition toward (selects) relevant memories (Miller & Cohen, 2001) and, as a consequence, competing memories are inhibited; or (b) the PFC directly weakens competing memories (Levy & Anderson, 2002). Disambiguating these possibilities is particularly difficult because both hypotheses predict that PFC function will be related to the phenomena of inhibition and selection. For example, if inhibition is a consequence of PFC selection, then damage to the PFC should impair the ability to select target memories and, as a consequence, competitors should not be inhibited. On the other hand, if the PFC directly contributes to inhibition as a means of facilitating selective retrieval (Levy & Anderson, 2002), then damage to the PFC should impair the ability to inhibit irrelevant memories, which should, as a consequence, compromise the ability to select target
memories. Thus, by either account, damage to the PFC should disrupt both the inhibition of irrelevant memories and the selection of relevant memories. One approach to distinguish between the mechanisms of selection and inhibition is to examine whether distinct PFC subregions contribute to each. As will be recalled from the previous section, mnemonic selection has repeatedly been associated with the left mid-VLPFC (Badre & Wagner, 2007). In the study by Kuhl et al. (2007), right anterior VLPFC engagement, but not left mid-VLPFC engagement, was correlated with the inhibition of competing memories. Although this may suggest a dissociation between the left mid-VLPFC (selection) and right anterior VLPFC (inhibition), there remain alternative explanations. For example, the right anterior VLPFC may support the allocation of attention toward properties of the retrieval cue (a form of attentional selection), which is particularly necessary when competition is high. By this view, the right anterior VLPFC would support a form of selection that is distinct from the selection supported by the left mid-VLPFC, but would not directly support inhibition. As we describe in the following sections, there is, in fact, some evidence in support of a selective-attention account of the right anterior VLPFC. However, given the limited data at present, mechanistic dissociations between the left mid-VLPFC and right anterior VLPFC remain tentative.

Stopping Retrieval

While attempts to remember a target memory can trigger inhibition of competing memories, it has also been argued that inhibition can occur as a result of deliberate attempts to forget something or even deliberate attempts to simply keep something out of mind. For example, Bjork (1970) describes the predicament of a short-order cook, for whom it is highly advantageous to forget an order once it is complete.
The advantage to forgetting a completed order, of course, is that it reduces confusion (proactive interference) when trying to remember a current order. The situation of deliberately trying to forget, or discard, something that has already been learned has been studied using the directed forgetting paradigm. Directed forgetting studies are generally divided into two main classes. In the first type, the list method, there are typically two lists of stimuli, studied one after the other. In some cases, or for some participants, there is an instruction immediately following list 1 learning (and before list 2 learning) to forget the entire list that was just studied. After list 2 learning, memory is assessed for both list 2 items and list 1 items. When participants are instructed to forget list 1, there are typically two results of interest, relative to when a forget instruction was not issued: (1) recall of list 1 items is worse, and (2) recall of list 2 items is better.
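The two list-method outcomes just described can be written as simple differences between the forget and remember conditions. The following is only a minimal sketch: the function name `directed_forgetting_effects` and the recall proportions are hypothetical values chosen for illustration, not data from any study cited in this chapter.

```python
def directed_forgetting_effects(forget, remember):
    """Return (cost, benefit) of a forget instruction in the list method.

    Each argument maps "list1" and "list2" to the proportion of items recalled.
    cost    = drop in list 1 recall after a forget instruction
    benefit = gain in list 2 recall after a forget instruction
    """
    cost = remember["list1"] - forget["list1"]
    benefit = forget["list2"] - remember["list2"]
    return cost, benefit


# Hypothetical proportions recalled (illustrative only)
forget_group = {"list1": 0.40, "list2": 0.65}
remember_group = {"list1": 0.55, "list2": 0.50}

cost, benefit = directed_forgetting_effects(forget_group, remember_group)
print(f"cost = {cost:.2f}, benefit = {benefit:.2f}")
```

A positive cost corresponds to result (1), worse recall of list 1, and a positive benefit corresponds to result (2), better recall of list 2.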
The impaired recall of list 1 items suggests that recall for already learned material can be volitionally influenced, whereas the improved recall of list 2 items suggests that proactive interference can be reduced, as would be the goal of the short-order cook. The second procedure used in directed forgetting studies is the item method, in which individual items (e.g., single words) are presented one at a time, followed by an instruction to either remember or forget the item. Importantly, the remember/forget instruction typically does not appear until after the relevant item has disappeared, thus ensuring that the item is at least initially encoded. In item method directed forgetting studies, forget items are, again, more poorly recalled than remember items. Although the two methods of directed forgetting are seemingly similar, the forgetting that is observed (of forget items) may have different causes. In the item method, evidence suggests that remember items benefit from preferential encoding, relative to forget items, with inhibition thought to play little role (Basden, Basden, & Gargano, 1993). In other words, remember items likely benefit from the remember instruction, but it is not clear that forget items actually suffer from the forget instruction. In the list method, however, preferential encoding in the control condition does not seem to account for the forgetting of list 1 items in the forget condition (Basden et al., 1993; Geiselman, Bjork, & Fishman, 1983). Rather, forgetting in the list method following a forget instruction has been explained in terms of either inhibition (e.g., Bjork, 1989) or an internal context change in response to the forget instruction (Sahakyan & Kelley, 2002). Although these two accounts of list-method directed forgetting are not mutually exclusive, the contextual change account has proven to hold substantial explanatory power (Sahakyan, Delaney, & Waldum, 2008). 
Although directed forgetting has received considerable attention given its potential application to the control of real-life memories, the mechanistic ambiguity concerning the phenomenon creates challenges for studying memory inhibition. By contrast, a more recently developed paradigm, the Think/No-Think paradigm (M. C. Anderson & Green, 2001), has allowed for a more direct assessment of the relationship between memory control and inhibition. In the Think/No-Think paradigm, participants first study a series of cue-associate pairs (e.g., “ordeal-roach,” “journey-pants”) and are trained to recall the associate word (e.g., “roach”) when presented with the cue (e.g., “ordeal”; Figure 30.6A). Next, participants engage in the Think/No-Think phase, in which they are presented with cues (the left-hand member of a cue-associate pair; e.g., “ordeal-”) from some of the previously studied pairs. For some of these cues, participants are instructed to retrieve (Think) the corresponding associate. For other cues, participants are instructed to prevent the corresponding associate from
entering awareness (No-Think). Critically, participants are instructed that it is not enough to simply withhold a response on No-Think trials; rather, they are instructed to do their best to completely avoid thinking of the associate. Think and No-Think cues are repeated a varying number of times (e.g., 0, 1, 8, or 16 repetitions for each cue). Importantly, some of the cues never appear in the Think/No-Think phase, functioning as baseline items (0 repetitions). Finally, recall of all associates (Think, No-Think, Baseline) is tested.

[Figure 30.6: Panel (A) diagrams the Study, TNT, and Test phases for the No-Think, Think, and Baseline conditions, using the example pairs Ordeal-Roach, Steam-Train, and Jaw-Gum, with Same Probe cues (Ordeal, Steam, Jaw) and Independent Probe cues (Insect r, Vehicle t, Candy g) at test. Panel (B) plots Percent Recalled (axis range 80 to 95) against Number of Repetitions (0, 1, 8, 16) for Think and No-Think items, separately for the Same Probe and Independent Probe tests.]

Figure 30.6 A: Outline of the Think/No-Think paradigm. B: Recall performance at test for “Think” items increases as a function of “Think” repetitions. Note. (A) Critically, during the Think/No-Think phase, subjects are cued to think of the corresponding associate for “Think” items, but to avoid thinking of the response for “No-Think” items. Adapted from M. C. Anderson et al. (2004). (B) Recall performance for “No-Think” items decreases as a function of “No-Think” repetitions. This pattern is apparent both in the Same Probe and Independent Probe tests. From “Suppressing Unwanted Memories by Executive Control,” by M. C. Anderson and C. Green, 2001, Nature, 410, pp. 366–369. Copyright 2001 by Macmillan Publishers. Adapted with permission.

Cued recall reveals that Think items are, not surprisingly, better remembered than No-Think items (Figure 30.6B). This result is essentially equivalent to the comparison of remember versus forget conditions in an item-method directed forgetting study. However, to identify whether No-Think items actually suffered a cost, test-phase recall performance for No-Think items is compared to recall of Baseline items (i.e., items that were initially studied and trained, but that did not appear during the Think/No-Think phase). The Baseline condition provides a critical comparison condition (one that is not present in directed forgetting studies) for assessing whether No-Think items actually suffer a cost. That is, if No-Think instructions impair memory for No-Think items, then these items should be more poorly recalled than Baseline items. This is what is typically observed, with memory for No-Think items decreasing as a function of the number of No-Think repetitions that an item received (Figure 30.6B; M. C. Anderson & Green, 2001). Although a cued-recall impairment for No-Think items relative to Baseline items is suggestive of memory inhibition, it is alternatively possible that the impaired recall of No-Think items simply reflects retrieval competition (interference) that arises at test, a concern that we considered above with respect to retrieval-induced forgetting. In
retrieval-induced forgetting, the concern over an interference explanation is perhaps more obvious, given that retrieval practice involves strengthening RP+ items that share a retrieval cue with RP− items. In the Think/No-Think paradigm, however, it is possible that when presented with No-Think trials, participants direct their thoughts away from the trained associate by thinking of something else; with repetition, this new, self-generated associate may be strengthened, relative to the originally trained associate. Accordingly, at test, the cue may elicit this self-generated memory, which would interfere with target recall. As with retrieval-induced forgetting, the independent probe technique has been applied to the Think/No-Think paradigm in order to establish whether inhibition has actually occurred. For example, rather than testing the associate “roach” with the original cue “ordeal,” a new, independent probe such as “Insect-r” can be used. Critically, below-baseline forgetting of No-Think items is evident when independent probes are used at test (Figure 30.6; M. C. Anderson & Green, 2001; M. C. Anderson et al., 2004). Thus, although the Think/No-Think paradigm bears a procedural similarity to directed forgetting, the forgetting observed in the Think/No-Think paradigm has a clearer mechanistic cause: namely, deliberate attempts to keep a memory out of mind when presented with a reminder can result in inhibition of that memory. It should be noted, however, that, to the extent that participants approach No-Think trials in the Think/No-Think paradigm by actively remembering something else, the Think/No-Think paradigm and retrieval-induced forgetting may reduce to a common phenomenon. Consistent with this view, it has been demonstrated that inhibition in the Think/No-Think paradigm is most likely to occur when participants approach No-Think trials by generating diversionary thoughts (Hertel & Calcaterra, 2005).
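The comparison against Baseline described above amounts to a subtraction carried out separately for each type of test probe. The sketch below is illustrative only: the function name `below_baseline_forgetting` and all recall proportions are hypothetical and are not taken from M. C. Anderson and Green (2001) or any other cited study.

```python
def below_baseline_forgetting(recall):
    """Return Baseline minus No-Think recall for each probe type.

    `recall` maps a probe type to a dict of proportions recalled per condition.
    A positive value means No-Think items were recalled worse than Baseline items.
    """
    return {
        probe: conditions["baseline"] - conditions["no_think"]
        for probe, conditions in recall.items()
    }


# Hypothetical proportions recalled at test (illustrative only)
recall = {
    "same_probe": {"think": 0.95, "baseline": 0.85, "no_think": 0.78},
    "independent_probe": {"think": 0.90, "baseline": 0.84, "no_think": 0.79},
}

effects = below_baseline_forgetting(recall)
for probe, effect in effects.items():
    print(f"{probe}: below-baseline forgetting = {effect:.2f}")
```

Under an inhibition account, the difference should be positive for the independent probe as well as for the same probe, because interference tied to the original cue cannot explain a deficit when a new cue is used.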
Neurobiological Mechanisms of Stopping Retrieval

Given that the directed forgetting paradigm was developed three decades prior to the Think/No-Think paradigm, there have been considerably more attempts to understand the neurobiological basis of directed forgetting than of inhibition in Think/No-Think. However, because the mechanisms of directed forgetting are more ambiguous, we briefly review several examples of the potential relationship between the PFC and directed forgetting before more fully considering the role of the PFC in the context of Think/No-Think paradigms. Zacks, Radvansky, and Hasher (1996) compared directed forgetting between older and younger adults across multiple experiments, using both item- and list-method procedures, and consistently observed that older adults were poorer at directed forgetting than young adults, consistent with the hypothesis of an inhibitory deficit associated with aging. Specifically, relative to their baseline retrieval rate, older adults were more likely than young adults to retrieve items that had previously received a forget instruction. Although suggestive of an inhibitory deficit, there are, as the authors note, alternative accounts. For example, older adults might have been poorer at encoding remember/forget instructions, and/or as the experiment progressed, older adults may have had greater difficulty keeping track of which items were supposed to be remembered versus forgotten, potentially leading to inadvertent rehearsal of forget items. Thus, while the impairment of older adults in this context is in contrast to normal retrieval-induced forgetting among older adults (Aslan et al., 2007), it is not clear what accounts for this dissociation.

Directed forgetting has also been examined in frontal patients, but with somewhat variable results.
For example, Conway and Fthenaki (2003) found impaired directed forgetting among frontal patients using both list and item method designs, with the impairment restricted to those patients with right frontal damage. However, Andrés, Van der Linden, and Parmentier (2007) reported normal item-method directed forgetting among frontal patients. Unfortunately, given the variability in the size and location of lesions across these studies, it is difficult to reconcile the discrepancies in the data or to draw conclusions about the mechanisms involved. Rather, it seems that, as with aging, frontal lobe damage may, at least in some cases, disrupt directed forgetting. Finally, item-method directed forgetting has also been assessed using both ERPs and fMRI. In an ERP study, Paz-Caballero, Menor, and Jimenez (2004) observed an early (100 to 200 ms) frontal positivity for forget instructions, relative to remember instructions, that was only observed for those participants that showed a high amount of directed forgetting. The authors suggest that this frontal
positivity may reflect the engagement of the PFC in order to inhibit or stop processing of forget items. Although this interpretation involves an inhibitory component, it does not demand that forget items themselves are inhibited; rather, it could simply be that the processing of forget items is discontinued. Thus, this interpretation is compatible with the argument that item-method directed forgetting reflects preferential encoding of remember items. By contrast, Wylie, Foxe, and Taylor (2008) used an item-method directed forgetting paradigm with fMRI and found that the right anterior VLPFC was more active for forget items that were actually forgotten, whereas a reverse pattern was observed for items that received a remember instruction. The authors argue that the positive relationship between the right anterior VLPFC and the forgetting of forget items suggests an active mechanism of forgetting, challenging the selective rehearsal account of item-method directed forgetting. As previously described, activation in the right anterior VLPFC was also correlated with the forgetting of competing memories in the context of retrieval-induced forgetting (Kuhl et al., 2007), perhaps suggesting a common mechanistic contribution across these two contexts. While the discussed ERP and fMRI studies of item-method directed forgetting suggest an active mechanism is involved in stopping retrieval and, potentially, in inhibiting competing memories, these possibilities have been more directly assessed in a pair of fMRI studies using the Think/No-Think paradigm. These studies used emotionally neutral word pairs (M. C. Anderson et al., 2004) or emotionally valenced images (Depue, Curran, & Banich, 2007), and yielded several convergent outcomes. A key theoretical claim of M. C. Anderson’s is that inhibition reflects the engagement of active control processes supported by the PFC (Levy & Anderson, 2002).
Thus, in each study it was predicted that No-Think trials would not simply reflect the failure to engage retrieval mechanisms, but rather that No-Think trials would engage PFC control mechanisms to a greater extent than Think trials. M. C. Anderson et al. (2004) observed greater activation during No-Think versus Think trials in several PFC subregions, including bilateral DLPFC, VLPFC (inclusive of right anterior VLPFC), and ACC. In contrast, Think trials were associated with greater activation in the hippocampus—consistent with the role of the hippocampus in retrieving episodic memories (e.g., Eldridge, Knowlton, Furmanski, Bookheimer, & Engel, 2000; Kirwan & Stark, 2004). Similarly, Depue et al. (2007) observed greater No-Think than Think activation in the right DLPFC, right frontopolar cortex, and right anterior VLPFC; greater Think than No-Think activation was again observed in the hippocampus. Thus, with respect to the contrast of No-Think versus Think, both studies revealed activation in the right DLPFC and right anterior VLPFC.
[Figure 30.7: Panel (A) is a brain image (left hemisphere marked L; slice labeled +28). Panel (B) plots Percent Recalled (axis range 75 to 100) for No-Think and Think items across DLPFC Activation Quartile (1st to 4th; No-Think > Think).]

Figure 30.7 A: Activation in the bilateral DLPFC during “No-Think” trials was positively correlated with the inhibition score (i.e., forgetting of No-Think items at test). B: Participants who displayed the greatest DLPFC activation during No-Think trials were characterized by lower recall accuracy for No-Think items at test, relative to Baseline items. Note. From “Neural Systems Underlying the Suppression of Unwanted Memories,” by M. C. Anderson et al., 2004, Science, 303, pp. 232–235. Copyright 2004 by the American Association for the Advancement of Science. Adapted with permission.
Strikingly, both studies also found that the engagement of the PFC during No-Think trials was related to the fate of the to-be-avoided memories. Specifically, the magnitude of activation in the DLPFC (bilateral in M. C. Anderson et al., 2004; Figure 30.7; right lateralized in Depue et al., 2007) positively correlated with the magnitude of inhibition (forgetting) of No-Think items. These data indicate that the DLPFC is recruited during attempts to stop retrieval, and that this recruitment is associated with a cost for those memories that are avoided. Within the hippocampus, an intriguing pattern of data was observed. During Think trials, both Depue et al. (2007) and M. C. Anderson et al. (2004) observed that the hippocampus tended to be more active for items that were later remembered, relative to those that were later forgotten. By contrast, during No-Think trials, M. C. Anderson et al. (2004) reported a trend toward greater hippocampal activation for No-Think items that were later forgotten. This finding of greater hippocampal activation for No-Think items later forgotten compared to those later remembered was particularly robust among those participants who exhibited the most inhibition. If hippocampal activation is typically associated with remembering, then why is greater hippocampal activation on No-Think trials associated with forgetting? M. C. Anderson et al. suggest that such activation may reflect momentary intrusions of the to-be-avoided memories, noting that hippocampal activation during No-Think trials was also correlated with DLPFC engagement. Consistent with this interpretation, Depue et al. (2007) reported that hippocampal activation tended to
decrease across repetitions of No-Think items (presumably reflecting a practice-related decrease in intrusions), but increased across repetitions of Think trials. Moreover, this decrease in hippocampal activation across No-Think repetitions was apparent to a greater degree for the items later forgotten than those later remembered. Together, these data suggest that hippocampal activation during No-Think trials may reflect inadvertent remembering, thereby triggering DLPFC-mediated control that results in the eventual inhibition of intruding memories. As such, these data are consistent with the competition-dependent property of retrieval-induced forgetting (i.e., that competition triggers inhibition) and are potentially compatible with the observation by Kuhl et al. (2007) that greater hippocampal activation during initial retrieval practice attempts was associated with greater inhibition of competing memories. While M. C. Anderson et al. (2004) and Depue et al. (2007) found compelling evidence that the DLPFC was related to memory inhibition in Think/No-Think paradigms, Kuhl et al. (2007) observed a relationship between the right anterior VLPFC (and ACC) and memory inhibition in retrieval-induced forgetting. Although this apparent discrepancy in the foci of lateral PFC activations may, at first pass, suggest different mechanisms of inhibition in the two paradigms, there is a notable difference in the analyses reported by Kuhl et al. (2007) and those reported by M. C. Anderson et al. (2004) and Depue et al. (2007). Specifically, M. C. Anderson et al. and Depue et al. found that DLPFC activation, collapsed across all No-Think repetitions,
predicted memory inhibition, whereas Kuhl et al. (2007) found that changes in right anterior VLPFC activation (i.e., repetition-related reductions) predicted memory inhibition. Although M. C. Anderson and colleagues (2004) did not consider their data as a function of repetition, Depue and colleagues (2007) separately considered activation in each of four quartiles (each quartile contained three repetitions of Think/No-Think items). Importantly, right anterior VLPFC activation during No-Think trials tended to decrease across repetitions; indeed, this region was engaged, above Baseline, only during No-Think trials in the first two quartiles. Although Depue et al. did not report whether the magnitude of this decrease was related to the magnitude of inhibition, the data are at least consistent with the view that right anterior VLPFC engagement is decreasingly necessary as No-Think items are inhibited. By contrast, right DLPFC activation did not decrease across quartiles; in fact, right DLPFC only displayed above-Baseline activation during No-Think trials in the last three quartiles. Moreover, a negative correlation was observed between DLPFC and hippocampal activation that was maximal during the last quartile. Thus, while DLPFC activation was correlated with memory inhibition and hippocampal activation, the temporal profile of DLPFC activation raises interesting questions about its mechanistic contribution. If intrusions during No-Think trials are most likely to occur during initial No-Think attempts, and these intrusions trigger DLPFC-mediated inhibition, as argued by M. C. Anderson and colleagues (2004), then why is the DLPFC most active during later repetitions, relative to initial repetitions? Moreover, why are hippocampal and DLPFC activation uncorrelated during initial No-Think repetitions (when intrusions are presumably highest), but strongly negatively correlated during late repetitions (when intrusions are presumably low)?
These two aspects of the data seem to indicate that DLPFC engagement is highest when the demand for inhibition is actually lowest. Although not discussed by Depue and colleagues (2007), perhaps the increase in DLPFC activation across repetitions, and the increasingly negative relationship between the DLPFC and the hippocampus, reflects a practice-related improvement in the ability to engage the DLPFC. That is, during initial No-Think attempts, there may be a failure to engage the DLPFC to inhibit No-Think items; with practice, the DLPFC is successfully recruited and this is reflected in the down-regulation of the hippocampus. Importantly, this view suggests that DLPFC engagement is not an obligatory response to competition, but may be flexibly engaged to regulate competition. Regardless of why DLPFC engagement onsets later than the right anterior VLPFC, the dissociation between
these regions is intriguing, particularly in light of evidence implicating each of these regions in other contexts that putatively involve memory inhibition. Moreover, it is also of note that the left mid-VLPFC, which has repeatedly been implicated in resolving mnemonic competition (for review, see Badre & Wagner, 2007), has not been implicated in the inhibition of episodic memories using either retrieval-induced forgetting or Think/No-Think paradigms. Although additional work is clearly necessary in order to better elucidate the relationship between these various PFC control mechanisms and the mechanisms of selective retrieval and inhibition, in the next section we attempt to synthesize the evidence reviewed thus far, situating this evidence in the broader context of how the PFC contributes to selective attention and goal-oriented behavior.
PREFRONTAL CORTEX CONTRIBUTIONS TO RETRIEVAL AND FORGETTING
Although our treatment of forgetting is grouped into two main themes, interference and inhibition, it should be clear that these are not two independent causes of forgetting. Rather, the presence of competition can directly interfere with retrieval, thereby causing forgetting, but competition can also trigger the inhibition of competing memories, again contributing to forgetting. In other words, both forms of forgetting are ultimately related to the presence of competition and the mechanisms through which competition is resolved. Understanding the way in which competition is resolved is not, of course, a question that is specific to the domain of memory, as several influential models of PFC function are principally focused on mechanisms of competition resolution (e.g., Desimone & Duncan, 1995; Miller & Cohen, 2001; Shimamura, 2000). Thus, understanding the control processes that guide retrieval and forgetting should benefit from a consideration of the ways in which the PFC guides attention and goal-directed behavior. In this final section, we briefly consider how attention and cognitive control may be implemented through coordinated, but distinct, contributions from the ACC, DLPFC, and VLPFC. By some accounts, attentional control may be implemented via two distinct fronto-parietal networks (Corbetta & Shulman, 2002). At a first level, attention-grabbing changes in sensory stimuli, across multiple modalities, tend to activate a network of ventral fronto-parietal regions, with the right VLPFC perhaps the most frequently activated PFC subregion (e.g., Downar, Crawley, Mikulis, & Davis, 2000; for review, see Corbetta & Shulman, 2002). For example, right anterior VLPFC activation has been associated with the reorienting of attention in response to, and in order to overcome,
distraction (Weissman, Roberts, Visscher, & Woldorff, 2006). This ventral fronto-parietal attentional system has been dissociated from a dorsal fronto-parietal system that is thought to support top-down orienting of attention, perhaps integrating bottom-up inputs with attentional task sets (Corbetta & Shulman, 2002). Although the frontal component of this dorsal system most frequently involves the frontal eye fields, the DLPFC may also be a component of this same system, particularly when considering attentional control outside the domain of visual attention (e.g., Luks, Simpson, Dale, & Hough, 2007). For example, in a now classic study, the role of the DLPFC in implementing control in a modified Stroop task was contrasted with that of the ACC (MacDonald et al., 2000). Critically, during task preparation, the DLPFC, but not the ACC, was modulated by the task instruction. During the trial itself, the ACC, but not the DLPFC, was modulated by the level of conflict (greater ACC engagement for incongruent versus congruent trials). Conceptually similar dissociations between the DLPFC and ACC have since been reported (e.g., Weissman, Warner, & Woldorff, 2004), and from these and other observations, it has been argued that the DLPFC supports the top-down implementation of control. Thus, with respect to attentional control, the VLPFC appears to be engaged in response to distracting or unexpected stimuli or events and serves to reorient attention. In a complementary manner, the DLPFC appears to play a critical role in volitionally engaging attention; this top-down allocation of attention may occur in preparation for a demanding cognitive task, but may also occur during task execution, to the extent that attended information interacts with task goals (Corbetta & Shulman, 2002). Distinctions between the VLPFC and DLPFC have also been drawn in other domains, where a putatively hierarchical relationship between the VLPFC and DLPFC has often been emphasized.
For example, with respect to the use of rules, it has been argued that the VLPFC supports the retrieval and maintenance of task rules, whereas the DLPFC may support flexible rule use or rule selection (for review, see Bunge, 2004). This view is supported by evidence that the VLPFC tends to be continuously engaged during rule maintenance, whereas the DLPFC tends to be engaged in preparation for a response (Bunge, 2004). Within the context of working memory paradigms, the DLPFC has frequently been implicated in response selection and top-down control, as opposed to simply maintaining information (e.g., Rowe, Toni, Josephs, Frackowiak, & Passingham, 2000; for review, see Curtis & D’Esposito, 2003). The higher-order role of the DLPFC in working memory has been contrasted with the role of the VLPFC, which is thought to support retrieval or simple maintenance of information (D’Esposito et al., 1998; D’Esposito, Postle,
Ballard, & Lease, 1999; Petrides, 1996). For example, the VLPFC is engaged during rote rehearsal and during elaborative rehearsal that requires the manipulation or updating of working memory contents, whereas the DLPFC is selectively engaged by elaborative rehearsal (Wagner, Maril, Bjork, & Schacter, 2001). Moreover, DLPFC activation may lag VLPFC activation, consistent with the idea that the DLPFC operates on the products of information maintained/retrieved by VLPFC (Wagner et al., 2001). Similarly, within episodic memory, the VLPFC has been implicated in maintaining and elaborating on retrieval cues, whereas DLPFC has been implicated in monitoring the products of retrieval and their relation to decision rules (Dobbins et al., 2002; Dobbins & Wagner, 2005). Returning to the theme of this chapter, a central question is how do selective retrieval and forgetting relate to these PFC processing distinctions? As we review, retrieval competition has been associated with the engagement of the VLPFC—both the left mid-VLPFC (Badre & Wagner, 2007; Thompson-Schill et al., 1997) and the right anterior VLPFC (Kuhl et al., 2007). However, the VLPFC has also been implicated in stopping retrieval (M. C. Anderson et al., 2004; Depue et al., 2007; Wylie et al., 2008), suggesting that VLPFC is engaged in response to competition from irrelevant memories, rather than remembering, per se. Indeed, it is a critical point that VLPFC engagement appears to be more tightly coupled with retrieval competition than with the actual phenomenon of retrieval. For example, repeated successful retrieval of the same information—which is associated with behavioral facilitation—is associated with robust decreases in the engagement of the bilateral VLPFC, but relatively little modulation of the DLPFC; in contrast, the actual phenomenon of retrieval success is associated with robust engagement of the DLPFC, but more limited activation of the VLPFC (Kuhl et al., 2007). 
Similarly, when task demands explicitly require stopping the act of retrieval, the right anterior VLPFC is engaged during initial attempts, but is less engaged with practice, presumably reflecting decreasing competition from to-be-avoided memories; DLPFC engagement, on the other hand, does not decrease across repeated attempts to stop retrieval, and may even tend to increase (Depue et al., 2007). These dissociations between the VLPFC and DLPFC are potentially compatible with dual-system theories of attention (Corbetta & Shulman, 2002). As discussed, the VLPFC is thought to support reflexive orienting to distracting stimuli. Compatible with this perspective, in the context of mnemonic control, competing memories may serve as distracting representations that help reorient attention via VLPFC engagement. The DLPFC, on the other hand, may support the top-down allocation of attention. In situations of mnemonic control, it may be that the DLPFC
is not directly engaged in response to mnemonic competition, but rather is engaged to help bias mnemonic processing such that mnemonic goals are achieved. For example, the DLPFC may evaluate retrieval products with respect to task goals (Dobbins et al., 2002; Henson, Rugg, Shallice, & Dolan, 2000), and may therefore be sensitive to retrieval success. Alternatively, or additionally, the DLPFC may implement attentional biases that, once in place, effectively reduce mnemonic competition. Although a distinction between the VLPFC and DLPFC based on reflexive versus top-down control, respectively, may hold some explanatory power, it should be noted that the left VLPFC has also been implicated in implementing top-down control during retrieval (e.g., Badre et al., 2005). Thus, further evidence is necessary in order to better specify the mechanistic distinctions between VLPFC and DLPFC control processes and their relation to mnemonic processing. Although the distinction between the VLPFC and DLPFC has been of particular interest in theories of PFC-mediated control, it is worth emphasizing that these regions (a) act in concert with other prefrontal structures (e.g., ACC and frontopolar cortex), and (b) can likely be further subdivided into distinct functional units. With respect to other PFC control mechanisms, the ACC may support an initial component of cognitive control, in that it can detect competition between multiple, coactive representations (Botvinick et al., 2001; Braver et al., 2001; van Veen & Carter, 2002). Importantly, ACC engagement has frequently been shown to correlate with DLPFC engagement (Badre & Wagner, 2004; Bunge, Burrows, & Wagner, 2004; Kondo, Osaka, & Osaka, 2004), leading to the hypothesis that ACC-mediated competition detection triggers DLPFC-mediated control.
Such couplings have been observed in the context of competitive remembering (Bunge et al., 2004; Kuhl et al., 2007), with one possibility being that the computation performed by the DLPFC, in response to ACC signaling, is to increase activation of goal-relevant memories (Miller & Cohen, 2001). The frontopolar cortex, on the other hand, may be situated at the top of the PFC processing hierarchy (Koechlin & Summerfield, 2007), coordinating VLPFC/DLPFC operations with specific subgoals (Braver & Bongiolatti, 2002). Consistent with a supervisory role of the frontopolar cortex, initial attempts to stop retrieval result in coupled activation between the right anterior VLPFC and frontopolar cortex, whereas later attempts are associated with coupling between the DLPFC and frontopolar cortex (Depue et al., 2007). Finally, while the organizing principles of the VLPFC and DLPFC that we consider here may be useful in terms of constraining hypotheses of how the PFC implements control, both the VLPFC and DLPFC can likely be further decomposed into distinct functional units (e.g., Badre
et al., 2005; Dobbins et al., 2002; Gold et al., 2006). For example, within the VLPFC, the left mid-VLPFC has been implicated in selecting between multiple, active representations, whereas the left anterior VLPFC has been implicated in controlled retrieval of semantic information through direct interaction with posterior semantic stores (for review, see Badre & Wagner, 2007). In other words, there are likely multiple ways in which the VLPFC responds to competition and multiple ways in which the DLPFC coordinates mnemonic processing. Future work will undoubtedly advance understanding of both the specific mechanisms supported by PFC subregions as well as the way in which these mechanisms act in concert such that mnemonic competition is resolved.
SUMMARY
In this chapter, we highlighted the interrelated nature of remembering and forgetting, and the substantial impact that prefrontal function has on each. Specifically, the prefrontal cortex serves to guide retrieval toward goal-relevant memories and away from those memories that prove irrelevant. These prefrontal-mediated operations have important consequences both for what we presently remember as well as what we later forget. Moreover, multiple, functionally distinct prefrontal subregions are involved in coordinating these mnemonic operations, likely reflecting the engagement of broader cognitive control mechanisms that allow for the flexible allocation of attention. Accordingly, a complete telling of the story of forgetting and remembering will ultimately entail full specification of the many ways in which the frontal lobes shape acts of retrieval.
REFERENCES
Anderson, J. R. (1974). Retrieval of propositional information from long-term memory. Cognitive Psychology, 6, 451–474. Anderson, J. R. (1976). Language, memory, and thought. Hillsdale, NJ: Erlbaum. Anderson, J. R. (1983a, April 1). Retrieval of information from long-term memory. Science, 220, 25–30. Anderson, J. R. (1983b). A spreading activation theory of memory. Journal of Verbal Learning and Verbal Behavior, 22, 261–295. Anderson, M. C. (2003). Rethinking interference theory: Executive control and the mechanisms of forgetting. Journal of Memory and Language, 49, 415–445. Anderson, M. C., Bjork, E. L., & Bjork, R. A. (2000). Retrieval-induced forgetting: Evidence for a recall-specific mechanism. Psychonomic Bulletin and Review, 7, 522–530. Anderson, M. C., Bjork, R. A., & Bjork, E. L. (1994). Remembering can cause forgetting: Retrieval dynamics in long-term memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 1063–1087.
Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18, 193–222.
Anderson, M. C., & Green, C. (2001, March 15). Suppressing unwanted memories by executive control. Nature, 410, 366–369.
D’Esposito, M., Aguirre, G. K., Zarahn, E., Ballard, D., Shin, R. K., & Lease, J. (1998). Functional MRI studies of spatial and nonspatial working memory. Cognitive Brain Research, 7, 1–13.
Anderson, M. C., Ochsner, K. N., Kuhl, B., Cooper, J., Robertson, E., Gabrieli, S. W., et al. (2004, January 9). Neural systems underlying the suppression of unwanted memories. Science, 303, 232–235. Anderson, M. C., & Spellman, B. A. (1995). On the status of inhibitory mechanisms in cognition: Memory retrieval as a model case. Psychological Review, 102, 68–100. Andrés, P., Van der Linden, M., & Parmentier, F. B. (2007). Directed forgetting in frontal patients’ episodic recall. Neuropsychologia, 45, 1355–1362. Aslan, A., Bäuml, K. H., & Pastotter, B. (2007). No inhibitory deficit in older adults’ episodic memory. Psychological Science, 18, 72–78. Badre, D., Poldrack, R. A., Pare-Blagoev, E. J., Insler, R. Z., & Wagner, A. D. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47, 907–918. Badre, D., & Wagner, A. D. (2004). Selection, integration, and conflict monitoring: Assessing the nature and generality of prefrontal cognitive control mechanisms. Neuron, 41, 473–487. Badre, D., & Wagner, A. D. (2007). Left ventrolateral prefrontal cortex and the cognitive control of memory. Neuropsychologia, 45, 2883–2901. Basden, B. H., Basden, D. R., & Gargano, G. J. (1993). Directed forgetting in implicit and explicit memory tests: A comparison of methods. Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, 603–616. Bäuml, K. H. (1998). Strong items get suppressed, weak items do not: The role of item strength in output interference. Psychonomic Bulletin and Review, 5, 459–463. Bjork, R. A. (1970). Positive forgetting: The noninterference of items intentionally forgotten. Journal of Verbal Learning and Verbal Behavior, 9, 255–268. Bjork, R. A. (1989). Retrieval inhibition as an adaptive mechanism in human memory. Hillsdale, NJ, and England: Erlbaum. Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. 
Psychological Review, 108, 624–652. Braver, T. S., Barch, D. M., Gray, J. R., Molfese, D. L., & Snyder, A. (2001). Anterior cingulate cortex and response conflict: Effects of frequency, inhibition and errors. Cerebral Cortex, 11, 825–836. Braver, T. S., & Bongiolatti, S. R. (2002). The role of frontopolar cortex in subgoal processing during working memory. NeuroImage, 15, 523–536. Bunge, S. A. (2004). How we use rules to select actions: A review of evidence from cognitive neuroscience. Cognitive, Affective, and Behavioral Neuroscience, 4, 564–579. Bunge, S. A., Burrows, B., & Wagner, A. D. (2004). Prefrontal and hippocampal contributions to visual associative recognition: Interactions between cognitive control and episodic retrieval. Brain and Cognition, 56, 141–152. Conway, M. A., & Fthenaki, A. (2003). Disruption of inhibitory control of memory following lesions to the frontal and temporal lobes. Cortex, 39, 667–686. Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3, 201–215. Curtis, C. E., & D’Esposito, M. (2003). Persistent activity in the prefrontal cortex during working memory. Trends in Cognitive Sciences, 7, 415–423. Depue, B. E., Curran, T., & Banich, M. T. (2007, July 13). Prefrontal regions orchestrate suppression of emotional memories via a two-phase process. Science, 317, 215–219.
D’Esposito, M., Postle, B. R., Ballard, D., & Lease, J. (1999). Maintenance versus manipulation of information held in working memory: An event-related fMRI study. Brain and Cognition, 41, 66–86. Dobbins, I. G., Foley, H., Schacter, D. L., & Wagner, A. D. (2002). Executive control during episodic retrieval: Multiple prefrontal processes subserve source memory. Neuron, 35, 989–996. Dobbins, I. G., & Wagner, A. D. (2005). Domain-general and domain-sensitive prefrontal mechanisms for recollecting events and detecting novelty. Cerebral Cortex, 15, 1768–1778. Dolan, R. J., & Fletcher, P. C. (1997, August 7). Dissociating prefrontal and hippocampal function in episodic memory encoding. Nature, 388, 582–585. Downar, J., Crawley, A. P., Mikulis, D. J., & Davis, K. D. (2000). A multimodal cortical network for the detection of changes in the sensory environment. Nature Neuroscience, 3, 277–283. Eldridge, L. L., Knowlton, B. J., Furmanski, C. S., Bookheimer, S. Y., & Engel, S. A. (2000). Remembering episodes: A selective role for the hippocampus during retrieval. Nature Neuroscience, 3, 1149–1152. Estes, W. K. (1955). Statistical theory of spontaneous recovery and regression. Psychological Review, 62, 145–154. Feredoes, E., Tononi, G., & Postle, B. R. (2006). Direct evidence for a prefrontal contribution to the control of proactive interference in verbal working memory. Proceedings of the National Academy of Sciences, USA, 103, 19530–19534. Fletcher, P. C., Shallice, T., & Dolan, R. J. (2000). “Sculpting the response space”: An account of left prefrontal activation at encoding. Neuroimage, 12, 404–417. Geiselman, R. E., Bjork, R. A., & Fishman, D. L. (1983). Disrupted retrieval in directed forgetting: A link with posthypnotic amnesia. Journal of Experimental Psychology: General, 112, 58–72. Gold, B. T., Balota, D. A., Jones, S. J., Powell, D. K., Smith, C. D., & Andersen, A. H. (2006).
Dissociation of automatic and strategic lexical-semantics: Functional magnetic resonance imaging evidence for differing roles of multiple frontotemporal regions. Journal of Neuroscience, 26, 6523–6532. Henson, R. N., Rugg, M. D., Shallice, T., & Dolan, R. J. (2000). Confidence in recognition memory for words: Dissociating right prefrontal roles in episodic retrieval. Journal of Cognitive Neuroscience, 12, 913–923. Henson, R. N., Shallice, T., Josephs, O., & Dolan, R. J. (2002). Functional magnetic resonance imaging of proactive interference during spoken cued recall. Neuroimage, 17, 543–558. Hertel, P. T., & Calcaterra, G. (2005). Intentional forgetting benefits from thought substitution. Psychonomic Bulletin and Review, 12, 484–489. Hicks, J. L., & Starns, J. J. (2004). Retrieval-induced forgetting occurs in tests of item recognition. Psychonomic Bulletin and Review, 11, 125–130. Janowsky, J. S., Shimamura, A. P., Kritchevsky, M., & Squire, L. R. (1989). Cognitive impairment following frontal lobe damage and its relevance to human amnesia. Behavioral Neuroscience, 103, 548–560. Johansson, M., Aslan, A., Bäuml, K. H., Gabel, A., & Mecklinger, A. (2007). When remembering causes forgetting: Electrophysiological correlates of retrieval-induced forgetting. Cerebral Cortex, 17, 1335–1341. Johnson, S. K., & Anderson, M. C. (2004). The role of inhibitory control in forgetting semantic knowledge. Psychological Science, 15, 448–453. Jonides, J., & Nee, D. E. (2006). Brain mechanisms of proactive interference in working memory. Neuroscience, 139, 181–193.
Kirwan, C. B., & Stark, C. E. (2004). Medial temporal lobe activation during encoding and retrieval of novel face-name pairs. Hippocampus, 14, 919–930. Koechlin, E., & Summerfield, C. (2007). An information theoretical approach to prefrontal executive function. Trends in Cognitive Sciences, 11, 229–235. Kondo, H., Osaka, N., & Osaka, M. (2004). Cooperation of the anterior cingulate cortex and dorsolateral prefrontal cortex for attention shifting. Neuroimage, 23, 670–679. Kostopoulos, P., & Petrides, M. (2003). The mid-ventrolateral prefrontal cortex: Insights into its role in memory retrieval. European Journal of Neuroscience, 17, 1489–1497. Kuhl, B. A., Dudukovic, N. M., Kahn, I., & Wagner, A. D. (2007). Decreased demands on cognitive control reveal the neural processing benefits of forgetting. Nature Neuroscience, 10, 908–914. Levy, B. J., & Anderson, M. C. (2002). Inhibitory processes and the control of memory retrieval. Trends in Cognitive Sciences, 6, 299–305. Levy, B. J., McVeigh, N. D., Marful, A., & Anderson, M. C. (2007). Inhibiting your native language: The role of retrieval-induced forgetting during second-language acquisition. Psychological Science, 18, 29–34. Luks, T. L., Simpson, G. V., Dale, C. L., & Hough, M. G. (2007). Preparatory allocation of attention and adjustments in conflict processing. Neuroimage, 35, 945–958. Lundstrom, B. N., Ingvar, M., & Petersson, K. M. (2005). The role of precuneus and left inferior frontal cortex during source memory episodic retrieval. Neuroimage, 27, 824–834. MacDonald, A. W., III, Cohen, J. D., Stenger, V. A., & Carter, C. S. (2000, June 9). Dissociating the role of the dorsolateral prefrontal and anterior cingulate cortex in cognitive control. Science, 288, 1835–1838. MacLeod, M. D., & Saunders, J. (2005). The role of inhibitory control in the production of misinformation effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 964–979. Martin, R. C., & Cheng, Y. (2006). 
Selection demands versus association strength in the verb generation task. Psychonomic Bulletin and Review, 13, 396–401. McGeoch, J. A. (1942). The psychology of human learning: An introduction. New York: Longmans, Green. Melton, A. W., & Irwin, J. M. (1940). The influence of degree of interpolated learning on retroactive inhibition and the overt transfer of specific responses. American Journal of Psychology, 53, 173–203. Mensink, G., & Raaijmakers, J. G. (1988). A model for interference and forgetting. Psychological Review, 95, 434–455. Metzler, C. (2001). Effects of left frontal lesions on the selection of context-appropriate meanings. Neuropsychology, 15, 315–328. Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24, 167–202. Moscovitch, M. (1982). Multiple dissociations of function in amnesia. In L. Cermak (Ed.), Human memory and amnesia (pp. 337–370). Hillsdale, NJ: Erlbaum. Moscovitch, M., & Winocur, G. (1992). The neuropsychology of memory and aging. In F. I. M. Craik & T. A. Salthouse (Eds.), The handbook of aging and cognition (pp. 315–372). Hillsdale, NJ: Erlbaum. Müller, G. E., & Pilzecker, A. (1900). Experimentelle Beiträge zur Lehre vom Gedächtnis. Zeitschrift für Psychologie, 1, 1–300. Norman, K. A., Newman, E. L., & Detre, G. (2007). A neural network model of retrieval-induced forgetting. Psychological Review, 114, 887–953. Pandya, D. N., Van Hoesen, G. W., & Mesulam, M. M. (1981). Efferent connections of the cingulate gyrus in the rhesus monkey. Experimental Brain Research, 42, 319–330. Paz-Caballero, M. D., Menor, J., & Jimenez, J. M. (2004). Predictive validity of event-related potentials (ERPs) in relation to the directed forgetting effects. Clinical Neurophysiology, 115, 369–377.
Petrides, M. (1996). Specialized systems for the processing of mnemonic information within the primate frontal cortex. Philosophical Transactions of the Royal Society of London, B: Biological Sciences, 351, 1455–1461. Petrides, M. (2005). Lateral prefrontal cortex: Architectonic and functional organization. Philosophical Transactions of the Royal Society of London, B: Biological Sciences, 360, 781–795. Petrides, M., & Pandya, D. N. (1999). Dorsolateral prefrontal cortex: Comparative cytoarchitectonic analysis in the human and the macaque brain and corticocortical connection patterns. European Journal of Neuroscience, 11, 1011–1036. Petrides, M., & Pandya, D. N. (2002). Comparative cytoarchitectonic analysis of the human and the macaque ventrolateral prefrontal cortex and corticocortical connection patterns in the monkey. European Journal of Neuroscience, 16, 291–310. Petrides, M., & Pandya, D. N. (2007). Efferent association pathways from the rostral prefrontal cortex in the macaque monkey. Journal of Neuroscience, 27, 11573–11586. Postman, L., Stark, K., & Fraser, J. (1968). Temporal changes in interference. Journal of Verbal Learning and Verbal Behavior, 7, 672–694. Raz, N., Gunning, F. M., Head, D., Dupuis, J. H., McQuain, J., Briggs, S. D., et al. (1997). Selective aging of the human cerebral cortex observed in vivo: Differential vulnerability of the prefrontal gray matter. Cerebral Cortex, 7, 268–282. Rowe, J. B., Toni, I., Josephs, O., Frackowiak, R. S., & Passingham, R. E. (2000, June 2). The prefrontal cortex: Response selection or maintenance within working memory? Science, 288, 1656–1660. Sahakyan, L., Delaney, P. F., & Waldum, E. R. (2008). Intentional forgetting is easier after two “shots” than one. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 408–414. Sahakyan, L., & Kelley, C. (2002). A contextual change account of the directed forgetting effect.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 1064–1072. Saunders, J., & MacLeod, M. D. (2006). Can inhibition resolve retrieval competition through the control of spreading activation? Memory and Cognition, 34, 307–322. Schacter, D. L. (1999). The seven sins of memory: Insights from psychology and cognitive neuroscience. American Psychologist, 54, 182–203. Shimamura, A. P. (2000). The role of the prefrontal cortex in dynamic filtering. Psychobiology, 28, 207–218. Shimamura, A. P., Jurica, P. J., Mangels, J. A., Gershberg, F. B., & Knight, R. T. (1995). Susceptibility to memory interference effects following frontal lobe damage: Findings from tests of paired-associate learning. Journal of Cognitive Neuroscience, 7, 144–152. Smith, M. L., Leonard, G., Crane, J., & Milner, B. (1995). The effects of frontal- or temporal-lobe lesions on susceptibility to interference in spatial memory. Neuropsychologia, 33, 275–285. Sohn, M. H., Goode, A., Stenger, V. A., Carter, C. S., & Anderson, J. R. (2003). Competition and representation during memory retrieval: Roles of the prefrontal cortex and the posterior parietal cortex. Proceedings of the National Academy of Sciences, USA, 100, 7412–7417. Sohn, M. H., Goode, A., Stenger, V. A., Jung, K. J., Carter, C. S., & Anderson, J. R. (2005). An information-processing model of three cortical regions: Evidence in episodic memory retrieval. Neuroimage, 25, 21–33. Thompson-Schill, S. L., D’Esposito, M., Aguirre, G. K., & Farah, M. J. (1997). Role of left inferior prefrontal cortex in retrieval of semantic knowledge: A reevaluation. Proceedings of the National Academy of Sciences, USA, 94, 14792–14797. Thompson-Schill, S. L., Jonides, J., Marshuetz, C., Smith, E. E., D’Esposito, M., Kan, I. P., et al. (2002). Effects of frontal lobe damage on interference effects in working memory. Cognitive, Affective, and Behavioral Neuroscience, 2, 109–120.
Thompson-Schill, S. L., Swick, D., Farah, M. J., D’Esposito, M., Kan, I. P., & Knight, R. T. (1998). Verb generation in patients with focal frontal lesions: A neuropsychological test of neuroimaging findings. Proceedings of the National Academy of Sciences, USA, 95, 15855–15860.
Turner, M. S., Cipolotti, L., Yousry, T., & Shallice, T. (2007). Qualitatively different memory impairments across frontal lobe subgroups. Neuropsychologia, 45, 1540–1552.
van Veen, V., & Carter, C. S. (2002). The anterior cingulate as a conflict monitor: fMRI and ERP studies. Physiology and Behavior, 77, 477–482.
Wagner, A. D., Maril, A., Bjork, R. A., & Schacter, D. L. (2001). Prefrontal contributions to executive control: fMRI evidence for functional distinctions within lateral prefrontal cortex. Neuroimage, 14, 1337–1347.
Weissman, D. H., Roberts, K. C., Visscher, K. M., & Woldorff, M. G. (2006). The neural bases of momentary lapses in attention. Nature Neuroscience, 9, 971–978.
Weissman, D. H., Warner, L. M., & Woldorff, M. G. (2004). The neural mechanisms for minimizing cross-modal distraction. Journal of Neuroscience, 24, 10941–10949.
Wixted, J. T. (2004). The psychology and neuroscience of forgetting. Annual Review of Psychology, 55, 235–269.
Wylie, G. R., Foxe, J. J., & Taylor, T. L. (2008). Forgetting as an active process: An fMRI investigation of item-method-directed forgetting. Cerebral Cortex, 18, 670–682.
Zacks, R. T., Radvansky, G., & Hasher, L. (1996). Studies of directed forgetting in older adults. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 143–156.
Chapter 31
Emotional Modulation of Learning and Memory
LARRY F. CAHILL
This chapter addresses our current understanding of the brain/body mechanisms subserving the influence of emotional arousal on long-term memory. The focus is on memory enhancement for acute, emotionally stressful events in healthy males and females. A great deal of evidence from both animal and human subject studies converges on the conclusion that endogenous stress hormones, released during and after emotionally arousing events, interact with the amygdala (an almond-shaped structure in the medial temporal lobe) to influence memory storage processes for the emotional events that occur in other brain regions (McGaugh, 2004). This mechanism is thought to provide an evolutionarily adaptive way to adjust memory strength to memory importance. It is also increasingly clear that biological sex can influence neuronal function from the level of single neurons in vivo to the level of functioning humans (Cahill, 2006). Relatively recent research is beginning to reveal influences of sex on neural mechanisms of emotionally influenced memory. These discoveries challenge a fundamental, but generally unexamined assumption on which most research in this field has been built, namely, that the sex of the subjects tested will not significantly influence experimental findings. They also have substantial clinical implications for a host of disorders, such as posttraumatic stress disorder (PTSD) and clinical depression. We present some of the recent evidence regarding sex influences, and consider the implications for future research. This work already forces the conclusion that studies of emotional memory (at least involving human subjects) may no longer safely assume that subject sex will not significantly influence experimental findings, hence conclusions about brain mechanisms.
BASOLATERAL AMYGDALA: THE BRAIN’S FOCAL POINT FOR MODULATION OF MEMORY
Central to our current understanding of brain mechanisms of emotional memory is the amygdala complex, a collection of nuclei in the medial temporal lobe implicated in both producing and reacting to the body’s stress response. A central concept developed by McGaugh and colleagues (McGaugh, 2004; McGaugh, Cahill, & Roozendaal, 1996) holds that the amygdala modulates the storage of different forms of memory, in particular conscious (“declarative”) memory, and does so via extensive interactions with the endogenous stress response produced by emotionally salient events. One may immediately grasp the plausibility of this view of amygdala function by first considering the anatomical connectivity of the primate amygdala. A meta-analysis of cortico-cortical connectivity in the monkey by Young and Scannell (1994) revealed a quite striking and unique aspect of the amygdala, in particular its basolateral nuclei. This analysis demonstrated convincingly that the primate basolateral amygdala region possesses an extremely widespread and unique pattern of connectivity with the cortex. The overwhelming majority of these connections with the cortex are amygdalofugal, that is, from the amygdala to the cortical regions. Thus, the amygdala is evidently extremely well suited to exert a diffuse, modulatory influence on cortical function. Its anatomical architecture also belies the simplistic, albeit popular notion that the amygdala possesses an “in” door through which Pavlovian stimuli enter, and an “out” door through which “emotional memories” made in the amygdala leave (LeDoux, 2000). Across many species, learning tasks, and laboratories, stimulation of the amygdala (and in particular its basolateral complex) potently modulates—enhances or
This study was supported by NIMH R01 57508 to L.C.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
impairs—memory storage processes. Most often, stimulation has been given immediately after learning, allowing the conclusion that the effects of the stimulation on memory resulted from an influence on consolidation processes. Evidence also indicates that the amygdala’s ability to modulate memory consolidation depends crucially on endogenous stress hormones. For example, amygdala stimulation may improve or impair memory storage depending on adrenal gland function (McGaugh, 2004). It is remarkably consistent that across many laboratories and learning paradigms essentially all peripherally administered drugs and hormones require the basolateral amygdala to affect memory. This is among the best-supported conclusions in all of the neurobiology of learning and memory. Lesions or functional inactivation of the key amygdala nuclei (the basolateral amygdala) block the memory enhancing and impairing effects of all drugs and hormones tested to date. Even the amnesia induced by some general anesthetics is blocked by lesions of the basolateral amygdala (Alkire, Vazdarjanova, Dickinson-Anson, White, & Cahill, 2001). If a major amygdala function is to interact with endogenous stress hormones to influence memory, then we should find a disproportionate effect of amygdala lesions in learning situations that are relatively arousing, that is, stress response activating. We examined this possibility in a study involving lesions of the amygdala in rats (Cahill & McGaugh, 1990). The results from this study suggested that amygdala lesions impaired memory only in the relatively arousing circumstances, leading us to conclude that “the degree of arousal produced by the unconditioned stimulus, and not the aversive nature per se, determines the level of amygdala involvement (in a learning situation). 
The amygdala appears to participate in learning especially when the reinforcement is of a highly arousing nature.” Importantly, this conclusion has been confirmed by four human brain imaging studies that examined responses of the amygdala in human subjects to stimuli that varied along the arousal (arousing-calming) and valence (pleasant-unpleasant) dimensions (Anderson et al., 2003; Kensinger & Corkin, 2004; Lewis, Critchley, Rotshtein, & Dolan, 2006; Small et al., 2003). In all four studies, the amygdala responded selectively to the arousing qualities of the stimuli. Thus, animal and human subject work converge on the view that arousal (sympathetic activation), and not valence or any particular emotion such as fear, is critical to amygdala activation. Although the amygdala appears to be required for enhanced memory for arousing events, evidence indicates it is not always the site of storage of such memories. This fact is made most clear by experiments involving stimulation of the amygdala immediately after animals are trained
in tasks known to be dependent on other brain regions, namely the hippocampus and caudate nucleus (McGaugh, 2004; Packard, Cahill, & McGaugh, 1994). These studies show that postlearning stimulation of the amygdala can modulate memory for hippocampus-dependent and caudate-dependent tasks. Crucially, however, it fails to do so if the relevant “downstream” structure (hippocampus for a hippocampal task, caudate for a caudate task) is simultaneously inactivated. Furthermore, inactivation of the amygdala prior to retrieval testing does not affect performance in these tasks. Taken together, these findings argue that the amygdala acts in the period soon after an emotionally arousing event to modulate the storage of memories in other brain regions, such as the hippocampus and caudate. Thus, extensive research involving both animal and human subjects converges on a neurobiological mechanism by which emotional arousal sculpts the contents of memory (McGaugh, 2004). Endogenous stress hormones, released during and after an emotionally arousing event, influence the postevent storage of memory by actions requiring the amygdala. Without an amygdala, modulation of memory by stress hormones fails to occur. The amygdala in turn does not necessarily serve as a site of storage of memory, but as a modulator of memory processing occurring in other brain regions. This postlearning memory modulating mechanism can serve as an evolutionarily adaptive way to create memory strength that is, in general, proportional to memory importance.
AMYGDALA-BASED EMOTIONAL MODULATION OF ATTENTION
In addition to its well-established, postlearning (“offline”) modulatory role in memory, evidence suggests that the focusing of attention during emotional events (“online” modulation) also involves the basolateral amygdala (see Anderson, 2005; Anderson & Phelps, 2001). Anderson and colleagues have found, for example, that whereas in healthy subjects presentation of an emotional stimulus reduces the “attentional blink” (a reduced awareness of a stimulus presented in the same location and very shortly after another stimulus), this effect fails to occur in patients with bilateral amygdala damage. Indeed, there is a striking asymmetry in amygdalo-fugal and amygdalo-petal projections, whereby the amygdala sends many more projections throughout neocortical regions, including perceptual cortices, than it receives. It has been known since the early 1950s that stimulation of the amygdala activates the cortical EEG as effectively as does the reticular activating system. The amygdala also has projections to subcortical structures
important for attention and eye movements—the thalamic pulvinar nucleus and the superior colliculus, and receives input from sensory thalamic nuclei. This arrangement is consistent with the view that the amygdala receives a relatively rapid and coarse representation of the state of the world and, via its robust feedback projections to the cortex, alters the perception of stimuli, upregulating cortical encoding of stimuli of emotional importance. It has been speculated that these stimuli may then be tagged in some way that allows the relatively sluggish, postlearning hormonal responses to selectively enhance consolidation of the tagged information (Cahill, Gorski, & Le, 2003).
AMYGDALA ACTIVITY AND EMOTIONAL MEMORY IN HUMANS: EMERGENCE OF SEX EFFECTS
If the amygdala functions, at least in part, to modulate the storage of memory for emotional events, then it should be possible to detect a relationship between the degree to which the amygdala is activated in response to emotional events and the degree to which those events are subsequently recalled. Such a relationship is now well established. Cahill et al. (1996) used PET to measure regional cerebral glucose utilization in healthy male subjects while they viewed either a series of relatively emotionally arousing (negative) films or a matched but much more emotionally neutral set of films, and examined memory for the films 3 weeks later. The results showed that right hemisphere amygdala activity while viewing the emotional films correlated significantly with long-term recall of those films; no such relationship was found for the emotionally neutral films. Several other laboratories have now confirmed this finding, providing additional support for the view that the amygdala plays a special, presumably modulatory, role in memory storage for emotional events, as predicted by animal research. However, each human imaging study contained unexplained hemispheric asymmetries in the amygdala relationship to subsequent memory for emotional material. It was at this point that I observed that studies reporting amygdala effects predominantly or exclusively on the right side of the brain involved only male subjects, whereas those studies reporting amygdala effects predominantly or exclusively on the left side of the brain involved only female subjects, raising the possibility that subject sex determined, at least in part, the hemispheric lateralization of amygdala function. But because the studies differed along many other dimensions as well (e.g., type of scanning, type of to-be-remembered material), this conclusion was clearly speculative.
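The activation-recall relationship described above is, at its core, a correlation computed across subjects. The following minimal Python sketch illustrates that computation only; the variable names and every number in it are invented for illustration and are not data from Cahill et al. (1996) or any other study, and the actual analyses in those studies involved imaging-specific steps not shown here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cross / sqrt(ss_x * ss_y)


# HYPOTHETICAL values, one per subject (not real data):
# relative amygdala activity while viewing the emotional films ...
amygdala_activity = [0.8, 1.1, 1.3, 0.9, 1.6, 1.4, 1.0, 1.7]
# ... and the number of those films each subject recalled weeks later.
films_recalled = [3, 5, 6, 4, 8, 6, 4, 9]

r = pearson_r(amygdala_activity, films_recalled)
print(f"r = {r:.2f} across {len(films_recalled)} hypothetical subjects")
```

Running the same function separately on right and left amygdala values, and separately for men and women, is the kind of comparison that the lateralization question raised above calls for.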
Sex-Related Hemispheric Lateralization of the Amygdala Relationship to Long-Term Memory for Emotional Events
We sought to determine whether subject sex was influencing lateralization of the amygdala relationship to long-term memory for emotional material by directly comparing activity in the brains of men and women within a single study. In our first study (Cahill et al., 2001), 11 men and 11 women received two PET scans for regional cerebral glucose utilization: one while watching a series of emotionally arousing film clips, another while watching a series of more emotionally neutral clips. Memory for the films was assessed in a surprise free recall test 3 weeks later. The results showed that a large area of right, but not left, hemisphere amygdala activity was significantly related to enhanced memory for the emotional film clips in men. Yet, in women, a large area of left, and not right, hemisphere amygdala activity related to enhanced memory for the emotional films. Canli, Desmond, Zhao, and Gabrieli (2002) confirmed this sex-related lateralization in an fMRI study of amygdala function. Subjects in this study were scanned while viewing a series of emotionally arousing or neutral slides. Activity of the right, and not left, amygdala in men related significantly to memory for the most emotional slides, whereas activity of the left, and not right, amygdala related to memory for the emotional slides in women. Canli et al. (2002) observed in addition that “both correlations were so robust that they were present even with multiple comparisons across the brain and without selecting the amygdala as a region of interest.” Perhaps the most compelling demonstration of a sex-related hemispheric lateralization to date comes from a study by Cahill, Uncapher, Kilpatrick, Alkire, and Turner (2004), who also employed fMRI to study amygdala activity at encoding and subsequent memory for emotional images.
Consistent with the previous studies, these authors report that activity of the right hemisphere amygdala was significantly more related to subsequent memory for the emotional images in men than in women, but activity of the left hemisphere amygdala was significantly more related to subsequent memory for the emotional images in women than in men (see Figure 31.1). Unlike the studies just mentioned, Cahill et al. (2004) also documented a significant crossover interaction between the variables of hemisphere and sex in the amygdala relationship to memory for emotional material. A fourth study directly comparing amygdala function in men and women further documents the sex-related lateralization (Mackiewicz, Sarinopoulos, Cleven, & Nitschke, 2006). These investigators argue that the effect was evident for more ventral amygdala aspects, which correspond largely to the
Figure 31.1 Sex-related hemispheric lateralization of amygdala function in long-term memory for emotionally arousing films. [Image panels: Men > Women and Women > Men, each with the left (L) and right (R) hemispheres marked.]
Note. Activity of the right, and not left, amygdala in males while viewing emotionally arousing films related significantly to memory for the films 2 weeks later. Activity of the left, and not right, amygdala in women related significantly to memory for the films. From “Sex-Related Hemispheric Lateralization of Amygdala Function in Emotionally-Influenced Memory: An fMRI Investigation,” by L. Cahill, M. Uncapher, L. Kilpatrick, M. Alkire, and J. Turner, 2004, Learning and Memory, 11, pp. 261–266. Copyright 2004 by Cold Spring Harbor Laboratory Press. Reprinted with permission.
basolateral nuclei. As discussed earlier, it is the basolateral nuclei that animal research indicates are key to the amygdala’s modulatory role in memory. Thus, a sex-related hemispheric lateralization of amygdala function with respect to long-term memory for emotional events is now evident across many studies of amygdala function from many laboratories, including four studies that have directly compared amygdala function in men and women in this context. The sex-related hemispheric lateralization of amygdala function in emotional memory raises a simple question: What does it mean? Before addressing that question, it is helpful to consider the profound importance of the sex influence issue for neuroscience.
Sex Influences on Brain Function More Generally Considered
Biological sex influences brain function to a far greater extent than neuroscience has recognized to date (Cahill,
2006). Pronounced neurobiological differences between males and females are increasingly reported outside of the traditional domain of reproduction. Sex differences exist in every brain lobe, including in cognitive brain regions such as the amygdala, hippocampus, and even the neocortex. The advent of modern imaging techniques has revealed sex-related differences in brain correlates of emotional processing, facial processing, working memory, auditory processing, and language processing, to name just a few. Even the cellular correlates of neuronal death in cell culture differ depending on whether the neurons were derived from male or female brains (Li et al., 2005). Sex-related differences also exist in stress hormone function. As one example, Wolf, Schommer, Hellhammer, McEwen, and Kirschbaum (2001) reported a negative correlation between a cortisol response to a stressor and subsequent memory in a sample of men and women, but found further that this effect resulted from a highly significant correlation found only in men. As the conceptual blinders that sex does not matter fall off, more investigators are looking for, and finding, sex influences on memory and its neural correlates. Three examples: First, Milad and colleagues (2006) examined whether sex differences exist in the acquisition and extinction of Pavlovian fear conditioning in healthy men and women. Acquisition was significantly faster in men than in women. During extinction, no overall sex difference was found, but the menstrual cycle significantly influenced the rate of extinction in females. A second example: Yonker and colleagues (2003) examined sex influences on episodic memory in healthy subjects using a series of tasks. They reported a large (Cohen’s d > .70) overall female advantage in performance on these episodic memory tests. Interestingly, and in contrast to the conditioning study just mentioned, this advantage appeared to be unrelated to circulating levels of estradiol.
Third, in a series of recent studies employing null mutant and microarray genetic methods, Mizuno and colleagues have reported “male specific” molecular mechanisms within the hippocampus related to memory formation (see Mizuno et al., 2007). Male and female mice learning the same tasks appear not to be using the same molecular neural machinery to do so. Clearly grasping the impressive quantity and diversity of sex influences on brain function should, by itself, challenge investigators to carefully consider potential sex influences on emotional memory, the issue to which we now return.
Sex Difference in Human Amygdala Functional Connectivity at Rest
Returning to the issue of the amygdala and neural mechanisms of emotional memory, we next wondered whether the sex difference in amygdala function in response to
Figure 31.2 Amygdala seed voxels displaying significant sex-related differences in amygdala functional connectivity during resting conditions.
Note. Left amygdala areas reveal greater functional connectivity in women than in men. Right amygdala areas reveal greater functional connectivity in men than in women. From “Sex-Related Differences in Amygdala Functional Connectivity during Resting Conditions,” by L. A. Kilpatrick, D. H. Zald, J. V. Pardo, and L. F. Cahill, 2006, Neuroimage, 30, pp. 452–461. Copyright 2006 by Elsevier Press. Reprinted with permission.
emotional stimuli arose, at least in part, from a preexisting sex difference in the functional connectivity of the amygdala at rest, before stimulation. We studied the patterns of functional covariance between the left and right hemisphere amygdalae and the rest of the brain in a large sample of men and women given blood-flow PET scans while resting with their eyes closed (Kilpatrick, Zald, Pardo, & Cahill, 2006). The results of this analysis revealed that activity of the right hemisphere amygdala covaried to a much larger extent with other brain regions in men than it did in women; conversely, activity of the left hemisphere amygdala covaried with other brain regions far more in women than in men (Figure 31.2). Consistent with findings from several earlier investigations, no difference existed between the sexes in the overall levels of amygdala activity; rather, the sexes differed in the pattern of amygdala connectivity with the rest of the brain. Thus, it appears from these findings that the sex-related hemispheric lateralization of amygdala function in relation to memory for emotional stimuli results in part from a preexisting sex difference in the amygdala’s functional connectivity. The findings of Kilpatrick et al. (2006) also indicate that sex can no longer be ignored by any investigators of human amygdala function, since pronounced sex differences in its function presumably must exist in all experimental situations.
Potential Relationship of the Sex-Related Amygdala Hemispheric Specialization to Hemispheric Global/Local Processing Bias
We also sought to better understand what the sex-related lateralization of amygdala function may mean by integrating
it with other knowledge about hemisphere lateralization of function. In particular, a good deal of evidence suggests that the two cerebral hemispheres differentially process more global versus local aspects of a stimulus or scene. Evidence from a variety of sources indicates that the right hemisphere is biased toward the processing of more global, holistic aspects of a stimulus or scene, while the left hemisphere is biased toward more local, finer detail processing of the same stimulus or scene (Beeman & Bowden, 2000; Fink et al., 1996; Fink, Marshall, Halligan, & Dolan, 1999). Combining our evidence of a sex-related hemispheric laterality of amygdala function in memory for emotional material (males/right, females/left) with the view that the hemispheres differentially process global versus local information (holistic/right, detail/left) allowed us to posit that there may exist a sex-related difference in the effects of β-adrenergic blockade on emotional memory. We know from animal research that amygdala function is impaired by beta-blockers, drugs that induce blockade of β-adrenergic receptors. We also know, on anatomical grounds, that each amygdala largely modulates its own hemisphere. Hence, we reasoned that a beta-blocker, by impairing amygdala function, might impair the presumed modulatory effect of the right hemisphere amygdala on the more global processing of the right hemisphere in men, thereby reducing their memory for the more global (central) aspects of an emotional story. Similarly, we reasoned that the same beta-blocker might impair the presumed modulatory effect of the left hemisphere amygdala on the more local processing of the left hemisphere, thereby reducing memory for the details of the same emotional story in women. To test this hypothesis, we then reanalyzed published data from two studies demonstrating an impairing effect of β-adrenergic blockade on memory for an emotionally arousing story (Cahill & van Stegeren, 2003; Figure 31.3).
Note in particular the results for story phase 2 (P2 on the X-axis) in which the emotional story elements were introduced (concerning severe injuries to a small boy in an accident while his mother watched), and for which the hypothesis at issue most clearly holds. The P2 results reveal a double dissociation of sex and type of to-be-remembered information (central versus peripheral) on propranolol’s impairing effect on memory: Propranolol significantly impaired P2 memory of central information in men but not women, yet impaired P2 memory of peripheral detail in women but not men. These results are consistent with the hypothesis that, under emotionally arousing conditions, activation of right amygdala/hemisphere function produces a relative enhancement of memory for central information in men, and activation of left amygdala/hemisphere function produces a relative enhancement of memory for peripheral details in women.
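The logic of the double dissociation described above can be made concrete with a small Python sketch. It computes a two-sample (unpaired) Student's t statistic for placebo versus propranolol within each combination of sex and information type, the kind of post hoc comparison named in the note to Figure 31.3. Everything here is an assumption made for illustration: the scores are invented, the group sizes are arbitrary, and the helper name `unpaired_t` is hypothetical. None of it reproduces the data or analysis of Cahill and van Stegeren (2003).

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's two-sample t statistic (pooled variance) and its degrees of freedom."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two scores")
    df = na + nb - 2
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    if pooled == 0:
        raise ValueError("t is undefined when both groups have zero variance")
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# HYPOTHETICAL phase-2 recognition scores (% correct), six subjects per group.
# Each entry maps (sex, information type) -> (placebo scores, propranolol scores).
scores = {
    ("men", "central"): ([88, 92, 85, 90, 87, 91], [72, 75, 70, 78, 74, 71]),
    ("women", "central"): ([86, 90, 88, 84, 89, 87], [85, 89, 86, 88, 84, 90]),
    ("men", "peripheral"): ([55, 58, 52, 60, 57, 54], [54, 59, 53, 58, 56, 55]),
    ("women", "peripheral"): ([60, 63, 58, 62, 61, 59], [44, 47, 42, 48, 45, 43]),
}

# Two-tailed critical value of t for alpha = .01 with 10 degrees of freedom,
# taken from standard t tables.
T_CRIT_01_DF10 = 3.169

results = {key: unpaired_t(placebo, drug) for key, (placebo, drug) in scores.items()}

for (sex, info), (t, df) in results.items():
    verdict = "impaired" if t > T_CRIT_01_DF10 else "no reliable impairment"
    print(f"{sex:5s} {info:10s} t({df}) = {t:6.2f}  {verdict}")
```

With these made-up numbers the drug effect exceeds the critical value only for central information in men and peripheral detail in women, which is the crossover pattern that defines a double dissociation.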
Figure 31.3 Recognition test scores for the three-phase emotional story. [Image panels: (A) Central Information and (B) Peripheral Detail, each shown separately for men and women; x-axis: story phase (P1, P2, P3); y-axis: Mean % Correct (±SEM); groups: Placebo and Propranolol.]
Note. A: Values for questions defined as pertaining to central information. B: Values for questions defined as pertaining to peripheral detail. Values represent mean percentage correct (±SEM) on the recognition test in each experimental group. P1, P2, P3 indicate story phases 1, 2, 3, respectively. Emotional story elements were introduced in P2. From “Sex-Related Impairment of Memory for Emotional Events with β-Adrenergic Blockade,” by L. Cahill and A. van Stegeren, 2003, Neurobiology of Learning and Memory, 79, pp. 81–88. Copyright 2003 by Elsevier Press. Reprinted with permission. * p < .01 placebo compared with corresponding P2 propranolol group (post hoc, two-tailed, unpaired t-test comparison).
Uncovering Sex Influences on Emotional Memory
Because the assumption that subject sex will not significantly influence findings, hence conclusions, is increasingly viewed by investigators as questionable at best, many are beginning to explicitly examine the issue in studies of emotional memory. For example, Gasbarri, Arnone, Lucchese, Pacitti, and Cahill (2007) examined EEG responses to emotional and neutral stimuli in healthy men and women. The P300 response was assessed from electrodes located over the left and right hemispheres as men and women viewed images taken from the International Affective Picture System. The results showed that, for the negative (and presumably most arousing) slides, the P300 was greater when recorded over the left hemisphere in women than it was in men. Conversely, the P300 was greater when recorded over the right hemisphere in men than it was in women. This pattern (women left/men right) parallels that observed in earlier studies (described previously) regarding the amygdala. Additionally, these findings suggest that sex-related differences in how the brain processes memory for emotional events begin within 300 ms of the onset of an emotional event. Other work examined the effects of a postlearning stressor (cold pressor stress, CPS, induced by forearm immersion in ice water) on memory consolidation. In one study (Andreano & Cahill, 2006), subjects received CPS or a control procedure immediately after hearing a short
story. Memory for the story was assessed in an incidental, free recall test 1 week later. CPS produced a retrograde enhancing effect on memory in men, but not in women, despite having produced a similar cortisol response in both groups. Ongoing work in our laboratory is examining menstrual cycle influences on the mnemonic effects of CPS. But again, the findings force the important conclusion that subject sex can no longer be assumed not to matter in studies of emotional memory. As a final example, we are also uncovering menstrual cycle influences on the rumination that occurs after subjects view emotional films. In as yet unpublished work, we find that women who view emotionally graphic films while in the luteal phase of their menstrual cycle ruminate significantly more on the films than do women who view them while in the follicular phase of the cycle. Furthermore, progesterone levels are significantly positively correlated with rumination. In the same study (Ferree & Cahill, in preparation), we also demonstrate a highly significant relationship between the degree of postevent rumination after viewing arousing films and the strength of subsequent memory for the films. These new findings thus suggest that rumination occurring after emotionally arousing events may help strengthen memory for those events, and that this process is accelerated in women who experience the emotional event with high levels of progesterone.
SUMMARY
There is now very strong evidence from converging animal and human subject work that endogenous stress hormones, released during and after emotionally arousing events, interact with the amygdala (especially its basolateral nuclei) to modulate memory consolidation processes. It is becoming increasingly clear that, while these neural events may be broadly similar in men and women, they also differ in significant ways that can no longer be avoided by our field. Understanding disorders of emotional memory such as PTSD with established sex differences in their incidence and/or nature requires that we better understand sex influences on emotional memory in our basic science. We suggest that greater attention to potential sex influences is also likely to better inform future studies of “flashbulb memory.”
REFERENCES Alkire, M., Vazdarjanova, A., Dickinson-Anson, H., White, N. S., & Cahill, L. (2001). Selective basolateral amygdala lesions block propofol-induced amnesia. Anesthesiology, 95, 708–715. Amaral, D. G., & Price, J. L. (1984). Amygdalo-cortical projections in the monkey (Macaca fascicularis). Journal of Comparative Neurology, 230, 465–496. Anderson, A. K. (2005). Affective influences on the attentional dynamics supporting awareness. Journal of Experimental Psychology: General, 134, 258–281. Anderson, A. K., Christoff, K., Stappen, I., Panitz, D., Ghahremani D. G., Glover, G., et al. (2003). Dissociated neural representations of intensity and valence in human olfaction. Journal of Neuroscience, 6, 96–202. Anderson, A., & Phelps, E. (2001, May 17). Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature, 411, 305–309. Andreano, J. M., & Cahill, L. L. (2006). Glucocorticoid release and memory consolidation in men and women. Psychological Science, 17, 466–470. Beeman, M. J., & Bowden, E. M. (2000). The right hemisphere maintains solution-related activation for yet-to-be-solved problems. Memory and Cognition, 28, 1231–1241. Cahill, L. (2006). Why sex matters for neuroscience. Nature Reviews: Neuroscience, 7, 477–484.
Cahill, L., Uncapher, M., Kilpatrick, L., Alkire, M., & Turner, J. (2004). Sex-related hemispheric lateralization of amygdala function in emotionally-influenced memory: An fMRI investigation. Learning and Memory, 11, 261–266. Cahill, L., & van Stegeren, A. (2003). Sex-related impairment of memory for emotional events with -adrenergic blockade. Neurobiology of Learning and Memory, 79, 81–88. Canli, T., Desmond, J. E., Zhao, Z., & Gabrieli, J. D. (2002). Sex differences in the neural basis of emotional memories. Proceedings of the National Academy of Sciences, USA, 99, 10789–10794. Canli, T., Zhao, Z., Brewer, J., Gabrieli, J. D., & Cahill, L. (2000). Eventrelated activation in the human amygdala associates with later memory for individual emotional experience. Journal of Neuroscience, 20, RC99. Fink, G. R., Halligan, P. W., Marshall, J. C., Frith, C. D., Frackowiak, R. S., & Dolan, R. J. (1996, August 15). Where in the brain does visual attention select the forest and the trees? Nature, 382, 626–628. Fink, G. R., Marshall, J. C., Halligan, P. W., & Dolan, R. J. (1999). Hemispheric asymmetries in global/local processing are modulated by perceptual salience. Neuropsychologia, 37, 31–40. Gasbarri, A., Arnone, B., Lucchese, F., Pacitti, F., & Cahill, L. (2007). Sex - related hemispheric laterality of emotional picture processing: An event - related potential study. Brain Research, 1139, 178 – 186. Kensinger, E. A. & Corkin S. (2004). Two routes to emotional memory: Distinct neural processes for valence and arousal. Proceedings of the National Academy of Sciences, USA, 101, 3310–3315. Kilpatrick, L. A., Zald, D. H., Pardo, J. V., & Cahill, L. F. (2006). Sexrelated differences in amygdala functional connectivity during resting conditions. Neuroimage, 30, 452–461. LeDoux, J. (2000). Emotion circuits in the brain. Annual Review of Neuroscience, 23, 155–184. Lewis, P. A., Critchley, H. D., Rotshtein, P., & Dolan, R. J. (2006, May 22). 
Neural correlates of processing valence and arousal in affective words. Cerebral Cortex, 17(3), 742-748. Li, H., Pin, S., Zeng, Z., Wang, M. M., Andreasson, K. A., & McCullough, L. D. (2005). Sex differences in cell death. Annals of Neurology, 58, 317–321. Mackiewicz, K. L., Sarinopoulos, I., Cleven, K. L., & Nitschke, J. B. (2006, September 8). The effect of anticipation and the specificity of sex differences for amygdala and hippocampus function in emotional memory. Proceedings of the National Academy of Sciences, USA, 103, 14200–14205. McGaugh, J. L. (2004). The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annual Review of Neuroscience, 27, 1–28.
Cahill, L., Gorski, L., & Le, K. (2003). Enhanced human memory consolidation with post-learning stress: Interaction with the degree of arousal at encoding. Learning and Memory, 10, 270–274.
McGaugh, J. L., Cahill, L., & Roozendaal, B. (1996). Involvement of the amygdala in memory storage: Interaction with other brain systems. Proceedings of the National Academy of Sciences, USA, 93, 13508–13514.
Cahill, L., Haier, R. J., Fallon, J., Alkire, M. T., Tang, C., Keator, D., et al. (1996). Amygdala activity at encoding correlated with long-term, free recall of emotional information. Proceedings of the National Academy of Sciences, USA, 93, 8016–8021.
Milad, M.R., Goldstein, J.M., Orr, S.P., Wedig, M.M., Klibanski, A., Pitman, R.K. & Rauch, S.L. (2006) Fear conditioning and extinction: influence of sex and menstrual cycle in healthy humans. Behavioral Neuroscience. 120(6), 1196–203.
Cahill, L., Haier, R. J., White, N. S., Fallon, J., Kilpatrick, L., Lawrence, C., et al. (2001). Sex-related difference in amygdala activity during emotionally influenced memory storage. Neurobiology of Learning and Memory, 75, 1–9.
Mizuno, K., Antunes-Martins, A., Ris, L., Peters, M., Godaux, E., & Giese, K. P. (2007). Calcium/calmodulin kinase kinase beta has a malespecific role in memory formation. Neuroscience, 145, 393–402.
Cahill, L., & McGaugh, J. L. (1990). Amygdaloid complex lesions differentially affect retention of tasks using appetitive and aversive reinforcement. Behavioral Neuroscience, 104, 532–543.
c31.indd Sec3:612
Packard, M., Cahill, L., & McGaugh, J. L. (1994). Amygdala modulation of hippocampal-dependent and caudate nucleus-dependent memory processes. Proceedings of the National Academy of Sciences, USA, 91, 8477–8481.
8/18/09 6:28:35 PM
References
c31.indd Sec4:613
613
Small, D. M., Gregory, M. D., Mak, Y. E., Gitelman, D., Mesulam, M. M., & Parrish, T. (2003, August 14). Dissociation of neural representation of intensity and affective valuation in human gustation. Neuron, 39(4), 701–711.
Wolf, O. T., Schommer, N. C., Hellhammer, D. H., McEwen, B. S., & Kirschbaum, C. (2001). The relationship between stress induced cortisol levels and memory differs between men and women. Psychoneuro endocrinology, 26, 711–720.
Young, M. P., & Scannell, J. W. (1994). Analysis of connectivity: Neural systems in the cerebral cortex. Reviews in the Neurosciences, 5, 227–250.
Yonker, J. E., Eriksson, E., Nilsson, L. G., & Herlitz, A. (2003). Sex differences in episodic memory: Minimal influence of estradiol. Brain and Cognition, 52(2), 231–238.
Chapter 32
Evaluative Processes
GARY G. BERNTSON, GREG J. NORMAN, AND JOHN T. CACIOPPO
Natural selection has tailored the computational capacities of the brain to promote survival and maximize reproduction. This evolutionary pressure has led to the ability to quickly evaluate situations in which an organism must discriminate between hostile and hospitable stimuli and select appropriate responses. The behavioral output of such evaluations may manifest in approach or avoidance dispositions that promote survival and minimize negative consequences. Although approach and avoidance dispositions often synergistically promote a common behavioral outcome, at times they may come into conflict (e.g., tolerating an unpalatable taste in order to obtain nutrients). Moreover, evaluative processes are represented in distributed systems at multiple levels of the neuraxis, and this multiple-level processing may also give rise to conflicts (e.g., suppressing pain-withdrawal reflexes to remove an embedded sliver). Despite potential complexities of central evaluative substrates, behavioral manifestations are constrained: an organism cannot simultaneously approach and avoid a goal object. Such physical constraints may belie the underlying structure of central evaluative processes and have led to theoretical models, typically based on behavioral measures, that characterize evaluative processes as points along a bipolar (positive to negative) dimension of valence (Osgood, Suci, & Tannenbaum, 1957; Posner, Russell, & Peterson, 2005; Russell, 2003; Watson, Wiese, Vaidya, & Tellegen, 1999). This is often considered to be mediated by a single neural integrator responsible for valence integration (Allport, 1935). Although useful in many contexts, models of evaluative processes that assume reciprocity among positive and negative valence and homogeneity of neural substrates are likely too simplistic.
Based on evolutionary, neurobiological, and psychological considerations, Cacioppo and Berntson (1994; Cacioppo, Gardner, & Berntson, 1997; Larsen, McGraw, & Cacioppo, 2001) have proposed a more complex, bivariate space model of evaluative processes. This model recognizes distinct positive and negative evaluative systems that can function in a reciprocal or coactive fashion (e.g., in ambivalence) and embraces the multiple-level representations of evaluative systems and the increasing complexity of these networks at higher levels of the neuraxis that can sustain at least partially independent activations. Such patterns allow for more flexible outputs, such as cautious approach during anxiety-like states (see Chapter 36), capable of developing over different temporal dimensions. We review evidence that evaluative processes are well conserved throughout ontogeny and phylogeny, represented throughout multiple levels of the neuraxis, and organized along a cardinal dimension of evaluative bivalence (i.e., approach vs. avoidance, positivity vs. negativity).
LEVELS OF ORGANIZATION IN THE NERVOUS SYSTEM
Levels of Evaluative Function: Lower-Level and Spinal Reflexes
Spinal reflexes are among the lowest levels of organization in the central nervous system, and their relative simplicity allows for fast and efficient adaptive response to environmental stimuli. Although capable of operating independently of higher levels, spinal reflexes also provide critical functional support for higher-level functions, an issue to which we return later. In his treatise The Integrative Action of the Nervous System (1906), Sir Charles Sherrington detailed spinal organizations that contribute to postural regulation and provide the basic neurological support for locomotion. He also described spinal substrates for basic, low-level evaluative reactions. Among the most salient of spinal reflexes is the flexor (pain) withdrawal reflex, which represents a primitive but effective evaluative mechanism for protection against noxious or injurious stimuli. Nociceptive signals carried by somatosensory afferents activate flexor neuron pools via interneuron circuits within the spinal cord, resulting in flexor withdrawal responses (Craig, 2003; Lundberg, 1979; Sandrini et al., 2005; Schouenborg, Holmberg, & Weng, 1992).
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Although appetitive reflexes may be less obvious than aversive reflexes at the level of the spinal cord, primitive approach/engagement dispositions are also apparent in spinal extensor reflexes. Sherrington (1906) described extensor thrust reflexes to palmar contact that represent low-level reflexive dispositions promoting contact and engagement with the external environment. These approach/engagement reflexes are supplemented by suckling and ingestive reflexes of brain stem origin, which are considered later in this chapter. At a trivial level, flexor and extensor reflexes promote diametrically opposing motoric dispositions. The spinal circuits for these reflexes are distinct and separately organized, and they include differences in peripheral sensory receptors, afferent axonal populations, central interneuronal pathways, and motoneuron output pools. This is not to say that flexor/extensor reflexes are entirely independent. Although the primary neural circuits underlying flexor and extensor reflexes are parallel and distinct, there are rich interactions among these networks, an organizational pattern that Sherrington referred to as the alliance of reflexes. Examples include the crossed-extension reflex, in which activation of the flexor reflex in one limb is associated with a reflex extension of the opposite limb. Sherrington also described interactions among networks for opponent flexor and extensor reflexes for a given limb as a pattern of reciprocal innervation. Reciprocal innervation is the property by which spinal reflex networks that activate a specific outcome (e.g., limb flexion) also tend to inhibit opponent (e.g., extensor) muscles, an arrangement that synergistically promotes the target response. These organizational patterns are not unique to spinal circuits but represent general neuroarchitectural features that may inform the operations of higher-level systems as well.
Behavioral manifestations of the principle of reciprocal innervation, for example, can be seen even at a cognitive level. One example comes from the cognitive dissonance literature, where the mere selection of an item from among several choices results in increased cognitive valuation of the chosen item and concurrent devaluation of the nonselected items (Aronson & Carlsmith, 1963; Egan, Santos, & Bloom, 2007). The integrative outputs of spinal approach/withdrawal circuits may provide a basic model for understanding higher-level evaluative processes. For example, flexor withdrawal and extensor approach reflexes are not symmetrical in strength because flexor withdrawal reflexes are significantly more potent than their antagonistic extensor (approach) reflexes and recover more rapidly after spinal transection. As is considered later, asymmetric strength of evaluative systems is also apparent at higher levels of the neuraxis where avoidance reactions (anxiety, fear) tend to
have a stronger hold on affect when compared to approach reactions (incentive, reward). This makes adaptive sense because a single failure of the avoidance system can lead to subsequent injury or death. Natural selection may have tuned the avoidance system for preferential control of behavior. The bias toward avoidance reactions represents a recurring theme at all levels of the neuraxis and has been termed the negativity bias (Cacioppo & Berntson, 1999; Cacioppo, Larsen, Smith, & Berntson, 2004). Despite this negativity bias, flexor/withdrawal reflexes are not always dominant over their opponent processes because extensor/approach reflexes can take precedence over withdrawal processes at lower levels of stimulation or activation. This disposition toward approach behaviors in the context of low levels of activation has been termed the positivity offset (Cacioppo & Berntson, 1999; Cacioppo et al., 2004) and characterizes the operations of evaluative processes at multiple levels of the neuraxis. As we consider later, the asymmetry of neurobehavioral dispositions can lead to a context-dependent outcome because approach dispositions may predominate at lower levels of evaluative activation but can be trumped by avoidance or withdrawal (negativity bias) at higher levels of evaluative activation. Spinal flexor and extensor reflexes have separate, although interacting, circuitries and thus can operate in parallel within the constraints of those neural interactions. Despite this underlying bivalence, the behavioral output of opponent extensor/flexor networks may lie along a bipolar continuum from flexion to extension, the output being constrained by the mechanical coupling of the extensor and flexor muscles around a specific point of articulation at a joint.
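The interplay of the positivity offset and the negativity bias described above can be illustrated with a small numerical sketch. The linear activation functions, the names `EvaluativeSystem`, `net_disposition`, and `crossover_activation`, and all parameter values are assumptions chosen only to reproduce the qualitative pattern; they are not taken from the cited work or from any data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvaluativeSystem:
    """One evaluative system with an assumed linear activation function."""
    intercept: float  # output when input activation is zero
    slope: float      # gain in output per unit of input activation

    def output(self, activation: float) -> float:
        return self.intercept + self.slope * activation


# Hypothetical parameters (illustrative only):
# the positive system has the higher intercept (positivity offset),
# the negative system has the steeper slope (negativity bias).
POSITIVE = EvaluativeSystem(intercept=1.0, slope=1.0)
NEGATIVE = EvaluativeSystem(intercept=0.0, slope=1.5)


def net_disposition(pos_activation: float, neg_activation: float) -> float:
    """Bipolar behavioral readout of the two separable systems.

    Values above zero indicate net approach, values below zero net avoidance.
    The two inputs are independent, so they can vary reciprocally or coactively.
    """
    return POSITIVE.output(pos_activation) - NEGATIVE.output(neg_activation)


def crossover_activation() -> float:
    """Equal input level at which approach and avoidance outputs balance."""
    return (POSITIVE.intercept - NEGATIVE.intercept) / (NEGATIVE.slope - POSITIVE.slope)


if __name__ == "__main__":
    for level in (0.0, 1.0, 2.0, 4.0):
        print(f"activation={level}: net disposition={net_disposition(level, level)}")
```

With these assumed values, equal activation of both systems yields net approach below the crossover level and net avoidance above it, which is the context-dependent outcome the text describes. The single bipolar readout also shows how a constrained behavioral output can mask two separable underlying systems.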
Neural Hierarchies
Multilevel perspectives of neuronal organization have been emphasized by scientists and philosophers alike, among the more influential of whom was the nineteenth-century neurologist John Hughlings Jackson. In his essay “Evolution and Dissolution of the Nervous System,” Jackson (1884/1958) laid the groundwork for multilevel characterization of neuronal organization. Jackson argued that the evolutionary emergence of higher levels of neuronal organizations does not involve a replacement or displacement of lower levels. Rather, evolutionary development entails a re-representation and elaboration of functions at progressively higher levels of the nervous system. Although rostral levels were thought to be characterized by elaborate networks capable of more sophisticated functions, they were not seen to replace lower levels but in fact remain highly dependent on lower neuraxial substrates. For example, the critical spinal networks and related locomotor reflexes for stepping constitute essential lower
processing circuits that support outputs from higher motor systems. In Jackson’s view, the consequences of brain injuries are best understood not in terms of the functions that are lost but as a reversion (dissolution) of those functions to lower levels of neural organization. It is now apparent that the neuraxis is replete with hierarchical organizations composed of simple reflexlike circuits at the lowest levels, such as the brain stem and spinal cord, and neural networks for more integrative computations at higher levels (for reviews, see Berntson, Boysen, & Cacioppo, 1993; Berntson & Cacioppo, 2000; Berridge, 2004). The relatively simple neural circuitry characteristic of lower levels of the neuraxis is essential for survival because it allows for rapid computations and subsequent motor outputs. The adaptive function of such circuits is obvious because it may be more important in some circumstances to perform a rapid but imperfect response rather than a more elaborate and protracted performance that may produce a more refined outcome. The additional time consumed by such processes could lead to a negative outcome. As environmental challenges grow increasingly complex, more integrated neuronal processing may be more adaptive, and higher-level analytical and response mechanisms may come into play. Moreover, learned anticipatory processes may promote more strategic avoidance of adaptive challenges prior to their occurrence. The increasing amount of information that must be processed and integrated by progressively higher-level systems may lead to neurocomputational bottlenecks that require a slower and more serial mode of processing.
Based on hierarchical interconnections, higher-level systems may depend heavily on lower-level systems for the transmission and preliminary processing and filtering of afferent sensory and perceptual data and for the implementation of sensorimotor subroutines that support executive outputs. The advantages and disadvantages associated with higher-level (integrative, flexible, but capacity-limited) and lower-level (rapid, efficient, but rigid) processing were a likely source of evolutionary pressure for the preservation of lower-level substrates, despite higher-level elaborations and re-representations (Berntson & Cacioppo, in press). Together these interacting hierarchical structures allow neural systems to rapidly respond through low-level processing (e.g., pain-withdrawal reflexes), whereas more rostral neural substrates permit a more elaborate response over time and allow for evaluation of future strategies and subsequent consequences. Hierarchical representations do not merely reflect theoretical models or cognitive curiosities but are empirically documented by neuroanatomical and functional analyses of neural systems throughout the brain (Berntson et al., 1993; Figure 32.1).
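Figure 32.1 continua (lower to higher levels): processing mode, parallel/rapid to serial/slow and capacity-limited; integrative capacity, low to high; output repertoire, limited/rigid to broad/flexible.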
Figure 32.1 Hierarchical and heterarchical organizations. Note: A heterarchy differs from a hierarchy (illustrated by solid arrows) by the additional presence of long ascending and descending pathways that span intermediate levels (dashed arrows). Properties of the levels in both classes of organizations lie along the illustrated continua of processing mode, integrative capacity, and output repertoire. Heterarchical organizations have greater integrative capacity and output flexibility because the long ascending and descending projections provide inputs and outputs that are not constrained by intermediate levels.
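The structural difference between the two organizations in Figure 32.1 can be sketched as a small graph. The four level names, the choice to link every pair of nonadjacent levels, and the helper names `hierarchy_edges`, `heterarchy_edges`, and `min_hops` are illustrative assumptions and not an anatomical model.

```python
from collections import deque

# Hypothetical levels, ordered from lowest to highest (illustrative only).
LEVELS = ["spinal cord", "brain stem", "limbic", "cortex"]


def hierarchy_edges(levels):
    """Two-way links between adjacent levels only (solid arrows)."""
    edges = set()
    for lower, upper in zip(levels, levels[1:]):
        edges.add((lower, upper))  # ascending
        edges.add((upper, lower))  # descending
    return edges


def heterarchy_edges(levels):
    """Adjacent links plus long pathways that skip levels (dashed arrows)."""
    edges = hierarchy_edges(levels)
    for i, lower in enumerate(levels):
        for upper in levels[i + 2:]:
            edges.add((lower, upper))  # long ascending pathway
            edges.add((upper, lower))  # long descending pathway
    return edges


def min_hops(edges, source, target):
    """Breadth-first search for the fewest links from source to target."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return hops
        for start, end in edges:
            if start == node and end not in seen:
                seen.add(end)
                queue.append((end, hops + 1))
    return None  # no route exists


if __name__ == "__main__":
    print(min_hops(hierarchy_edges(LEVELS), "cortex", "spinal cord"))
    print(min_hops(heterarchy_edges(LEVELS), "cortex", "spinal cord"))
```

In this sketch the highest level reaches the lowest only by passing through every intermediate level in the hierarchy, whereas the heterarchy adds a direct route while keeping the stepwise one available.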
Neural Heterarchies
Additional neuroarchitectural complexities exist beyond strict hierarchical organization patterns because long descending pathways bypass intermediate levels and synapse directly onto lower levels of the neuraxis (Porter, 1987; Wakana, Jiang, Nagae-Poetscher, Zijl, & Mori, 2004). This type of organization is documented by direct, long descending projections from higher neuraxial systems to lower motor neurons. In addition to the well-known anatomy of somatomotor systems (Porter, 1987; Wakana et al., 2004), this pattern of organization is also apparent in the autonomic nervous system (Berntson & Cacioppo, 2000). As illustrated in Figure 32.2, for example, the baroreflex is a tightly organized brain stem–mediated reflex system that serves to maintain blood pressure homeostasis. Increases in blood pressure activate specialized cardiovascular mechanoreceptors, which then feed back into brain stem reflex circuitry, leading to reciprocal increases in vagal cardiac output and decreases in sympathetic cardiac and vascular tone. These responses collectively lead to decreases in heart rate, cardiac output, and vascular tone, which synergistically compensate for the blood pressure perturbation. In contrast to this lower-level, homeostatic reflex regulation, higher-level systems (e.g., with even mild psychological stress) are capable of overriding the baroreflex and yielding concurrent increases in
blood pressure and heart rate. This nonhomeostatic modulation of cardiovascular function may arise in part from descending inhibition of brain stem baroreflex networks. It also likely reflects the actions of long descending projections from higher neurobehavioral substrates that bypass intermediate reflex circuits and project monosynaptically to lower autonomic source nuclei (see Figure 32.3). As a result, cortical and limbic structures are able to bypass intermediate hierarchical elements and directly control lower levels (see Berntson et al., 1994). The presence of long ascending and descending pathways in neural organizational patterns, combined with lateral interconnections between levels, has previously been described as a neural heterarchy (see Berntson et al., 1993; Berntson & Cacioppo, 2000). Heterarchical organization patterns have the components of hierarchical systems, as higher levels are in continuous communication with lower-level systems via intermediate levels, but they have the additional capacity to interact over widely separated levels via direct connections. Direct neuronal projections from higher brain systems to lower-level systems allow for manifestations of higher computational re-representative networks that are not constrained by intermediate-level organizations. This affords cognitive and behavioral flexibility when needed but also allows for intermediate-level processing when necessary. The multiple levels of organization and associated functional flexibility come with a disadvantage because a heterarchical organization opens the possibility for functional conflicts between distinct levels of processing (e.g., when an organism must inhibit pain
Figure 32.2 (Figure C.34 in color section) Summary of brain stem systems underlying the baroreceptor cardiac reflex. Note: Baroreceptor afferents project to nucleus tractus solitarius (NTS), which in turn leads to activation of parasympathetic motor neurons in the nucleus ambiguus (nA) and dorsal motor nucleus of the vagus (DMX). The NTS also activates the caudal ventrolateral medulla (Cvlm), which in turn inhibits the rostral ventrolateral medulla (Rvlm), leading to a withdrawal of excitatory drive on the sympathetic motor neurons in the intermediolateral cell column of the spinal cord (IML). CAs = Catecholamines; PGi = Nucleus paragigantocellularis (coextensive with Rvlm).
withdrawal to achieve a higher-order goal). We return to this issue later.
Levels of Evaluative Function: Intermediate Levels—Decerebration
Although primitive approach/withdrawal dispositions are represented at spinal levels, they are substantially developed and elaborated at brain stem levels. Classical demonstrations of the functional capacity of brain stem networks come from studies of experimental isolation of the brain stem and spinal cord (i.e., decerebration) and from tragic cases of human decerebration (Berntson & Micco, 1976; Berntson, Tuber, Ronca, & Bachman, 1983; Harris, Kelso, Flatt, Bartness, & Grill, 2006; Ronca, Berntson, & Tuber, 1986; Tuber, Berntson, Bachman, & Allen, 1980; Yates, Jakus, & Miller, 1993). Although acute postsurgical somatomotor rigidity historically obscured the behavioral capacities of the experimental decerebrate, with longer survival times and the resolution of this rigidity a great deal of organizational capacity is apparent at brain stem levels (Bard & Macht, 1958; Berntson & Micco, 1976; Norman, Buchwald, & Villablanca, 1977). Decerebrate animals, for example, can right themselves and locomote; eat and drink on encountering appropriate goal objects; groom; and display aggressive, defensive, and escape behaviors to noxious stimuli (see Adams, 1979; Berntson & Micco, 1976; Norman et al., 1977). Considerable functional capacity is also apparent in tragic cases of human decerebration (anencephaly and
Figure 32.3 (Figure C.33 in color section) Expansion of the baroreflex circuit of Figure 32.2 to illustrate the ascending and descending pathways to and from rostral neural areas such as the medial prefrontal cortex, hypothalamus, and amygdala. Note: Ascending systems include routes from the rostral ventrolateral medulla (Rvlm) and the nucleus of the tractus solitarius (NTS) to the locus coeruleus (LC) noradrenergic system and indirectly to the basal forebrain (BF) cortical cholinergic system. CAs = Catecholamines; Cvlm = Caudal ventrolateral medulla; DMX = Dorsal motor nucleus of the vagus; IML = Intermediolateral cell column of the spinal cord; nA = Nucleus ambiguus; PGi = Paragigantocellular nucleus (partially coextensive with Rvlm).
Figure 32.4 Neurological status of a decerebrate infant. Note: Top: Results of transillumination of the head (viewed posteriorly; tip of left ear on leftward side). The dark region toward the base of the head is the cerebellum; note that there is little to occlude the light above that level. Bottom: Results of CAT scan at four horizontal planes of the head (front of the head is up) from dorsal (left) to ventral (right). Light areas indicate more radiodense areas such as bone and neural tissue, dark regions more radiolucent areas. Note the clear appearance of the skull but the absence of brain tissue on the left. In the two middle planes, the cerebellum is apparent posteriorly (bottom of the cranial vault). In the lowest plane (right), diencephalic and other brain stem tissue is present.
hydranencephaly), generally resulting from a failure of cell migration early in neurodevelopment. Although these infants generally do not survive for more than a few weeks after birth, they show a relatively intact array of infantile reflexes, including flexor and extensor reflexes, stepping reflexes, and a wide range of brain stem reflexes including tonic neck reflexes and suckling reflexes, among others. Figure 32.4 illustrates transillumination of the scalp and a representative CAT scan from a decerebrate human infant. Despite the virtual lack of any neural tissue above the diencephalon, this infant showed basic manifestations of evaluative processing. In addition to displaying typical pain-withdrawal reflexes, she would fuss and cry in
response to noxious stimuli and could be quieted and comforted with contact and rocking. This infant also showed typical appetitive responses and would suckle and ingest milk sufficient to maintain body weight. It is worth noting that brain stem neurobehavioral substrates do not entail a mere assemblage of rigidly regulated and tightly organized reflex networks because both decerebrate animals (Mauk & Thompson, 1987; Norman et al., 1977) and humans (Berntson et al., 1983; Tuber et al., 1980) display neural plasticity and associative learning.
Intake/Rejection Responses and Taste Hedonics
Among the more thoroughly studied brain stem evaluative processes are those supporting approach/avoidance action dispositions related to taste hedonics. Similar to the organization of the spinal cord, the neuroarchitectures underlying approach vs. avoidance dispositions appear to be relatively independent and under separate control in brain stem circuitry (Berntson et al., 1993; Berridge & Grill,
1984; Steiner, Glaser, Hawilo, & Berridge, 2001). Taste hedonics and associated intake/rejection responses offer a prime example of brain stem evaluative systems. Orofacial displays to taste, represented by stereotyped, reflex-like negative rejection/ejection responses to aversive stimuli (gaping, tongue protrusion) and positive intake responses (smiling, licking, swallowing) are well conserved in mammals. Such responses can be seen early in development and are readily apparent in decerebrate organisms. The positive and negative responses to gustatory stimuli mirror the evaluative reflexes of the spinal cord in that they reflect opposing patterns of approach/avoidance dispositions. Similar to spinal reflexes, the behavioral output of these systems can be interpreted as lying along a single bipolar continuum extending from approach (highly positive) to avoidance (highly negative). Although this depiction can be useful, it belies the underlying complexity of hedonic processes because experimental evidence suggests that gustatory approach/withdrawal systems are partially independent and do not converge on a single hedonic integrator (Berridge & Grill, 1984). Just as a person can tighten extensor and flexor muscles simultaneously, intake and rejection responses are not incompatible and can become coactive. For example, although the probability of rejection responses to a glucose solution increases following the addition of a bitter compound, this can occur without a reciprocal reduction in probability of intake responses. Similarly, increasing both bitter and sweet concurrently leads to increases in both intake and rejection responses. Thus, it is clear that taste preference, as measured by behavioral consumption and represented on a bipolar scale, does not always represent the underlying bivariate hedonic state.
This does not rule out interaction between the approach/avoidance responses, but suggests that mixing positive and negative hedonic stimuli does not simply yield a null average of the two or a state of indifference (Berridge & Grill, 1984). Gustatory approach/avoidance responses are represented by distinct positive and negative hedonic dimensions that conform to the positivity offset and negativity bias as described previously. Gustatory evaluative processes mediated by brain stem systems are more complex than their behavioral output (total intake continuum), and knowledge of this fact facilitates a more accurate description of evaluative processes based on the underlying bivariate substrates.
Levels of Function: Higher-Level Re-representations
As we move to the highest levels of the neuraxis, the re-representation and elaboration of evaluative processes become ever more apparent, and neuron-organizational
complexity expands dramatically. The brain stem and spinal cord are highly sensitive to aversive and hedonic stimuli and can yield appropriate behavioral responses, but this so-called reptilian brain (MacLean, 1985) lacks much of the behavioral flexibility and adaptability characteristic of intact organisms. Although decerebrates may ingest palatable foods, they do not display typical goal-seeking behavior in the absence of a food stimulus but rather are prisoners of the momentary stimulus or environmental context (see Berntson et al., 1993; Berntson & Micco, 1976). Decerebrates lack the flexibility and variety of behavior seen in intact animals because of the devolution of the nervous system to its more primitive representations. It is not until the development of the paleomammalian brain (limbic system and archicortex) and the neomammalian brain (neocortex) that we see the full evolution and elaboration of evaluative processes (MacLean, 1985). It is with the development of rostral brain structures that we begin to see the emergence of goal-directed behaviors that reflect anticipatory processes and expectancies that liberate the organism from the immediate exigencies of this stimulus or that. In view of the expanding complexity of rostral evaluative substrates, it seems unlikely that these networks would simplify from the basic bivariate evaluative structure of lower substrates to become a single bipolar hedonic integrator. In contrast, with the expanding cognitive and computational complexity of evaluative processes at higher neuraxial levels, there is a parallel expansion of the complexity of the underlying mediating neural systems. Higher evaluative processes entail planning, strategizing, and engaging in anticipatory processes that can require access to associative networks, attentional and computational resources, and so on. 
Moreover, whereas lower evaluative substrates may entail simple approach/withdrawal dispositions, higher motivational processes become further differentiated and nuanced. Berridge (1996) characterized the “liking” aspects of motivation as those that entail the hedonic and response-eliciting properties of a stimulus or motivational context. These are apparent in the orofacial intake/ingestive responses to positive hedonic tastes as described previously for the decerebrate organism. The decerebrate, however, largely lacks what Berridge termed the “wanting” aspects of motivation, which entail an attentional focus on, and goal-seeking behaviors directed toward, a desired stimulus, state, or context. This latter aspect of evaluative processes is heavily dependent on the increased computational capacity of higher levels of the neuraxis and is mediated by more elaborate neural circuitry. It should not be surprising that the neuroarchitecture of higher evaluative processes entails more complex and distributed networks that are not as readily dichotomized into positive and negative substrates as is the case with
lower-level representations. Indeed, many computational, attentional, and memorial processes may be commonly deployed for positive and/or negative evaluative processing. Moreover, the further development and elaboration of evaluative systems, such as the differentiation between “liking” and “wanting,” may entail added neuroanatomical complexity. Historically, the nucleus accumbens (nACC) has been depicted as a neural integrator of reward and positive hedonic states (Berridge & Grill, 1984; Hoebel, Rada, Mark, & Pothos, 1999; Koob, 1992). In the 1940s, Robert Heath, working on psychiatric patients with indwelling electrical brain stimulators, showed that patients would report pleasurable states and would self-administer stimulation to various brain regions, especially areas in and around the nACC (Heath, 1972). More recently, electrical stimulation of the nACC has been reported to elicit a smile associated with euphoric responses (Okun et al., 2004). It is now clear that nearly all rewarding stimuli or positive hedonic states are associated with dopamine release in the nACC, and lesions or blockage of dopamine receptors in the nACC reduces rewards and positive hedonics (Hoebel et al., 1999; Robinson & Berridge, 2003; Wise, 2006; see also Chapter 40). In this regard, the nACC contrasts with the amygdala, which has generally been implicated in fear conditioning, negative affect, and aversive states (see Chapter 39), a topic to which we return. Although these findings are consistent with a differentiation of positive and negative neural substrates at higher levels of the neuraxis, similar to that seen at lower levels, there are added complexities in higher substrates. The nACC, in fact, may not be a simple monolithic reward integrator. Recent work has suggested important phenomenological and computational distinctions within the nACC.
For example, the liking (positive hedonic effect, reward) and wanting (incentive salience, goal striving) aspects of hedonic states are mediated by distinct anatomical regions of the nACC (Berridge, 1996; Pecina & Berridge, 2005). Moreover, negative stimuli may also activate the nACC, and other distinct areas may be involved in suppression of negative evaluative processing (Pecina & Berridge, 2005). These complexities caution against the overly simplistic ascription of discrete neural loci to the mediation of complex neuropsychological phenomena. Nevertheless, there remain clear differentiations between higher neural substrates mediating positive and negative evaluative processes. A hemispheric lateralization of positive and negative evaluative processes has been reported, with the right hemisphere implicated more in negative affective processing or avoidance dispositions and the left hemisphere involved more in positive affect or approach dispositions (Cacioppo & Gardner, 1999; Davidson, 1990; Harmon-Jones, Vaughn,
Mohr, Sigelman, & Harmon-Jones, 2004). For example, positive affective stimuli induce greater activation in the left hemisphere (Canli, Desmond, Zhao, Glover, & Gabrieli, 1998; Davidson, 1998, 2004; Lee et al., 2004; Nitschke, Sarinopoulos, Mackiewicz, Schaefer, & Davidson, 2006; Pizzagalli, Sherwood, Henriques, & Davidson, 2005), and patients with damage to the left hemisphere have a higher probability of experiencing depression and overall negative affect (Davidson, 1998). Similarly, facial expression and reaction time data suggest a left hemisphere predominance for positive affect and a greater right hemisphere representation for negative affect (Davidson, Shackman, & Maxwell, 2004; Root, Wong, & Kinsbourne, 2006). The relative right hemispheric bias for withdrawal/avoidance reactions may be related to the right lateralization of visceral/nociceptive afferent projections (Craig, 2005) and is consistent with the finding that left insula stimulation gives rise to parasympathetic cardiac activation whereas right insula stimulation induces sympathetic activation (Oppenheimer, 1993, 2006). Furthermore, within-hemisphere differentiation is also apparent in cortical representations. Pleasantness rating of odors, for example, was related to the degree of medial orbitofrontal activation as measured by fMRI, whereas unpleasantness was more related to activation of the dorsal anterior cingulate (Grabenhorst, Rolls, Margot, da Silva, & Velazco, 2007). Similarly, deciding on the lesser of two punishments yielded greater activation in the dorsal anterior cingulate, whereas deciding between the larger of two rewards yielded greater activation in the ventromedial prefrontal cortex (Blair et al., 2006). The amygdala has been especially implicated in fear and negative affect since the classic studies of Walter Rudolf Hess (1954) on brain stimulation in the waking animal. 
The amygdala appears to be a critical nodal point in subcortical circuits that allow for rapid detection and response to threat and for the learning of fear-related cues (LeDoux, 1996; Öhman & Mineka, 2001). These circuits allow for more elaborate processing of threat-related cues than do lower-level brain stem substrates but remain highly efficient because they can operate without the need for extensive cortical processing (Larson et al., 2006; LeDoux, 1996; Öhman & Mineka, 2001; Tooby & Cosmides, 1990). Although the amygdala may also participate in classical thalamo-cortical-limbic circuits, direct thalamo-amygdala pathways are a sufficient substrate for fear reactions and simple fear conditioning, providing for a “quick and dirty transmission route” (LeDoux, 2000). The thalamo-amygdala subcortical circuit may support simple fear conditioning and fear reactions in the absence of awareness (“blindsight”) following visual cortical injuries (see De Gelder, Vroomen, Pourtois, & Weiskrantz, 1999; Pegna,
[Figure 32.5 diagram: boxes for Visual Cortex, Thalamus, Amygdala, and Brain Stem, leading to Defensive Responses; two routes are marked, the thalamo-cortico-amygdala pathway and the thalamo-amygdala pathway.]
Figure 32.5 Schematic representation of the classical thalamocortical visual pathway, where afferent information is conveyed to the cortex via the relay nucleus of the thalamus (lateral geniculate nucleus). Note: Also illustrated is an alternative thalamo-amygdala route that can bypass the cortex and mediate rapid fear and defensive responses to certain classes of aversive stimuli (see LeDoux, 2003).
Khateb, Lazeyras, & Seghier, 2005; Weiskrantz, 1986). In contrast, relational learning (e.g., contextual conditioning) and the processing of more complex threat-related cues may be more dependent on higher-level cortical processing (Berntson, Sarter, & Cacioppo, 1998; see also Chapter 39). Recent research supports this heterarchical organization, showing that auditory fear conditioning induces plasticity in amygdala neurons prior to apparent changes in cortical areas, suggesting that early plasticity in amygdala neurons results from direct thalamo-amygdala projections (Öhman & Mineka, 2001; Quirk, Armony, & LeDoux, 1997; Quirk, Repa, & LeDoux, 1995; see Figure 32.5). The more direct, efficient, but relatively limited thalamo-amygdala circuits and the more elaborate, integrative, and flexible thalamo-cortical-amygdala circuits represent distinct heterarchical levels of processing.
Fear versus Anxiety
Fear is a reaction to an explicit threatening stimulus, with escape or avoidance the outcome of increased cue proximity (see Chapter 49). Anxiety is a more general state of distress, typically longer lasting, prompted by less explicit or more generalized cues, and involving physiological arousal but often without organized functional behavior (Berntson et al., 1998; Lang, Davis, & Öhman, 2000). The amygdala appears to be especially critical for simple fear conditioning and fear potentiation of startle (LeDoux, 2003; Phelps & LeDoux, 2005; Walker, Toufexis, & Davis, 2003). Inactivation of the lateral nucleus of the
amygdala, for example, blocks the conditioned fear response in rats and attenuates fear-potentiated startle (LeDoux, 2003; Walker et al., 2003). Conversely, although the amygdala may play a role in anxiety-like responses, inactivation of the central nucleus of the amygdala does not attenuate anxiety-like behavior in mice (Walker & Davis, 1997). Rather, the bed nucleus of the stria terminalis and the medial prefrontal cortex may be more specifically involved in anxiety-like reactions to longer-lasting, more generalized threat cues. Lesions of the bed nucleus of the stria terminalis, for example, disrupt light-induced startle potentiation (which has been suggested to be a model for anxiety) but largely spare simple, conditioned fear-potentiated startle (Walker & Davis, 1997; Walker et al., 2003). Furthermore, lesions of the basal forebrain cortical cholinergic pathway or its termination in the medial prefrontal cortex disrupt anxiety-like responses but spare simple fear conditioning (Berntson et al., 1998; Hart, Sarter, & Berntson, 1999). Whereas cortical systems may not be necessary in explicit fear responses, they appear to be critical for the processing of more complex stimuli and for contextual fear conditioning (Knox & Berntson, 2006; LeDoux, 2000; Phillips & LeDoux, 1992; Stowell, Berntson, & Sarter, 2000). Mental imagery or anticipation of aversive or anxiogenic contexts induces activation in the bed nucleus of the stria terminalis as well as in cortical areas, including the medial prefrontal cortex and the anterior cingulate cortex (Kosslyn et al., 1996; Shin et al., 2004; Straube, Mentzel, & Miltner, 2007). Gray and McNaughton (2000) incorporated much of this information into a two-dimensional defense system model that makes clear anatomical, behavioral, and functional distinctions between fear and anxiety. The first dimension is a qualitative distinction between systems controlling defensive avoidance (fear) and defensive approach (anxiety). 
The two states often display opposite characteristics—fear produces speed toward or away from a stimulus whereas anxiety produces slowness, caution, and deliberation (see also Chapter 36). The second dimension is based on functional and organizational properties inherent to the neuroarchitectural substrates involved in the two qualitative distinctions. These distinctions are characterized in a hierarchical manner whereby substantial overlap between the two systems exists at caudal levels (periaqueductal gray and medial hypothalamus). As one moves rostrally, some differentiation may emerge (e.g., anterior cingulate for defensive avoidance and posterior cingulate for defensive approach), but the more significant perspective concerns the level of requisite processing (see Chapter 36). This is consistent with the present heterarchical model. The multiple heterarchical levels represent at least partially distinct processing substrates and may function in
partial independence from other levels (Berntson et al., 1998). This is an issue to which we return (see “Multilevel Organizations and Their Conflicts”). Generally, however, different levels are in constant reciprocal communication with one another and are capable of shifting from approach to avoidance defensive strategies at a moment’s notice and displaying coactivity of substrates (Gray & McNaughton, 2000). The multiplicity in heterarchical levels may preclude simple isomorphic mappings between affect in the psychological domain and neural substrates in the biological domain (Berntson, 2006). The complexity in brain–behavior mapping in affective processes is illustrated by recent findings on the role of the amygdala.
The Amygdala
The amygdala is one of the best-studied neural structures. It has been the subject of neuroscientific as well as psychological research for decades and is central to many theories of affect and evaluative processing. In general accord with animal studies, imaging studies in humans have reported amygdala activation during emotion, especially with negative emotions (Critchley et al., 2005; Irwin et al., 1996; Sabatinelli, Bradley, Fitzsimmons, & Lang, 2005; Zald & Pardo, 1997), and patients with amygdala damage show attenuated negative affect (Tranel, Gullickson, Koch, & Adolphs, 2006) and deficits in emotional memory (Buchanan, Tranel, & Adolphs, 2006; LaBar & Cabeza, 2006; Phelps, 2006; Phelps & LeDoux, 2005). Although the amygdala has been implicated in a range of processes extending from fear conditioning to emotional memory to aversive reactions, the precise role of this structure has not been fully clarified. This issue was pursued in a recent study of patients with amygdala damage (Berntson, Bechara, Damasio, Tranel, & Cacioppo, 2007).
Participants rated a set of images from the International Affective Picture System (Lang, Bradley, & Cuthbert, 1999) on perceived valence (extending from highly positive to highly negative picture content) and on affective intensity (i.e., how aroused the images made them feel). As illustrated in Figure 32.6, patients with damage to the amygdala were comparable on their ratings of valence of the picture content to persons in a norm group and to control patients with lesions that spared the amygdala. Patients were quite capable of recognizing and appropriately labeling positive and negative aspects of the stimuli. When compared to other groups, however, amygdala lesion patients significantly differed on their ratings of emotional arousal or intensity (see Figure 32.6). Control patients and the norm group showed the expected increases in arousal ratings as the images approached either positive or negative extremes. Amygdala patients also displayed an increase in arousal to the more positive images. They did
not, however, show a parallel arousal gradient to negative stimuli. Although the amygdala patients clearly recognized and labeled the negative images, they did not display the expected affective response. These findings are in agreement with a previously reported double dissociation between cognitive and affective processes in brain-damaged patients (Bechara et al., 1995). Consistent with the animal literature, a patient with amygdala damage failed to develop a typical conditioned autonomic response to a conditioned stimulus that was paired with a loud noise, despite the fact that this patient acquired declarative knowledge about the relation between the conditioned stimulus and the noise. This parallels the dissociation between the cognitive and arousal dimensions in the affective picture task of the Berntson et al. (2007) study. In contrast, a patient with damage to the hippocampus (sparing the amygdala) developed a conditioned autonomic response to the conditioned stimulus but could not cognitively describe the experimental contingencies (Bechara et al., 1995). These dissociations between cognitive knowledge and affective/autonomic responses reflect the multiple levels at which evaluative processing can occur. They also document the further differentiation between dimensions of evaluative processing, even within a given valence, at higher neural levels. In view of this elaboration and differentiation, it is highly unlikely the basic delineation between positive/approach and negative/withdrawal dispositions would devolve into a single affective continuum. Although there may be a perceived continuum between positive and negative affect, this perception may not accurately reflect the distinct neural substrates for these affective dimensions. In a recent fMRI study, Grabenhorst et al. 
(2007) reported that pleasant (jasmine) and unpleasant (indole) odors resulted in similar activations in primary olfactory areas (pyriform cortex), with these activations being correlated with odor intensity. The pleasant and unpleasant odors, however, differentially activated other distinct brain regions (e.g., medial orbitofrontal cortex and dorsal anterior cingulate cortex, respectively). Although a mixture of the two odors was rated as pleasant, it continued to show distinct activations in both the medial orbitofrontal cortex (where activations were correlated with pleasantness) and the anterior cingulate (where activations were correlated with unpleasantness). The authors concluded, “Mixtures that are found pleasant can have components that are separately pleasant and unpleasant, and the brain can separately and simultaneously represent the positive and negative hedonic value.” (p. 13532). Differential activation of positive and/or negative evaluative substrates may guide behavior even in the absence
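[Figure 32.6 graphic. Panel A: brain images labeled Amygdala Lesion and Contrast Lesion. Panel B: two line plots across the stimulus categories Very Pos., Mod. Pos., Neutral, Mod. Neg., and Very Neg. for the Amyg, Cnt, and Norm groups: “Arousal and Stimulus Categories” (y axis: Arousal Rating) and “Valence and Stimulus Categories” (y axis: Valence Rating, from negativity ratings to positivity ratings).]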
Figure 32.6 A: Lesions and arousal and valence ratings in the evaluative picture-rating study; B: Mean (SEM) arousal (I) and valence (II) ratings across stimulus categories for patients with amygdala lesions compared with the clinical contrast group and normative control data. Note: (A) (I) Illustrative bilateral lesion of the amygdala secondary to herpes simplex encephalitis. (II) Example of one of the smaller lesions in the lesion contrast group that spared the amygdala. All groups effectively discriminated the stimulus categories and applied valence ratings accordingly. (B) All groups also displayed comparable arousal functions to positive stimuli, but the amygdala group showed diminished arousal selectively to the negative stimuli. Neg = Negative; Pos = Positive. From “Amygdala Contribution to Selective Dimensions of Emotion,” by G. G. Berntson, A. Bechara, H. Damasio, D. Tranel, and J. T. Cacioppo, 2007, Social Cognitive and Affective Neuroscience, 2, pp. 123–129, pp. 3 & 5. Reprinted with permission.
of awareness. This is consistent with a report that a patient with damage to the primary gustatory cortex (and other areas) was unable to recognize or even distinguish sweet (positive) from saline (aversive) solutions and would drink either avidly. When given a choice between the two, however, this patient would consistently choose the sweet solution, although he could not explain why (Adolphs, Tranel, Koenigs, & Damasio, 2005). In this case, higher-level substrates for cognitive recognition and labeling were disrupted, but lower heterarchical systems were able to guide behavioral choice in the absence of cognitive awareness.
MULTILEVEL ORGANIZATIONS AND THEIR CONFLICTS
Evaluative processes evidence a cardinal feature of bivalence in their functional architecture and are represented at multiple levels of the neuraxis. Although these bivalent substrates may interact, they retain at least some degree of independence and separability. Substrates at differing levels of the neuraxis also interact in a heterarchical fashion, but they, too, entail at least partially distinct organizations with differential processing capacities and
differential access to sensory, perceptual, memorial, and cognitive information. The lowest heterarchical levels provide for rapid, albeit rather inflexible, information processing and adaptive reflexive reactions. Higher levels are capable of broader integration of information, expanded neural computations, and a richer and more flexible array of actions and outputs. An important question concerns the determinants of the level or levels of processing that are deployed in a given situation. In their elaboration likelihood model, Cacioppo and Petty (1984; Petty, Cacioppo, Strathman, & Priester, 2005) distinguish between what they term central and peripheral routes to persuasion and attitude dispositions. The central route is characterized by higher-level cognitive deliberation, whereas the peripheral route is less processing dependent and entails appeal to authority, reliance on preexisting biases or prejudices, and so on. In this model, important determinants of which route will predominate are the availability of information as well as the cognitive resources and motivation for deliberative consideration that are necessary to support the central, as opposed to the peripheral, processing route. Processing may also occur at multiple levels, with the output or action reflecting some aggregate manifestation or the predominance of one or another level. As discussed previously, higher neural systems can inhibit or override lower-level substrates, but more typically, complex interactions and recurrent processing may occur across levels. In their iterative processing model, Cunningham and Zelazo (2007) propose recurrent, reciprocal communications across processing levels. In this scheme, lower-level substrates may provide affectively laden information regarding the valence and the arousal dimensions of a particular stimulus or context to higher evaluative processing substrates, which in turn can then modulate lower-level processing systems.
In some cases, the bivalent organization and multiple levels in evaluative processing substrates may lead to conflicts. The coexistence of both positive and negative attributes to an object or outcome does not necessarily result in a neutral dispositional state, as might be implied by a bipolar evaluative model. Ambivalence is not the simple equivalent of indifference. Ambivalence may reflect the coactivation of both positive and negative evaluations. In his classic studies on conflict, Neal Miller (1959, 1961) used behavioral measures (e.g., running speed or the strength of pull on a tether to approach a reward or avoid a noxious stimulus) to assess motivational dispositions in rats. A typical gradient of an approach disposition to a food reward is illustrated in Figure 32.7 as a function of the proximity of the animal to the goal box. Similarly
[Figure 32.7 graphic: Approach and Avoidance gradients plotted as Magnitude (y axis) against Distance from Goal (x axis).]
Figure 32.7 Miller’s (1959, 1961) approach/avoidance conflict. Note: Approach (solid line) and avoidance (dashed line) gradients as a function of distance from the goal. Goal items include food (positive incentive) and shock (negative incentive). The avoidance gradient has a steeper slope and predominates as the goal box is approached (negativity bias), whereas at more remote loci, the approach gradient is higher than the avoidance gradient (positivity offset). The intersection of the gradients represents the maximal conflict point, where approach and avoidance dispositions are equivalent.
illustrated is the avoidance disposition away from a shock grid at the goal box, as measured independently. Miller generally observed that the slope of the avoidance gradient tended to be steeper than that of the approach gradient, so that at a distance from the goal, the approach disposition was greater than the avoidance disposition, and vice versa at proximate locations. The two motivational dispositions (approach and avoidance) were then invoked simultaneously, by the presence of both the food and the shock grid. This introduced what Miller termed an approach/avoidance conflict. The animal would approach the goal box if placed remotely in the apparatus, but as it approached the goal, the relative strength of the avoidance disposition increased (see Figure 32.7) and the approach disposition was overcome by avoidance. At that point, the animal was in what Miller referred to as a stable conflict. Any further approach would lead to an increment in avoidance, and any movement away would lead to a relative predominance of approach. Indeed, animals showed agitation and vacillation at an intermediate distance from the goal, and that point could be predicted by the relative magnitudes of the approach and avoidance gradients as measured independently. For Miller’s rats, the aggregate effect of a positive motivation and a concurrent negative disposition was not evaluative neutrality and sanguinity; it was ambivalence, agitation, and vacillation. Conceptually similar findings have emerged from studies on humans (Larsen, McGraw, Mellers, & Cacioppo, 2004). Good outcomes that could have been better (i.e., disappointing wins) and bad outcomes that could have been worse (i.e., relieving losses) are rated by participants toward the middle or neutral point of bipolar emotion scales (i.e., ratings along a positive to negative continuum). This
might suggest that such outcomes are associated with the absence of affect or indifference. If participants are presented with continuous unipolar measures of positive and negative affect, however, a very different picture emerges. When rating positive and negative separately, participants indicate the coactivation of both positive and negative affect. The participants are not indifferent; they are ambivalent (from the Latin, “both valences or vigors”). There is a conceptual parallel in this study with Miller’s (1959, 1961) rats. Although behavioral or measurement constraints may make it appear that positive and negative evaluative dispositions lie on a continuum, these appearances may belie the underlying bivalence of the neurobehavioral substrates. In the Miller study, physical constraints precluded the concurrent motor expressions of approach and avoidance. A rat cannot simultaneously approach and avoid the same place, although it may serially express the underlying bivalent affect states in its vacillation around the equilibrium point of two opposing evaluative dispositions. In the Larsen, McGraw, Mellers, and Cacioppo (2004) study, behavioral constraints were not imposed because the two (positive and negative) unipolar affect ratings were done sequentially. With a bipolar rating scale, however, a constraint is imposed by the measurement instrument, which is grounded in a spurious bipolar theory about the underlying evaluative structure. Because of the inherent complexity in higher levels of evaluative processes, as well as physical and measurement constraints, basic positive (generally associated with approach) and negative (generally associated with avoidance) evaluative systems may not always be readily discernible in behavior. Although affective states may at times appear to lie along a continuum from positive to negative, the fundamental underlying substrates evidence a bivalent organization, even at the highest levels of the neuraxis.
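The logic of the conflict paradigm can be made concrete with a small numerical sketch. The Python fragment below assumes linear gradients and uses hypothetical values for their strengths and slopes (the `Gradient` class and all numbers are illustrative, not Miller's measurements). It locates the intersection of the two gradients and shows that the net disposition at that point is zero even though both dispositions are strongly active, which is the distinction between ambivalence and indifference drawn above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gradient:
    """Hypothetical linear gradient: strength = at_goal - slope * distance."""
    at_goal: float  # strength at the goal box (distance 0), arbitrary units
    slope: float    # decline in strength per unit distance from the goal

    def strength(self, distance: float) -> float:
        # A disposition cannot fall below zero, so clamp the line at 0.
        return max(0.0, self.at_goal - self.slope * distance)


def conflict_point(approach: Gradient, avoidance: Gradient) -> float:
    """Distance at which the two linear gradients intersect."""
    return (avoidance.at_goal - approach.at_goal) / (avoidance.slope - approach.slope)


# Illustrative assumption: avoidance is stronger at the goal and falls off faster.
approach = Gradient(at_goal=6.0, slope=0.5)
avoidance = Gradient(at_goal=10.0, slope=1.5)

d_star = conflict_point(approach, avoidance)

for d in (2.0, d_star, 6.0):
    pos = approach.strength(d)
    neg = avoidance.strength(d)
    # "net" is what a bipolar scale would report; pos and neg are the unipolar values.
    print(f"distance={d:.1f} approach={pos:.1f} avoidance={neg:.1f} net={pos - neg:+.1f}")
```

With these assumed values, avoidance exceeds approach nearer the goal and approach exceeds avoidance farther away, so movement in either direction from the intersection pushes the animal back toward it. A bipolar index (the net value) reads zero at that point, whereas the two unipolar values are equal and well above zero.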
The cortical system represents the ultimate level of neuronal complexity and processing capacity. The mammalian neocortical system includes networks responsible for the most complex of sensory and perceptual processes, associative learning, memory, attentional focus, contextual awareness, strategizing, and outcome monitoring. In primates, the expanded cortex allows for even more elaborate processing. The additional computational power of primate neocortical structures allows for intricate social interactions that are dependent on the ability to anticipate future outcomes, run cognitive simulations, and manage social alliances. Although such complex neuropsychological phenomena would not be possible without the highest-level brain systems, these functions have more primitive representations at lower levels of the neuraxis and, in many cases, are derivative of the lower substrates.
EVALUATIVE SPACE AND THE NEUROARCHITECTURE OF EVALUATIVE PROCESSES
Wundt (1896) and Thurstone (1931) were early champions of the bipolar model of affect, in which the momentary affective states could be characterized as lying along a bipolar continuum extending from positive to negative. This view has also been incorporated into contemporary models of emotion, including the circumplex model of Russell and others (Russell, 1980, 1983; see also Posner et al., 2005). Cacioppo and Berntson have proposed an alternative, bivariate model of affect whereby the positive and negative dimensions are at least partially independent in both their conceptual and neurological bases (Cacioppo & Berntson, 1994; Cacioppo, Gardner, & Berntson, 1997; Cacioppo et al., 2004). As illustrated in Figure 32.8, this evaluative space model subsumes the bipolar model as the reciprocity diagonal and also offers a more comprehensive representation of affective states. Whereas bipolar models are unable to represent states of ambivalence, the evaluative space model readily accounts for such states as a manifestation of coactivation of both positive and negative affect. It is also in accord with the finding that positive and negative emotions are not always correlated (Larsen et al., 2004). The evaluative space model illustrates how neuroevolutionary and neurobehavioral frameworks can guide and constrain theories and models of higher neuropsychological functioning. Moreover, behavioral findings and features may inform neurobehavioral theories as well, in a reciprocal fashion. The fact that neuronal substrates of approach and avoidance are at least partially independent allows for evolutionary pressures to sculpt these circuits independently. Because the driving force in evolution is the ability to pass on genetic information, avoiding noxious or potentially lethal stimuli may assume greater adaptive importance, especially at close proximities, than approaching positive or rewarding stimuli.
The latter can always be pursued subsequently if the organism lives to see another day. This may be the evolutionary basis for the negativity bias in evaluative processing, as is apparent in lower reflex substrates discussed previously. It is also apparent from the steeper slope of the avoidance gradient in Neal Miller’s (1959, 1961) behavioral studies of conflict. Additional research utilizing event-related potentials has demonstrated a similar negativity bias in early stages of evaluative processing in humans (Cacioppo et al., 2004). Miller also observed what has been termed a positivity offset in his conflict paradigm. This refers to the fact that the approach gradient often surpasses the avoidance gradient as the distance to the goal increases beyond the equilibrium point (see Figure 32.7). Both the positivity offset and
negativity bias are apparent in numerous behavioral contexts (see Ito & Cacioppo, 2005, for a review and empirical studies). As depicted in Figure 32.8, the negativity bias and the positivity offset are reflected in the differential slopes of the positivity and negativity functions in the evaluative space model. The overlying surface of Figure 32.8 represents the net action dispositions on both the positivity and negativity continua. Movement along the positivity axis represents the positive or approach gradient, movement along the negativity axis represents the negative or avoidance gradient, and the surface in between these extremes represents varying degrees of ambivalence. The evaluative space model in Figure 32.8 is useful in describing and explaining overall action dispositions. It should be noted, however, that this characterization could be applied to distinct levels within the evaluative heterarchy, with the overall action disposition representing a composite or aggregate of the multiple processing levels. This aggregate function, therefore, may well be dynamic, as differing levels of processing may come into play depending on the situation or context.
[Figure 32.8 graphic. Panel A: bivariate evaluative plane with Positivity and Negativity axes (each running from low to high), the reciprocity and coactivity diagonals, and arrows marking uncoupled positive and uncoupled negative changes. Panel B: surface of evaluative disposition (activation amplitude, from − through 0 to +) over the plane, with an inset showing the positive and negative gradients.]
Figure 32.8 Bivariate evaluative space. A: The bivariate evaluative plane. B: A three-dimensional depiction of evaluative space, where the surface overlying the bivariate plane represents the net approach/avoidance disposition for any location on that plane. Note: (A) The y axis represents the level of activation of positive evaluative processes (Positivity), and the x axis represents the level of activation of the negative evaluative process (Negativity). The reciprocity diagonal represents the classical bipolar model of valence that extends from high positivity (upper left) to high negativity (lower right) along a single evaluative continuum. The coactivity diagonal represents an alternative mode where both evaluative dimensions are coactivated (conflict, ambivalence). The arrows outside of the box represent uncoupled changes in positive or negative evaluative processing. This evaluative plane provides a more comprehensive model of evaluative processes that subsumes the bipolar model as the reciprocity diagonal. (B) The inset on this figure illustrates activation functions along the positivity and negativity axes. Differences in the slopes and intercepts of these functions depict the positivity offset (higher intercept) and negativity bias (higher slope). From “Relationship between Attitudes and Evaluative Space: A Critical Review, with Emphasis on the Separability of Positive and Negative Substrates,” by J. T. Cacioppo and G. G. Berntson, 1994, Psychological Bulletin, 115, p. 412. Adapted with permission.
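The two asymmetries described for Figure 32.8 can be illustrated with a short Python sketch. It assumes linear activation functions, a net disposition equal to positivity minus negativity, and the smaller of the two activations as a rough index of coactivation. These functional forms and every parameter value are simplifying assumptions made here for illustration, not the published specification of the evaluative space model.

```python
# Hypothetical activation functions for the two evaluative systems.
# Intercepts and slopes are illustrative values, not estimates from data.
POS_INTERCEPT, POS_SLOPE = 1.0, 1.0   # positivity offset: higher intercept
NEG_INTERCEPT, NEG_SLOPE = 0.0, 1.5   # negativity bias: steeper slope


def positivity(stimulus: float) -> float:
    """Activation of the positive system for a given input intensity."""
    return POS_INTERCEPT + POS_SLOPE * stimulus


def negativity(stimulus: float) -> float:
    """Activation of the negative system for a given input intensity."""
    return NEG_INTERCEPT + NEG_SLOPE * stimulus


def net_disposition(pos: float, neg: float) -> float:
    """Assumed surface height: approach (+) versus avoidance (-)."""
    return pos - neg


def coactivation(pos: float, neg: float) -> float:
    """Assumed ambivalence index: the activation shared by both systems."""
    return min(pos, neg)


# Equal-intensity positive and negative input, weak versus strong.
weak_pos, weak_neg = positivity(0.5), negativity(0.5)
strong_pos, strong_neg = positivity(4.0), negativity(4.0)
print("weak input, net disposition:  ", net_disposition(weak_pos, weak_neg))
print("strong input, net disposition:", net_disposition(strong_pos, strong_neg))

# Two points on the plane with the same net disposition but different coactivation.
print("indifference:", net_disposition(0.0, 0.0), coactivation(0.0, 0.0))
print("ambivalence: ", net_disposition(3.0, 3.0), coactivation(3.0, 3.0))
```

Under these assumptions, weak matched inputs yield a net approach disposition (the positivity offset) and strong matched inputs yield a net avoidance disposition (the negativity bias). The last two lines show why a single bipolar score cannot separate indifference from ambivalence: both points have the same net value and differ only in coactivation.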
MULTILEVEL INTERACTIONS: EXAMPLES FROM THE AUTONOMIC NERVOUS SYSTEM
The evaluative space model in Figure 32.8 may provide a broader framework for conceptualizing other neurobiological processes that have a fundamental bivariate structure. One example is the autonomic nervous system (ANS) and its neurobehavioral control (Berntson et al., 1998). Mirroring the bipolar conceptualizations of evaluative processes, historical depictions of the sympathetic and parasympathetic branches of the ANS have been of a reciprocally regulated system, with increases in activity of one branch associated with decreases in the other (Berntson, Cacioppo, & Quigley, 1991). This bipolar conceptualization arose largely out of research on basic autonomic reflexes that, like the flexor–extensor circuits, are rather rigid and lack the range and flexibility of control characteristic of more rostral systems. The efferent arm of the baroreceptor heart rate reflex, for example, entails a notable reciprocal regulation of the sympathetic and parasympathetic branches of the ANS. Baroreceptor afferents
[Figure 32.9 graphic: scatterplot of Parasympathetic (HFz) against Sympathetic (−PEPz), both axes running from −3 to 3, with quadrants labeled Reciprocal Parasympathetic, Co-activation, Co-inhibition, and Reciprocal Sympathetic, and with the reciprocity (CAB) and co-activity (CAR) diagonals marked.]
Figure 32.9 Distribution of normalized parasympathetic cardiac control (as indexed by HF [HFz]) and sympathetic cardiac control (as indexed by PEP [-PEPz]) scores across the CHASRS population, and their relation to the derived CAR and CAB metrics. Note: The overall distribution deviates considerably from the reciprocal diagonal representing a bipolar model. Individuals in the reciprocal parasympathetic quadrant would have relatively high CAB scores, whereas those in the reciprocal sympathetic quadrant would have relatively low CAB scores. An additional dimension is reflected along the coactivity diagonal. Individuals in the coactivation quadrant would have relatively high CAR scores, whereas those in the co-inhibition quadrant would have relatively low CAR scores. CAB = Cardiac autonomic balance; CAR = Cardiac autonomic regulatory capacity; CHASRS = Chicago Health, Aging, and Social Relations Study; HF = High-frequency heart rate variability; PEP = Preejection period. From “Cardiac Autonomic Balance versus Cardiac Regulatory Capacity,” by G. G. Berntson, G. J. Norman, L. C. Hawkley, and J. T. Cacioppo, 2008, Psychophysiology, 45, p. 646. Reprinted with permission.
increase their rate of firing in response to mechanical distortion associated with an increase in blood pressure, and this afferent signal is conveyed to the nucleus tractus solitarius in the medulla, which is the primary visceral receiving area of the brain. The nucleus tractus solitarius subsequently issues direct and indirect excitatory projections to vagal motor neurons in the nucleus ambiguous and dorsal motor nucleus, leading to a reflexive increase in parasympathetic outflow. This yields a decrease in heart rate and a reduction in cardiac output, which tends to normalize or oppose the pressor perturbation. Projections from the nucleus tractus solitarius also indirectly suppress sympathetic outflow via inhibition of the sympathoexcitatory neurons in the rostral ventrolateral medulla. This sympathetic withdrawal acts to further slow the beat of the heart as well as to decrease myocardial contractility. Thus, the reciprocal actions of the individual branches of the ANS synergistically contribute to the homeostatic regulation of blood pressure.
The baroreceptor heart rate reflex represents a prototypic, reciprocally regulated system, having a bipolar action disposition extending from sympathetic to parasympathetic dominance. Although descriptive of some basic autonomic reflexive circuits, this characterization belies the true complexity of autonomic control of cardiovascular function. Higher-level brain structures are capable of modulating ANS activity via direct projections from forebrain structures such as the cingulate cortex (Critchley et al., 2005), amygdala (LeDoux, Iwata, Cicchetti, & Reis, 1988), and insular cortex (Oppenheimer, 1993) to autonomic brain stem nuclei (see also Berntson et al., 1998). Stimulation and lesion studies of rostral structures have shown that higher systems can facilitate, inhibit, or even bypass basic brain stem autonomic reflexes and thereby modulate autonomic outflow directly (Sévoz-Couche, Comet, Hamon, & Laguzzi, 2003). It is not likely a coincidence that many of these same brain structures may be the substrates for higher-level functions, including evaluative processes. These descending pathways are the conduit by which psychological stressors can yield anti-homeostatic effects on the ANS, including concurrent increases in blood pressure and heart rate (in opposition to baroreflex control). Direct stimulation of the hypothalamus, for example, can invoke each of the basic modes (see Figure 32.9) of reciprocal, coactive, or independent changes in the activity of the autonomic branches (Koizumi & Kollai, 1981; Shih, Chan, & Chan, 1995). The ability of higher-level systems to flexibly modulate activities in the autonomic branches has required an expansion of simple, reciprocally regulated homeostatic models. Given the research on evaluative processes, it has now also become clear that simple bipolar conceptualizations of the ANS are inadequate. 
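The modes of control named above (reciprocal, coactive, and independent) can be made concrete with a small classifier. The Python sketch below is illustrative only: the function name `autonomic_mode`, the tolerance parameter `tol`, and the use of arbitrary signed units are assumptions introduced here, not part of the cited studies. It labels a response by the signs of the changes in sympathetic and parasympathetic activity, borrowing the quadrant names of Figure 32.9.

```python
def _sign(x: float, tol: float) -> int:
    """Return -1, 0, or +1; changes within +/- tol count as no change."""
    if abs(x) <= tol:
        return 0
    return 1 if x > 0 else -1


def autonomic_mode(d_symp: float, d_para: float, tol: float = 0.0) -> str:
    """Label a response by the direction of change in each autonomic branch.

    d_symp, d_para: signed changes in sympathetic and parasympathetic
    activity (hypothetical arbitrary units).
    """
    s = _sign(d_symp, tol)
    p = _sign(d_para, tol)
    if s == 0 and p == 0:
        return "no change"
    if s == 0 or p == 0:
        # Only one branch changes: independent (uncoupled) control.
        return "independent"
    if s == p:
        # Both branches move in the same direction.
        return "coactivation" if s > 0 else "co-inhibition"
    # Branches move in opposite directions: reciprocal control.
    return "reciprocal sympathetic" if s > 0 else "reciprocal parasympathetic"
```

Under these assumptions, the baroreflex response to a rise in blood pressure (parasympathetic increase with sympathetic withdrawal) would be labeled `"reciprocal parasympathetic"`, for example by calling `autonomic_mode(-1.0, 1.0)`.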
Contemporary systems models recognize the basic bivariate organization of the ANS and include concepts such as heterostasis, allostasis, and allodynamic regulation that recognize the greater breadth and flexibility of autonomic control associated with rostral regulatory substrates (Berntson, Norman, Hawkley, & Cacioppo, 2008; McEwen, 2004). Theories about both evaluative processes and autonomic control have significance for the kinds of data scientists collect and for scientists’ understanding of the basic neurobiology of these processes. Bipolar theories of affect, for example, lead to the development of bipolar scales of valence that obscure the underlying bivariate nature of evaluative processes. Similarly, the reciprocal model of autonomic control favors particular conceptions of psychosomatic relations, which may affect research on, and understanding of, disease processes. Concepts of a reciprocally regulated, homeostatic system have limited our understanding of the ANS and have led to models of autonomic contributions to disease states as
reflecting a homeostatic failure. Although there are homeostatic features to some aspects of autonomic control, the ANS is not a universally homeostatic system. The concurrent increase in blood pressure and heart rate during stress is not a homeostatic response; it is explicitly anti-homeostatic. It may nevertheless be highly adaptive, at least in the short term, in preparing for action. The multiplicity of the modes of autonomic control (see Figure 32.9) may have important health implications. Berntson et al. (1994) found substantial individual differences in patterns of stress reactivity, and such differences may play an important role in the susceptibility to disease (see Cacioppo et al., 1998). The understanding and measurement of these patterns, however, depend heavily on models of autonomic control. An index of autonomic balance could be derived from a bipolar conception of autonomic control as a scale extending from maximal sympathetic activation at one end to maximal parasympathetic activation at the other (i.e., along the reciprocal diagonal of Figure 32.9). A measure of cardiac autonomic balance (CAB) was so derived from normalized measures of high-frequency heart rate variability (which provides a relatively pure index of parasympathetic cardiac control) and preejection period (which provides a relatively pure index of sympathetic cardiac control). Although this index was not correlated with most aspects of health and disease in a population-based sample (i.e., the Chicago Health and Social Relations Study), it predicted diabetes mellitus independently of demographics and health behaviors (Berntson et al., 2008). Other conceptualizations of psychosomatic relations have emphasized not so much the state of sympathetic/parasympathetic balance but rather the overall capacity for autonomic control as indexed by autonomic flexibility and variability.
This concept, together with the demonstration of the basic bivariate structure of autonomic control, suggests an alternative metric to CAB. An index of cardiac autonomic regulatory capacity (CAR) was derived as the sum of activities of the autonomic branches, again based on normalized high-frequency heart rate variability and preejection period measures. In contrast to the reciprocal diagonal represented by CAB, CAR as a metric captures the coactivity diagonal of Figure 32.9. Analysis of the Chicago Health and Social Relations Study sample revealed that CAR was a better predictor of overall health status and was a significant predictor of the prior occurrence of myocardial infarction, whereas the reciprocity metric (CAB) was not (Figure 32.10). These results suggest that distinct modes of autonomic control may be associated with distinct health dimensions. A bipolar conception of autonomic control, however, admits theory and measurement only of CAB and would obscure the relationships between CAR and health. In contrast, the broader and more comprehensive bivariate model of autonomic control subsumes CAB as one diagonal (reciprocal diagonal of Figure 32.9) and captures CAR as the coactivity diagonal. Theories shape understanding, and specious or oversimplified theories may obscure lawful relationships.

[Figure 32.10: plot of Reciprocity (CAB) on the vertical axis against Co-activity (CAR) on the horizontal axis, both from −2 to 2, for three participant groups: Myocardial Infarction, Diabetes, Normal.]

Figure 32.10 CAR and CAB in disease states. Note: Data points illustrate means and standard errors of CAR and CAB as a function of participant group, relative to the population. Compared to other participants, those with a prior myocardial infarction (MI) had lower CAR scores, indicating lower overall cardiac regulatory capacity, but were not highly deviant on CAB. In contrast, those with diabetes showed a lower CAB score, reflective of a predominant sympathetic balance, but were not highly deviant on CAR. CAB = Cardiac autonomic balance; CAR = Cardiac autonomic regulatory capacity. From “Cardiac Autonomic Balance versus Cardiac Regulatory Capacity,” by G. G. Berntson, G. J. Norman, L. C. Hawkley, and J. T. Cacioppo, 2008, Psychophysiology, 45, p. 649. Reprinted with permission.
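The two metrics discussed above can be sketched numerically. Taking the axes of Figure 32.9 at face value, with HFz as the parasympathetic index and −PEPz as the sympathetic index, CAR is their sum (as stated in the text) and CAB is taken here as their difference, which places high scores in the reciprocal parasympathetic quadrant. The Python sketch below is a minimal illustration under stated assumptions: the function names, the use of the population standard deviation for normalization, and the sample values are all hypothetical and are not drawn from the study data.

```python
from statistics import mean, pstdev


def zscores(values):
    """Normalize values to z-scores (population SD; an assumption here)."""
    m = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("z-scores are undefined when all values are equal")
    return [(v - m) / sd for v in values]


def cab_car(hf, pep):
    """Return (CAB, CAR) lists from raw HF and PEP measures.

    hf:  high-frequency heart rate variability (parasympathetic index)
    pep: preejection period; a shorter PEP indicates greater sympathetic
         control, so the sympathetic index is the negated z-score.
    """
    para = zscores(hf)                      # HFz
    symp = [-z for z in zscores(pep)]       # -PEPz
    cab = [p - s for p, s in zip(para, symp)]  # reciprocal diagonal
    car = [p + s for p, s in zip(para, symp)]  # coactivity diagonal
    return cab, car


# Hypothetical sample: three people whose HF rises as PEP shortens,
# i.e., both branches vary together along the coactivity diagonal.
cab, car = cab_car([4.0, 6.0, 8.0], [130.0, 120.0, 110.0])
print("CAB:", cab)
print("CAR:", car)
```

In this made-up sample the two indices move together across people, so the variation appears in CAR while CAB stays near zero; a bipolar scale built only on CAB would not register it.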
SUMMARY

With recent theoretical and technological advances, scientifically relevant conceptualizations of affective processes and their neural substrates are now possible. Strong evidence from fields such as genetics, evolutionary biology, neurobiology, and psychology provides points of convergence where interdisciplinary perspectives complement one another. The bivariate multilevel model of evaluation allows for the inclusion of new theoretical constructs and empirical evidence that can resolve competing hypotheses, generate new and testable hypotheses, and increase theoretical breadth and depth, leading to better conceptualizations of affective phenomena. Theories that assume strictly bipolar (valence) mechanisms underlying affective responses have difficulty accounting for evidence from the neurosciences that shows distinct neural substrates are coactivated in the presence of appetitive and aversive stimuli. Nor do these theories incorporate the influence of evaluative mechanisms organized at lower levels of the neuraxis. The evaluative space model provides a more comprehensive conception of evaluative processes and subsumes, rather than discards, simpler models based on bipolar conceptualizations of affect.
REFERENCES

Adams, D. B. (1979). Brain mechanisms for offense, defense, and submission. Behavioral and Brain Sciences, 2, 201–241.
Adolphs, R., Tranel, D., Koenigs, M., & Damasio, A. R. (2005). Preferring one taste over another without recognizing either. Nature Neuroscience, 7, 860–861.
Berridge, K. C. (2004). Motivation concepts in behavioral neuroscience. Physiology and Behavior, 81, 179–209.
Berridge, K. C., & Grill, H. J. (1984). Isohedonic tastes support a two-dimensional hypothesis of palatability. Appetite, 5, 221–231.
Blair, K., Marsh, A. A., Morton, J., Vythilingam, M., Jones, M., Mondillo, K., et al. (2006). Choosing the lesser of two evils, the better of two goods: Specifying the roles of ventromedial prefrontal cortex and dorsal anterior cingulate in object choice. Journal of Neuroscience, 26, 11379–11386. Buchanan, T. W., Tranel, D., & Adolphs, R. (2006). Memories for emotional autobiographical events following unilateral damage to medial temporal lobe. Brain, 129, 115–127.
Allport, G. W. (1935). Attitudes. In C. Murchison (Ed.), Handbook of social psychology (Vol. 2, pp. 798–884). Worcester, MA: Clark University Press.
Cacioppo, J. T., & Berntson, G. G. (1994). Relationship between attitudes and evaluative space: A critical review, with emphasis on the separability of positive and negative substrates. Psychological Bulletin, 115, 401–423.
Aronson, E., & Carlsmith, J. M. (1963). Effect of severity of threat on the devaluation of forbidden behavior. Journal of Abnormal and Social Psychology, 66, 584–588.
Cacioppo, J. T., & Berntson, G. G. (1999). The affect system: Architecture and operating characteristics. Current Directions in Psychological Science, 8, 133–137.
Bard, P., & Macht, M. B. (1958). The behavior of chronically decerebrate cats. In G. E. W. Wolsten-Holme & C. M. O’Connor (Eds.), Neurological basis of behavior (pp. 55–75). London: Churchill.
Cacioppo, J. T., Berntson, G. G., Malarkey, W. B., Kiecolt-Glaser, J. K., Sheridan, J. F., Poehlmann, K. M., et al. (1998). Autonomic, neuroendocrine, and immune responses to psychological stress: The reactivity hypothesis. Annals of the New York Academy of Sciences, 840, 664–673.
Bechara, A., Tranel, D., Damasio, H., Adolphs, R., Rockland, C., & Damasio, A. R. (1995, August 25). Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science, 269, 1115–1118. Berntson, G. G. (2006). Reasoning about brains. In J. T. Cacioppo, P. S. Visser, & C. L. Pickett (Eds.), Social neuroscience: People thinking about thinking people (pp. 1–11). Cambridge, MA: MIT Press. Berntson, G. G., Bechara, A., Damasio, H., Tranel, D., & Cacioppo, J. T. (2007). Amygdala contribution to selective dimensions of emotion. Social Cognitive and Affective Neuroscience, 2, 123–129. Berntson, G. G., Boysen, S. T., & Cacioppo, J. T. (1993). Neurobehavioral organization and the cardinal principle of evaluative bivalence. Annals of the New York Academy of Science, 702, 75–102. Berntson, G. G., & Cacioppo, J. T. (2000). From homeostasis to allodynamic regulation. In J. T. Cacioppo, L. G. Tassinary, & G. G. Berntson (Eds.), Handbook of psychophysiology (pp. 459–481). Cambridge, England: Cambridge University Press. Berntson, G. G., & Cacioppo, J. T. (2008). The neuroevolution of motivation. In J. Shah & W. Gardner (Eds.), Handbook of motivation science (pp. 188–200). New York: Guilford Press. Berntson, G. G., Cacioppo, J. T., Binkley, P. F., Uchino, B. N., Quigley, K. S., & Fieldstone, A. (1994). Autonomic cardiac control: III. Psychological stress and cardiac response in autonomic space as revealed by pharmacological blockades. Psychophysiology, 31, 599–608. Berntson, G. G., Cacioppo, J. T., & Quigley, K. S. (1991). Autonomic determinism: The modes of autonomic control, the doctrine of autonomic space, and the laws of autonomic constraint. Psychological Reviews, 98, 459–487. Berntson, G. G., & Micco, D. J. (1976). Organization of brainstem behavioral systems. Brain Research Bulletin, 1, 471–483. Berntson, G. G., Norman, G. J., Hawkley, L. C., & Cacioppo, J. T. (2008). 
Cardiac autonomic balance versus cardiac regulatory capacity. Psychophysiology, 45, 643–652. Berntson, G. G., Sarter, M., & Cacioppo, J. T. (1998). Anxiety and cardiovascular reactivity: The basal forebrain cholinergic link. Behavioural Brain Research, 94, 225–248. Berntson, G. G., Tuber, D. S., Ronca, A. E., & Bachman, D. S. (1983). The decerebrate human: Associative learning. Experimental Neurology, 81, 77–88. Berridge, K. C. (1996). Food reward: Brain substrates of wanting and liking. Neuroscience Biobehavioral Review, 20, 1–25.
Cacioppo, J. T., & Gardner, W. L. (1999). Emotion. Annual Review of Psychology, 50, 191–214. Cacioppo, J. T., Gardner, W. L., & Berntson, G. G. (1997). The affect system has parallel and integrative processing components: Form follows function. Journal of Personality and Social Psychology, 76, 839–855. Cacioppo, J. T., Larsen, J. T., Smith, N. K., & Berntson, G. G. (2004). The affect system: What lurks below the surface of feelings. In A. S. R. Manstead, N. Frijda, & A. Fischer (Eds.), Feelings and emotions (pp. 221–240). Cambridge, England: Cambridge University Press. Cacioppo, J. T., & Petty, R. E. (1984). The elaboration likelihood model of persuasion. Advances in Consumer Research, 11, 673–675. Canli, T., Desmond, J. E., Zhao, Z., Glover, G., & Gabrieli, J. D. (1998). Hemispheric asymmetry for emotional stimuli detected with fMRI. NeuroReport, 9, 3233–3239. Craig, A. D. (2003). Pain mechanisms: Labeled lines versus convergence in central processing. Annual Review of Neuroscience, 26, 1–30. Craig, A. D. (2005). Forebrain emotional asymmetry: A neuroanatomical basis? Trends in Cognitive Sciences, 9, 566–571. Critchley, H. D., Taggart, P., Sutton, P. M., Holdright, D. R., Batchvarov, V., Hnatkova, K., et al. (2005). Activity in the human brain predicting differential heart rate responses to emotional facial expressions. NeuroImage, 24, 751–762. Cunningham, W. A., & Zelazo, P. D. (2007). Attitudes and evaluations: A social cognitive neuroscience perspective. Trends in Cognitive Sciences, 11, 97–104. Davidson, R. J. (1990). Approach-withdrawal and cerebral asymmetry: Emotional expression and brain physiology. Journal of Personality and Social Psychology, 58, 330–341. Davidson, R. J. (1998). Anterior electrophysiological asymmetries, emotion, and depression: Conceptual and methodological conundrums. Psychophysiology, 35, 607–614. Davidson, R. J. (2004). What does the prefrontal cortex “do” in affect: Perspectives on frontal EEG asymmetry research. 
Biological Psychology, 67, 219–233. Davidson, R. J., Shackman, A. J., & Maxwell, J. S. (2004). Asymmetries in face and brain related to emotion. Trends in Cognitive Science, 8, 389–391. De Gelder, B., Vroomen, J., Pourtois, G., & Weiskrantz, L. (1999). Non-conscious recognition of affect in the absence of striate cortex. NeuroReport, 10, 3759–3763.
Egan, L. C., Santos, L. R., & Bloom, P. (2007). The origins of cognitive dissonance: Evidence from children and monkeys. Psychological Science, 18, 978–983.
Larsen, J. T., McGraw, A. P., & Cacioppo, J. T. (2001). Can people feel happy and sad at the same time? Journal of Personality and Social Psychology, 81, 684–696.
Grabenhorst, F., Rolls, E. T., Margot, C., da Silva, M. A., & Velazco, M. I. (2007). How pleasant and unpleasant stimuli combine in different brain regions: Odor mixtures. Journal of Neuroscience, 27, 13532–13540.
Larsen, J. T., McGraw, A. P., Mellers, B. A., & Cacioppo, J. T. (2004). The agony of victory and thrill of defeat: Mixed emotional reactions to disappointing wins and relieving losses. Psychological Science, 15, 325–330.
Gray, J. A., & McNaughton, N. (2000). The neuropsychology of anxiety: An enquiry into the functions of the septo-hippocampal system. Oxford, England: Oxford University Press. Harmon-Jones, E., Vaughn, K., Mohr, S., Sigelman, J., & Harmon-Jones, C. (2004). The effect of manipulated sympathy and anger on left and right frontal cortical activity. Emotion, 4, 95–101. Harris, R. B., Kelso, E. W., Flatt, W. P., Bartness, T. J., & Grill, H. J. (2006). Energy expenditure and body composition of chronically maintained decerebrate rats in the fed and fasted condition. Endocrinology, 147, 1365–1376. Hart, S., Sarter, M., & Berntson, G. G. (1999). Cholinergic inputs to the rat medial prefrontal cortex mediate potentiation of the cardiovascular defensive response by the anxiogenic benzodiazepine receptor partial inverse agonist FG 7142. Neuroscience, 94, 1029–1038. Heath, R. G. (1972). Pleasure and brain activity in man: Deep and surface electroencephalograms during orgasm. Journal of Nervous and Mental Diseases, 154, 3–18. Hess, W. R. (1954). Diencephalon: Autonomic and extrapyramidal functions. Monographs in biology and medicine: Vol. III. New York: Grune & Stratton. Hoebel, B. G., Rada, P. V., Mark, G. P., & Pothos, E. N. (1999). Neural systems for reinforcement and inhibition of behavior: Relevance to eating, addiction, and depression. In D. Kahneman, E. Diener, & N. Schwarz (Eds.), Well-being: The foundations of hedonic psychology (pp. 558–572). New York: Russell Sage Foundation. Irwin, W., Davidson, R. J., Lowe, M. J., Mock, B. J., Sorenson, J. A., & Turski, P. A. (1996). Human amygdala activation detected with echoplanar functional magnetic resonance imaging. NeuroReport, 7, 1765–1769. Ito, T. A., & Cacioppo, J. T. (2005). Variations on a human universal: Individual differences in positivity offset and negativity bias. Cognition and Emotion, 19, 1–26. Jackson, J. H. (1958). Evolution and dissolution of the nervous system (Croonian Lectures). In J. 
Taylor (Ed.), Selected writings of John Hughlings Jackson (pp. 45–63). New York: Basic Books. (Original work published 1884.) Knox, D., & Berntson, G. G. (2006). Effect of nucleus basalis magnocellularis cholinergic lesions on fear-like and anxiety-like behavior. Behavioral Neuroscience, 120, 307–312. Koizumi, K., & Kollai, M. (1981). Control of reciprocal and non-reciprocal action of vagal and sympathetic efferents: Study of centrally induced reactions. Journal of the Autonomic Nervous System, 3, 483–501. Koob, G. F. (1992). Drugs of abuse: Anatomy, pharmacology, and function of reward pathways. Trends in Pharmacological Sciences, 13, 177–184. Kosslyn, S. M., Shin, L. M., Thompson, W. L., McNally, R. J., Rauch, S. L., Pitman, R. K., et al. (1996). Neural effects of visualizing and perceiving aversive stimuli: A PET investigation. NeuroReport, 7, 1569–1576. LaBar, K. S., & Cabeza, R. (2006). Cognitive neuroscience of emotional memory. Nature Reviews Neuroscience, 7, 54–64. Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (1999). International affective picture system (IAPS): Instruction manual and affective ratings. [Technical report A-4.] Gainsville, FL: University of Florida, Center for Research in Psychophysiology. Lang, P. J., Davis, M., & Ohman, A. (2000). Fear and anxiety: Animal models and human cognitive psychophysiology. Journal of Affective Disorders, 61, 137–159.
Larson, C. L., Schaefer, H. S., Siegle, G. J., Jackson, C. A., Anderle, M. J., & Davidson, R. J. (2006). Fear is fast in phobic individuals: Amygdala activation in response to fear-relevant stimuli. Biological Psychiatry, 60, 410–417. LeDoux, J. E. (1996). The emotional brain: The mysterious underpinnings of emotional life. New York: Simon & Schuster. LeDoux, J. (2000). Emotion circuits in the brain. Annual Review of Neuroscience, 23, 155–184. LeDoux, J. (2003). The emotional brain, fear, and the amygdala. Cellular and Molecular Neurobiology, 23, 727–738. LeDoux, J. E., Iwata, J., Cicchetti, P., & Reis, D. J. (1988). Different projections of the central amygdaloid nucleus mediate autonomic and behavioral correlates of conditioned fear. Journal of Neuroscience, 8, 2517–2529. Lee, G. P., Meador, K. J., Loring, D. W., Allison, J. D., Brown, W. S., Paul, L. K., et al. (2004). Neural substrates of emotion as revealed by functional magnetic resonance imaging. Cognitive and Behavioral Neuroscience, 17, 9–17. Lundberg, A. (1979). Multisensory control of spinal reflex pathways. Progress in Brain Research, 50, 11–28. MacLean, P. D. (1985). Evolutionary psychiatry and the triune brain. Psychological Medicine, 15, 219–221. Mauk, M., & Thompson, R. F. (1987). Retention of classically conditioned eyelid responses following acute decerebration. Brain Research, 403, 89–95. McEwen, B. (2004). Protection and damage from acute and chronic stress: Allostasis and allostatic overload and relevance to the pathophysiology of psychiatric disorders. Annals of the New York Academy of Sciences, 1032, 1–7. Miller, N. E. (1959). Liberalization of basic S-R concepts: Extensions to conflict behavior, motivation and social learning. In S. Koch (Ed.), Psychology: A study of a science (pp. 196–292). New York: McGraw-Hill. Miller, N. E. (1961). Some recent studies on conflict behavior and drugs. American Psychologist, 16, 12–24. Nitschke, J. B., Sarinopoulos, I., Mackiewicz, K. L., Schaefer, H. 
S., & Davidson, R. J. (2006). Functional neuroanatomy of aversion and its anticipation. NeuroImage, 29, 106–116. Norman, R. J., Buchwald, J. S., & Villablanca, V. J. (1977, April 29). Classical conditioning with auditory discrimination of the eye blink in decerebrate cats. Science, 196, 551–553. Öhman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: Toward an evolved module of fear and fear learning. Psychological Review, 108, 438–522. Okun, M. S., Bowers, D., Springer, U., Shapira, N. A., Malone, D., & Rezai, A. R. (2004). What’s in a ‘smile’? Intra-operative observations of contralateral smiles induced by deep brain stimulation. Neurocase, 10, 271–279. Oppenheimer, S. (1993). The anatomy and physiology of cortical mechanisms of cardiac control. Stroke, 24, 13–15. Oppenheimer, S. M. (2006). Cerebrogenic cardiac arrhythmias: Cortical lateralization and clinical significance. Clinical Autonomic Research, 16, 1619–1560. Osgood, C., Suci, G., & Tannenbaum, P. (1957). The measurement of meaning. Urbana: University of Illinois.
Pecina, S., & Berridge, K. C. (2005). Hedonic hot spot in the nucleus accumbens shell: where do mu-opiods cause increased hedonic impact of sweetness? Journal of Neuroscience, 25, 11777–11787. Pegna, A. J., Khateb, A. A., Lazeyras, F., & Seghier, M. L. (2005). Discriminating emotional faces without primary visual cortices involves the right amygdala. Nature Neuroscience, 8, 24–25. Petty, R. E., Cacioppo, J. T., Strathman, A. J., & Priester, J. R. (2005). To think or not to think: Exploring two routes to persuasion. In S. Shavitt & T. C. Brock (Eds.), Persuasion: Psychological insights and perspectives (2nd ed., pp. 81–116). New York: Allyn & Bacon. Phelps, E. A. (2006). Emotion and cognition: Insights from studies of the human amygdala. Annual Review of Psychology, 57, 27–53. Phelps, E. A., & LeDoux, J. E. (2005). Contributions of the amygdala to emotion processing: From animal models to human behavior. Neuron, 48, 175–187. Phillips, R. G., & LeDoux, J. E. (1992). Differential contribution of amygdala and hippocampus to cued and contextual fear conditioning. Behavioral Neuroscience, 106, 274–285. Pizzagalli, D. A., Sherwood, R. J., Henriques, J. B., & Davidson, R. J. (2005). Frontal brain asymmetry and reward responsiveness: A sourcelocalization study. Psychological Science, 16, 805–813. Porter, R. (1987). Functional studies of motor cortex. Ciba Foundation Symposium, 132, 83–97. Posner, J., Russell, J. A., & Peterson, B. S. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 173, 715–734. Quirk, G. J., Armony, J. L., & LeDoux, J. E. (1997). Fear conditioning enhances different temporal components of tone-evoked spike trains in auditory cortex and lateral amygdala. Neuron, 19, 613–624. Quirk, G. J., Repa, C., & LeDoux, J. E. (1995). 
Fear conditioning enhances short-latency auditory responses of lateral amygdala neurons: Parallel recordings in the freely behaving rat. Neuron, 15, 1029–1039. Robinson, T. E., & Berridge, K. C. (2003). Addiction. Annual Review of Psychology, 54, 25–53. Ronca, A. E., Berntson, G. G., & Tuber, D. A. (1986). Cardiac orienting and habituation to auditory and vibrotactile stimuli in the infant decerebrate rat. Developmental Psychobiology, 18, 79–83. Root, J. C., Wong, P. S., & Kinsbourne, M. (2006). Left hemisphere specialization for response to positive emotional expressions: A divided output methodology. Emotion, 6, 473–483. Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39, 1161–1178. Russell, J. A. (1983). Pancultural aspects of human conceptual organization of emotions. Journal of Personality and Social Psychology, 45, 1281–1288. Russell, J. A. (2003). Core affect and the psychological construction of emotion. Psychological Review, 110, 145–172. Sabatinelli, D., Bradley, M. M., Fitzsimmons, J. R., & Lang, P. J. (2005). Parallel amygdala and inferotemporal activation reflect emotional intensity and fear relevance. NeuroImage, 24, 1265–1270. Sandrini, G., Serrao, M., Rossi, P., Romaniello, A., Cruccu, G., & Willer, J. C. (2005). The lower limb flexion reflex in humans. Progress in Neurobiology, 77, 353–395. Schouenborg, J., Holmberg, H., & Weng, H. R. (1992). Functional organization of the nociceptive withdrawal reflexes: II. Changes of excitability and receptive fields after spinalization in the rat. Experimental Brain Research, 90, 469–478. Sévoz-Couche, C., Comet, M. A., Hamon, M., & Laguzzi, R. (2003). Role of nucleus tractus solitarius 5-HT3 receptors in the defense
reaction-induced inhibition of the aortic baroreflex in rats. Journal of Neurophysiology, 90, 2521–2530. Sherrington, C. S. (1906). The integrative action of the nervous system. New Haven, CT: Yale University Press. Shih, C. D., Chan, S. H., & Chan, J. Y. (1995). Participation of hypothalamic paraventricular nucleus in locus ceruleus-induced baroreflex suppression in rats. American Journal of Physiology: Heart and Circulatory Physiology, 269, H46–H52. Shin, L. M., Orr, S. P., Carson, M. A., Rauch, S. L., Macklin, M. L., Lasko, N. B., et al. (2004). Regional cerebral blood flow in the amygdala and medial prefrontal cortex during traumatic imagery in male and female Vietnam veterans with PTSD. Archives of General Psychiatry, 61, 168–176. Steiner, J. E., Glaser, D., Hawilo, M. E., & Berridge, K. C. (2001). Comparative expression of hedonic impact: Affective reactions to taste by human infants and other primates. Neuroscience and Biobehavioral Reviews, 25, 53–74. Stowell, J. R., Berntson, G. G., & Sarter, M. (2000). Attenuation of the bidirectional effects of chlordiazepoxide and FG 7142 on conditioned response suppression and associated cardiovascular reactivity by loss of cortical cholinergic inputs. Psychopharmacology, 150, 141–149. Straube, T., Mentzel, H. J., & Miltner, W. H. (2007). Waiting for spiders: Brain activation during anticipatory anxiety in spider phobics. NeuroImage, 37, 1427–1436. Thurstone, L. L. (1931). The measurement of attitudes. Journal of Abnormal Psychology, 26, 249–269. Tooby, J., & Cosmides, L. (1990). The past explains the present: Emotional adaptations and the structure of ancestral environment. Ethology and Sociobiology, 11, 375–424. Tranel, D., Gullickson, G., Koch, M., & Adolphs, R. (2006). Altered experience of emotion following bilateral amygdala damage. Cognitive Neuropsychiatry, 11, 219–232. Tuber, D. S., Berntson, G. G., Bachman, D. S., & Allen, J. N. (1980, November 28). 
Associative learning in premature hydranencephalic and normal twins. Science, 210, 1035–1037. Wakana, S., Jiang, H., Nagae-Poetscher, L. M., Zijl, P. C., & Mori, S. (2004). Fiber tract based atlas of human white matter anatomy. Radiology, 230, 77–87. Walker, D. L., & Davis, M. (1997). Double dissociation between the involvement of the bed nucleus of the stria terminalis and the central nucleus of the amygdala in light-enhanced versus fear potentiated startle. Journal of Neuroscience, 17, 9375–9383. Walker, D. L., Toufexis, D. J., & Davis, M. (2003). Role of the bed nucleus of the stria terminalis versus the amygdala in fear, stress, and anxiety. European Journal of Pharmacology, 463, 199–216. Watson, D., Wiese, D., Vaidya, J., & Tellegen, A. (1999). The two general activation systems of affect: Structural findings, evolutionary considerations, and psychobiological evidence. Journal of Personality and Social Psychology, 76, 820–838. Weiskrantz, L. (1986). Blindsight: A case study and implications. Oxford, England: Oxford University Press. Wise, R. A. (2006). Role of brain dopamine in food reward and reinforcement. Philosophical Transactions of the Royal Society of Biological Sciences, 361, 1149–1158. Wundt, W. (1896). Outlines of psychology. Leipzig, Germany: Engelmann. Yates, B. J., Jakus, J., & Miller, A. D. (1993). Vestibular effects on respiratory outflow in the decerebrate cat. Brain Research, 3, 209–217. Zald, D. H., & Pardo, J. V. (1997). Emotion, olfaction, and the human amygdala: Amygdala activation during aversive olfactory stimulation. Proceedings of the National Academy of Sciences, USA, 94, 4119–4124.
Chapter 33
Pain: Mechanisms and Measurement
JOSÉE GUINDON AND ANDREA G. HOHMANN
DEFINITION OF PAIN
The word pain comes from the Latin word poena, meaning punishment or penalty. Pain is an unpleasant, complex, personal, and subjective experience that can range in intensity from slight through severe to indescribable. In the general population of the United States, the two most common forms of pain are headaches and back pain, which affect 45 and 9 million people, respectively. The management of moderate and severe chronic pain is the main concern and burden of patients and clinicians. Despite improvements in our understanding of neural circuits contributing to pain transmission and modulation, the need for safe and effective approaches for pain relief remains pressing. Defining pain is a complicated task (for review, see Brennan, Carr, & Cousins, 2007; Price, 1999). The definition has evolved through time together with advances in our understanding of the neural circuits implicated in pain transmission and modulation and ongoing improvements in the evaluation of its effects. It is now generally acknowledged that pain comprises sensory-discriminative, motivational-affective, and cognitive-evaluative dimensions (Figure 33.1). In the mid-1990s, pain was defined by the International Association for the Study of Pain (IASP) as “an unpleasant sensory and emotional experience associated with actual or potential tissue damage, or described in terms of such damage” (Merskey & Bogduk, 1994, pp. 209–214). This definition raised questions because it is possible to experience an injury without pain, and pain can also be experienced in the absence of any apparent injury. For example, people born with congenital analgesia exhibit profound insensitivity to pain even in the presence of serious injury (e.g., fractures, burns, appendicitis; Comings & Amromin, 1974; Manfredi et al., 1981; Waxman, 2007). Until recently, the cause underlying congenital analgesia remained elusive.
Figure 33.1 Pain is a multidimensional phenomenon. [Diagram labels: multidimensional experience with sensory-discriminative, affective-motivational, and cognitive-evaluative dimensions; unpleasantness (affective-motivational aspect) and algosity (pain intensity; psychophysical properties); neurophysiological, psychophysical, psychological, neuropsychological, cultural, social, and environmental aspects.]
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
A mutation in the SCN9A gene, which is located on chromosome 2q24.3 and results in nonfunctional Nav1.7 channels, has been implicated in congenital insensitivity to pain (Cox et al., 2006; Waxman, 2007). Furthermore, the causes underlying the two most common forms of pain in the population, headaches (Moskowitz, 1992; Villalon, Centurion, Valdivia, de Vries, & Saxena, 2003) and back pain (Loeser, 2001), remain poorly understood. In many cases, pain is felt without any sign of injury, causing controversy and speculation about the pathophysiology of these conditions. It is also possible for motorcycle accident victims to continue to experience pain after an avulsion of the brachial plexus has healed (Wynn Parry, 1980). Therefore, another definition of pain was proposed by Price (1999). According to this definition, pain is a somatic perception containing: (a) a bodily sensation with qualities like those reported during tissue-damaging stimulation, (b) an experienced threat associated with this sensation, and (c) a feeling of unpleasantness or other negative emotion based on this experienced threat. This definition does not require the demonstration of tissue damage or of an association between sensation and tissue lesion. However, an unpleasant somatic sensation (e.g., itch) is not necessarily associated with pain. Therefore, pain may more appropriately be defined by the association of two somatosensory qualities: unpleasantness (affective-motivational aspect) and algosity (a unique quality of pain that allows it to be unequivocally identified). The psychophysics and neural mechanisms may differ for each of these dimensions. These dimensions are distinct in their intensity-based sensory discriminations. Moreover, the magnitude of unpleasantness can be dissociated from the pain intensity (algosity magnitude; Fields, 1999).
Pain remains a complex multidimensional experience related to sensory-discriminative, cognitive-evaluative, and motivational-affective dimensions which, by its complexity, cannot be solely explained by social, cultural, environmental, neurophysiological, psychophysical, psychological, or neuropsychological aspects (Melzack & Katz, 1999; Melzack & Wall, 1991; Price, 1999). Interactions between these aspects are also likely to occur (Figure 33.1).
THE CONCEPTUALIZATION OF PAIN HAS EVOLVED
The concept of pain has evolved over time (for a review, see Melzack & Wall, 1991). The ancient Greeks grouped pain with the emotions or appetites and not with sensation. They considered pain to be the opposite of pleasure (Dallenbach, 1939; Livingstone, 1998; Marshall, 1894). The view of pain as pure emotion went into decline in the seventeenth century with the advent of specificity theory. This theory postulates that a specific pain system carries a message from pain receptors in the skin to a pain center in the brain. The best classical description of this theory comes from Descartes (1664), who conceived of the pain system as a straight-through channel from the skin to the brain (Figure 33.2). Different qualities of sensation were not recognized in this early conceptualization. Müller’s (1842) doctrine of specific nerve energies postulated that the qualities of experience were associated with the properties of sensory nerves. Müller proposed that the brain receives information about external objects only by way of the sensory nerves, and five classical senses were recognized: seeing, hearing, taste, smell, and touch. The theory of cutaneous senses was established by Max von Frey between 1894 and 1895. Von Frey’s designation of the free nerve endings as pain receptors is the basis of the specificity theory (Boring, 1942; Figure 33.2). The puzzle of pain was thus “solved” by postulating specific pain receptors in body tissue whose signals travel via pain fibers and a pain pathway to a pain center in the brain (Boring, 1942). It was later proposed that the amount and quality of perceived pain were modulated by many psychological variables in addition to the sensory input. For example, dogs that received electric shocks, burns, or cuts followed by the presentation of food eventually responded to these noxious stimuli as signals for food and, most notably, failed to show any signs of pain (Pavlov, 1927). Goldscheider (1894) was the first to propose that stimulus intensity and central summation are the critical determinants of pain. This conceptualization represented the origin of pattern theory. Goldscheider concluded that mechanisms of central summation, which were postulated to be localized in the dorsal horn of the spinal cord, were essential for any understanding of pain mechanisms.
Livingstone (1943) proposed that pathological stimulation of sensory nerves (e.g., peripheral nerve damage) initiates activity in reverberatory circuits in the gray matter of the spinal cord. This abnormal activity could be triggered by normally nonnoxious inputs and generate volleys of nerve impulses that are interpreted centrally as pain (Figure 33.2). The simplest form of pattern theory deals with peripheral instead of central patterning. According to this theory, the pattern of pain is produced by an intense stimulation of nonspecific receptors (Sinclair, 1955; Weddell, 1955). Note that in this conceptualization, all fiber endings (except those that innervate hair cells) were believed to be alike. The sensory interaction theory developed by Noordenbos (1959) suggests that the small-diameter fibers exist to carry nerve impulse patterns that produce pain, whereas the large fibers inhibit this transmission. Therefore, a shift in
the ratio toward small fibers would result in an increase in neural transmission, summation, and excessive pathological pain.
Finally, the gate-control theory postulated that the perception of pain is determined by interactions between different types of fibers, both small-diameter pain-transmitting and large-diameter non-pain-transmitting fibers. This theory asserted that activation of the large-diameter (fast-conducting) non-pain-transmitting fibers could indirectly inhibit signals from small-diameter (slow-conducting) pain-transmitting fibers and block the transmission and perception of pain (Melzack & Wall, 1965; Figure 33.2). Under pathological conditions, the fast system loses its dominance over the slow one, resulting in slow pain (Bishop, 1946), diffuse burning pain (Bishop, 1959), or hyperalgesia (Noordenbos, 1959). This model was updated by Melzack and Casey (1968) as a conceptual model of the sensory, motivational, and central control determinants of pain. In this conceptualization, the output of the transmission (T) cells of the gate-control system projects to the sensory-discriminative system (via the neospinothalamic fibers) and the motivational-affective system (via the paramedial ascending system). The central control trigger projects back to the gate-control system, the sensory-discriminative system, and the motivational-affective system. These three systems interact and project to the motor system to influence motor responses to pain (Figure 33.2).
Body-self matrix theory developed from attempts to explain the neurophysiology of phantom limb pain (Melzack, 1990a, 1999). According to this theory, a genetically built-in matrix of neurons for the whole body comprises a widely distributed neural network that incorporates somatosensory, limbic, and thalamocortical components. These components contain smaller parallel networks that contribute to the sensory-discriminative, affective-motivational, and evaluative-cognitive dimensions of pain experience; together they form the neuromatrix. According to this theory, the cyclical processing and synthesis of nerve impulses in the neuromatrix imposes a characteristic output pattern, or neurosignature, that is perceived as pain (Melzack, 1990a; Figure 33.2). The conceptualization of pain is likely to continue to evolve over time together with advances in our understanding of pain transmission and modulation.
Figure 33.2 Historical time line showing the evolution of pain theories. [Time line panels: Ancient Greeks (b) (pain grouped with the emotions and appetites, as the opposite of pleasure); Descartes, 1664 (specificity theory); Müller, 1842 (doctrine of specific nerve energies); Goldscheider, 1894 (summation theory, pattern theory); Max von Frey (a), 1894–1895 (theory of cutaneous senses); Pavlov, 1927 (noxious conditioned stimuli paired with food); Livingston, 1943 (model of reverberatory circuits); Sinclair and Weddell, 1955 (peripheral pattern theory); Noordenbos, 1959 (sensory interaction theory); Melzack & Wall, 1965 (gate-control system); Melzack & Casey, 1968 (central control processes, motivational-affective system, sensory-discriminative system, motor mechanisms); Melzack, 1990.]
Note: A = affective-motivational; CR = conditioned response; CS = conditioned stimuli; E = evaluative-cognitive; L = large diameter fiber; S = small diameter fiber; SD = sensory-discriminative (for Melzack, 1990); SG = substantia gelatinosa; T = transmission; US = unconditioned stimuli; UR = unconditioned response.
Data source: (a) Boring (1942), Melzack and Wall (1991); (b) Marshall (1894), Dallenbach (1939), Livingstone (1998).
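The central idea of the gate-control theory, that large-fiber activity can reduce the onward transmission of small-fiber signals, can be illustrated with a toy calculation. The sketch below is not the Melzack and Wall (1965) circuit: the subtractive rule, the `inhibition_gain` parameter, and the arbitrary activity units are all simplifying assumptions chosen only to make the qualitative relationship concrete.

```python
def transmission_output(small_fiber, large_fiber, inhibition_gain=1.0):
    """Toy transmission (T) cell output in arbitrary units.

    Hypothetical rule (an assumption, not taken from the chapter):
    small-fiber input drives the T cell, large-fiber input closes the
    gate by subtracting inhibition, and output cannot fall below zero.
    """
    if small_fiber < 0 or large_fiber < 0:
        raise ValueError("fiber activity must be non-negative")
    return max(0.0, small_fiber - inhibition_gain * large_fiber)


if __name__ == "__main__":
    print(transmission_output(5.0, 0.0))  # no large-fiber input: 5.0
    print(transmission_output(5.0, 3.0))  # gate partly closed: 2.0
    print(transmission_output(5.0, 8.0))  # gate closed: 0.0
```

In this sketch, shifting the ratio of activity toward the small fibers raises the output, which mirrors the qualitative prediction of the theory. It makes no quantitative claim about real dorsal horn neurons.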
TRANSIENT PAIN
Transient pain is defined by the brief duration of the experienced pain sensation. It is the feeling commonly experienced with minor injuries (e.g., a stubbed toe, a mild burn, the itching of sunburn). Transient pain has no long-term consequence because it is associated with almost no tissue damage. Transient pain is not typically accompanied by anxiety (Melzack & Wall, 1991). In this situation, a first pain is felt which is well localized and relatively mild (Table 33.1). This is followed shortly by a second pain that distracts attention from the person’s previous activity and decreases in intensity until it fades away (Marchand, 1998; Melzack & Wall, 1991; Table 33.1).
ACUTE PAIN
Acute pain (e.g., that felt after twisting an ankle or cutting a finger) is a more intense sensation than transient pain. This experience is marked by intense consciousness of the event and a penetrating sensation that is accompanied by alertness and orientation toward the affected region. This painful experience contains a sequence of perceptions, evaluations, and emotions (Price, 1999). Perception and evaluation are mostly related to examination of the injury itself but can be influenced by previous experiences (e.g., a prior history of similar experiences) or the personality of the injured person (Melzack & Wall, 1991; Price, 1999). The emotional component of acute pain may be related to fear or anxiety emanating from the body following the injury and includes autonomic activation. Furthermore, after some delay, thoughts and concerns regarding pain become more elaborated, reflective, and directed toward the long-term consequences of this injury (e.g., activities and responsibilities that will be left unattended) and anxiety about the healing process. The immediate and late stages of this acute pain experience involve two main aspects: the desire to avoid harm due to the injury and the expectation of succeeding in preventing harm (Price, 1999). Expectations can also be focused on the healing process, and anxiety or fear can be experienced if these expectations are not met. Thus, acute pain can reiterate some of the complexities of the multidimensional experience that represents persistent pain.
PAIN AS A RECUPERATIVE HEALING MECHANISM
Transient and acute pain are adaptive in that they serve a protective function and enable the affected individual to learn to avoid serious injuries in the future. A sequence of changes occurs after an injury on several levels: physiological, biological, neurological, and behavioral. A relationship exists between the behavioral and biological events to ensure subsequent recuperation and healing after injury. Individuals born with congenital insensitivity to pain often sustain serious injuries, providing a rather extreme illustration of the useful functions of pain to warn, protect, and heal injured tissue (Fields, 1987; Melzack & Wall, 1983). Wall (1979) proposed that three phases of pain behavior (immediate, acute, and chronic) follow an injury in both animals and humans. The immediate phase of pain behavior is the first period, corresponding to the activation of nociceptive afferent neurons, and is related to autonomic responses (e.g., fight-or-flight responses) combined with emotions (e.g., fear or anger). It is possible that pain is not felt at this moment if the subject is caught in a stressful situation where it is necessary to escape or find safety. The acute phase of pain behavior corresponds to the behavior associated with the recovery process. At this point, the subject will feel pain and will have to cope with it (i.e., find treatment and prepare for recovery). This phase can be accompanied by anxiety and distress about the injury. Finally, the chronic phase of pain consists of quiet inactivity and behavior related to rest, recuperation, and healing. Long-term changes in the nervous system induced by failed or delayed healing of the injury may contribute to the development of chronic pain syndromes in humans, in which pain is no longer a symptom of an injury but rather becomes a serious medical syndrome (Price, 1999; Wall, 1979).
LONG-TERM OR CHRONIC PAIN
The fact that long-term or chronic pain is accepted and recognized as a distinct medical entity represents a major breakthrough in the field of pain (Bonica, 1953, 1974). A few decades ago, people who felt pain long after healing had occurred were frequently sent to psychiatric hospitals. Misdiagnosis and mistreatment resulted from an utter lack of knowledge regarding the pathophysiology of chronic pain. Chronic pain is defined as pain that persists after all possible healing has occurred or long after pain can serve any useful function. Thus, chronic pain is no longer a symptom of injury or disease but is a medical problem in its own right that requires urgent attention to alleviate unnecessary suffering (Melzack & Wall, 1991). Feelings of fear, anxiety, and anger characterizing the earliest phase of the pain experience can later be transformed into despair, frustration, hopelessness, and depression (Cohen, Patel, Khetpal, Peterson, & Kimmel, 2007; Price, 1999; Scholl & Allen, 2007). Such changes are understandable and may be related to reflections about the interference of pain in everyday life, the difficulty of enduring such pain, and the ultimate negative consequences of enduring this persistent pain (Price, 1988, 1999). Moreover, treatments that are usually beneficial for most acute pains are not necessarily effective for chronic pain (Guindon, Walczak, & Beaulieu, 2007). Chronic pain is perceived as intractable and becomes intolerable because existing pharmacotherapies show only limited efficacy for pain management and adverse side effects constrain therapeutic dosing. Many factors interact and contribute negatively to chronic pain, including psychological factors that can lead to depression in both adults (Chenot et al., 2008; Cohen et al., 2007) and children (Scholl & Allen, 2007). Future research aimed at further elucidating the mechanisms underlying transmission and modulation of pain, especially chronic pain, is required to eliminate unnecessary suffering in affected patients.

TABLE 33.1 Characterization of different types of afferent fibers and their relationship to first and second pain
Fibre Type | Myelinated | Diameter | Conduction Velocity | Receptor Type | Respond To
Aβ | Yes | 5–15 μm | 30–100 m/s | Specialized and free | Light pressure or touch
Aδ | Yes | 1–5 μm | 6–30 m/s | Free | Light pressure, heavy pressure, heat (45 °C+)
C | No | 0.25–1.5 μm | 1.0–2.5 m/s | Free | Light pressure, heavy pressure, heat (45 °C+), chemicals, warmth
First pain: Aδ fiber. Second pain: C fiber.
PATHWAYS CONTRIBUTING TO THE TRANSMISSION AND MODULATION OF PAIN
The trajectory of nociceptive information traveling from the periphery through sensory nerve fibers to the central nervous system after an injury is characterized by a series of chemical and electrical reactions. These reactions may be divided into four steps (Fields, 1987). The first step is transduction, which corresponds to the transformation of a chemical, thermal, or mechanical stimulus into electrical energy (action potentials) in sensory nerve endings. For example, when an individual burns a finger, intense heat energy from the burn will be converted into electrical nerve impulses at the free nerve endings in the skin (Melzack & Wall, 1991; Price, 1999). In this case, any skin damage will activate these nerve endings and initiate the transmission (second step) of trains of neural impulses along sensory nerve fibers (known as primary afferent neurons) running from the periphery (skin) to the spinal cord. Then, the signal moves along secondary projection neurons via ascending pathways that originate in the spinal cord and innervate the brain stem and thalamus. Thalamic neurons subsequently convey this information to diverse cortical regions (Fields, 1987). The third step is the modulation of neurons responsible for the transmission of nociceptive information from the periphery to the central nervous system by descending projections from the brain. These descending processes may either inhibit or facilitate pain (Bie & Pan, 2007; Garcia-Larrea & Magnin, 2008; Mason, 2005). The fourth step refers to the perception of pain. This step is the end point of the nociceptive experience and can be influenced by emotional state and previous experience (Fields, 1987).
Transduction
Nociceptive information arises from the application of a stimulus capable of harming the integrity of the organism (e.g., burning a finger with fire). Note that the same stimulus will be ignored by some receptors and detected by others. The latter receptors, better known as nociceptors (responding to noxious stimulation), are bushy networks of fibers that penetrate many layers of the skin, muscles, articulations, and visceral structures (Melzack & Wall, 1991; Price, 1999). These nociceptive receptors are sensitive to stimuli of various kinds (chemical, thermal, or mechanical) and transduce this stimulus energy into action potentials. Transmission of this information is linked to the intensity of nociceptive stimulation and is encoded by the frequency and pattern of firing of nociceptive afferent neurons (Fields, 1987; Julius & Basbaum, 2001; Price, 1999). These primary afferents consequently mediate both transduction and transmission of pain (Fields, 1987; Millan, 1999). These nociceptive receptors are also described as nerve endings linked to sensory nerve fibers. The primary afferent sensory fibers include Aβ (large myelinated), Aδ (small myelinated), and C fibers (thin unmyelinated), which are classified based on their fiber diameters and conduction velocities (Table 33.1). These fibers are tuned precisely to begin firing nerve impulses when a particular event occurs (depending on the fiber recruited) in the region of their terminals. For example, Aδ fibers start firing when skin is cooled by a fraction of a degree, and Aδ/C fibers largely respond to an increase in temperature. A single painful stimulus evokes two successive and qualitatively different pain sensations, termed first and second pain. First pain is mediated by Aδ fibers and is characterized by a brief, well-localized sensation, whereas second pain is mediated by C fibers and is associated with a later, longer lasting, burning, and more diffusely localized sensation (Melzack & Wall, 1991; Price, Hu, Dubner, & Gracely, 1977; Table 33.1).
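As a rough worked example of why first and second pain are separated in time, the sketch below converts the conduction velocities listed in Table 33.1 into arrival times. The 1 m path length and the function name are illustrative assumptions rather than values from the chapter, and synaptic and central processing delays are ignored.

```python
# Conduction velocity ranges (m/s) as listed in Table 33.1.
CONDUCTION_VELOCITY_M_PER_S = {
    "A-delta": (6.0, 30.0),  # small myelinated fibers, first pain
    "C": (1.0, 2.5),         # thin unmyelinated fibers, second pain
}


def arrival_time_range(distance_m, fiber):
    """Return (fastest, slowest) conduction time in seconds over distance_m."""
    slow_v, fast_v = CONDUCTION_VELOCITY_M_PER_S[fiber]
    return distance_m / fast_v, distance_m / slow_v


if __name__ == "__main__":
    distance = 1.0  # hypothetical path length in metres (an assumption)
    for fiber in CONDUCTION_VELOCITY_M_PER_S:
        fastest, slowest = arrival_time_range(distance, fiber)
        print(f"{fiber}: {fastest * 1000:.0f} to {slowest * 1000:.0f} ms")
```

Under these assumptions, Aδ signals would arrive after roughly 33 to 167 ms and C-fiber signals after roughly 400 to 1000 ms, so even the fastest C fiber lags behind the slowest Aδ fiber over this distance.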
Transmission
The transmission of nociceptive information from the periphery to the central nervous system constitutes a first line of defense to minimize damage to the organism. This process is more complex than described by early pain theories because of the multiplicity of different receptors, overlap in receptive fields of afferent fibers, and involvement of multiple ascending nociceptive pathways (for a review, see Julius & Basbaum, 2001; Melzack & Wall, 1991; Millan, 1999; Price, 1999). Skin damage initiates trains of nerve impulses along primary afferent fibers running from the periphery to the spinal cord. The primary nociceptive afferent neurons enter the spinal cord via the dorsal roots to converge on and synaptically excite neurons in the grey matter of the dorsal horn of the spinal cord. The cells in the grey matter of the spinal cord are arranged in laminae (or layers) in a dorsal-ventral direction running the entire length of the spinal cord (Rexed, 1952). A total of 10 laminae have been described, of which 6 are found in the dorsal horn (Figure 33.3). Lamina II may be further subdivided into inner and outer lamina II, which receive different primary afferent inputs (Julius & Basbaum, 2001; Marker, Lujan, Colon, & Wickman, 2006). In general, nociceptive afferent C fibers and some Aδ fibers terminate in laminae I and II, whereas other Aδ fibers penetrate deeper into the dorsal horn (lamina V).
The second-order neurons in the dorsal horn of the spinal cord cross over (decussate) to the contralateral side and ascend to innervate the thalamus via the spinothalamic tract (neospinothalamic and paleospinothalamic), the major ascending central pathway for pain (Melzack & Wall, 1991; Price, 1999). The neospinothalamic pathway originates in part from lamina I of the spinal cord, which receives input from Aδ fibers responsible for rapid and well-localized pain (for review, see Besson & Chaouch, 1987; Julius & Basbaum, 2001; Kandel, 1985; Melzack, 1990b). At least four classes of spinal and medullary dorsal horn neurons have been identified. These include low-threshold mechanosensitive (LTM), thermoreceptive (warm and cold), wide dynamic range (WDR), and nociceptive specific (NS) cells. WDR neurons respond with increasing action potential frequency to stimulation ranging from nonnoxious to noxious. These neurons, which receive input from both large-diameter (Aβ) and small-diameter (Aδ and C) fibers, code information about stimulus intensity. Nociceptive specific cells, which receive input from small-diameter fibers (Aδ mechanosensitive, Aδ heat, and C polymodal), respond exclusively to noxious stimuli. By contrast, low-threshold mechanosensitive cells, which receive input from large-diameter Aβ fibers, respond to light touch, pressure, and hair movement. Few of these fibers project into the spinothalamic tract. Thermoreceptive cells respond exclusively to either warming or cooling of the receptive field. Several lines of evidence specifically support a role for WDR and NS cells in pain discrimination (see Price & Dubner, 1977, for a review): (a) selective stimulation of these fibers produces sensations of pain, (b) these neurons exhibit maximal responses to noxious (as opposed to nonnoxious) levels of stimulation, (c) manipulations that reduce neural responses in these cells produce concomitant reductions in pain sensation, and (d) these neurons exhibit anatomical connections consistent with a role in pain transmission.
Nociceptive dorsal horn neurons project to the ventroposterolateral (VPL) nucleus of the thalamus. VPL neurons subsequently relay the transmitted information to the sensory cortex. The rapid conduction of Aδ fibers and the small receptive fields of the neospinothalamic pathway are essential qualities that permit the physical or sensory-discriminative aspects of pain (localization and perception; Figure 33.3). The paleospinothalamic tract projects to the medial portion of the thalamus. This tract is mainly innervated by dorsal horn neurons that receive afferent input from C fibers that transmit slow and diffuse pain. Synapses converge principally on the nucleus of the reticular formation of the brain stem and the median nucleus of the thalamus, whose neurons subsequently project to the frontal cortex and limbic system. These latter regions are also implicated in emotion and memory (Kandel, 1985; Melzack, 1990b). The slow conduction of C fibers, the diffuse aspect of the receptive fields, and the higher cerebral structures implicated in the paleospinothalamic pathway support a role for this pathway in motivational-affective (e.g., unpleasantness) aspects of pain perception (Figure 33.3).
Figure 33.3 Circuitry mediating transmission (from the periphery to the brain) and modulation (from the brain to the periphery) of pain. [Diagram with two panels, Transmission and Modulation; labeled structures: sensory cortex, frontal lobe, limbic system, thalamus, PAG, NRM, spinothalamic tract, and dorsal horn laminae 1–6.]
Modulation Nociceptive information traveling from the periphery to the central nervous system can be modulated by inhibition or facilitation from pathways that descend from the brain to the spinal cord. Descending inhibition occurs at any moment of the transmission of nerve impulse. Three main mechanisms describe this modulation: the gatecontrol theory (Melzack & Wall, 1965), the descending inhibitory control system (Basbaum & Fields, 1978; Bie & Pan, 2007; Garcia-Larrea & Magnin, 2008; Millan, 2002; Reynolds, 1969), and the inhibitory control produced by higher centers of the central nervous system (Bie & Pan, 2007; Craig & Bushnell, 1994; Garcia-Larrea & Magnin,
c33.indd 642
Verbal Rating Scale No pain 0
Score
0
1
2
Mild
Moderate
1
Severe 3
2
Numerical Rating Scale 3 4 5 6 7
8
9
No pain
0
10 Worst possible pain
10
20
30
40
50
60
70
80
90
100
Visual Analogue Scale Worst possible pain
No pain
McGill Pain Questionnaire 102 words divided in three categories: sensory, affective, evaluative Pain Rating Index Number of Words Chosen Actual Pain Intensity: Sum of ranking of each words chosen in all 20 subclasses
Sum the words chosen in all 20 subclasses
• • • • • •
No pain (0) Mild (1) Discomforting (2) Distressing (3) Horrible (4) Excruciating (5)
Figure 33.4 Four valid tools (or instruments) to measure pain in humans: (1) Verbal Rating Scale, (2) Numerical Rating Scale, (3) Visual Analogue Scale, and (4) McGill Pain Questionnaire.
2008; Hagbarth & Kerr, 1954; Mason, 2005; Melzack, Stotler, & Livingston, 1958; Figure 33.3). The gate-control theory (see Figure 33.2) proposes the existence of a rapidly conducting fiber system (referring to A fibers), which inhibits the synaptic transmission at laminae I and II of the dorsal horn of the spinal cord of a more slowly conducting system (A and C fibers) that carries the signal for pain (Melzack & Wall, 1965). The existence of a descending inhibitory control system was first described by Reynolds (1969), who hypothesized that electrical stimulation of a small area of the grey matter surrounding the cerebral aqueduct—the periaqueductal grey area (PAG)—could enhance descending inhibition and produce analgesia. His hypothesis was borne out as electrical stimulation of the PAG induced sufficient analgesia to perform an invasive surgery (laparotomy) on otherwise awake rats. Later, it was discovered that brain stem inhibitory fibers descend through a distinct pathway in the dorsolateral spinal cord called the dorsolateral funiculus (Basbaum, Marley, O’Keefe, & Clanton, 1977). The PAG was a key component of this descending system (Fields, Basbaum, & Heinricher, 2006). The PAG receives input from many different brain regions (e.g., hypothalamus, cortex, thalamus) and is implicated in the mechanism whereby cortical and other inputs act to control the nociceptive gate in the dorsal horn. PAG neurons activate neurons in the nucleus raphe magnus (NRM), an
8/17/09 3:05:06 PM
Pathways Contributing to the Transmission and Modulation of Pain 643
area of the rostral medulla close to the midline, which in turn project via the dorsolateral funiculus of the spinal cord to make synaptic connections on dorsal horn interneurons (Basbaum & Fields, 1978; Fields, Basbaum, & Heinricher, 2006). Inhibitory control produced by higher centers of the central nervous system was first demonstrated by the fact that responses evoked in the ventrolateral spinal cord could be virtually abolished by the stimulation of different brain structures including the reticular formation, the cerebellum, and the cerebral cortex (Hagbarth & Kerr, 1954). The clear implications of this demonstration were that these neural structures exert an inhibitory control over the transmission of pain in the dorsal horn. This hypothesis was confirmed by the demonstration that lesions of a small area of the reticular formation (the central tegmental tract adjacent to the lateral PAG) produced hyperalgesia in cats. This area was postulated to exert a tonic inhibitory control over the pain signals because ablation of this structure produced hyperresponsiveness to pain. Thus, removal of inhibition allows pain signals to travel unchecked to the brain and can even permit summation of nonnoxious signals to produce spontaneous pain (Melzack et al., 1958). Chronic pain is also known to be actively facilitated by descending projections from the nucleus raphe magnus (Bie & Pan, 2007; Garcia-Larrea & Magnin, 2008). Mechanisms underlying descending inhibition (see reviews by Bie & Pan, 2007; Fields, Basbaum, & Heinricher, 2006; Garcia-Larrea & Magnin, 2008; Mason, 2005; Melzack & Wall, 1991; Price, 1999) and descending facilitation (Garcia-Larrea & Magnin, 2008; Mason, 2005) of pain
are reviewed elsewhere. Pathophysiologic modifications that can contribute to pain and the attenuation of spinal inhibition include selective neuronal loss and subsequent development of inflammatory phenomena (e.g., cytokine secretion by macrophages and glial cells). These observed changes in the dorsal horn can modify the activity of projection neurons to the brain stem, thereby increasing spinal hyperactivity (DeLeo & Yezierski, 2001; Garcia-Larrea & Magnin, 2008; Pruimboom & vanDam, 2007).

Perception
The personal interpretation of a nociceptive stimulus that is associated with an emotional state of mind (or situation) or past experiences is described as the perception of pain. Note that a similar stimulus may provoke different sensations in different individuals, suggesting that pain is highly modifiable. Many psychological, neurophysiological, and psychophysical factors can modulate and influence the perception of pain, thereby altering the effectiveness of the treatment (Greenwald, 1991; Guindon, Walczak, & Beaulieu, 2007; see Table 33.2). Imaging studies using positron emission tomography (PET) and, more recently, functional magnetic resonance imaging (fMRI) hold significant promise for identifying networks of interconnected cerebral structures implicated in the affective component of pain (Bingel, Schoell, & Büchel, 2007; Craggs, Price, Verne, Perlstein, & Robinson, 2007; Moisset & Bouhassira, 2007; Rainville, Duncan, Price, Carrier, & Bushnell, 1997).

TABLE 33.2 Drugs commonly used in the treatment of pain (classical analgesics and new adjuvants)

Opioids
• Morphine
• Hydromorphone
• Fentanyl
• Remifentanil
• Alfentanil
• Sufentanil
• Meperidine
• Buprenorphine
• Butorphanol
• Nalbuphine
• Oxycodone
• Tramadol
• Methadone

NSAIDs
• Diclofenac
• Ketorolac
• Ketoprofen
• Ibuprofen
• Naproxen

COXIBS
• Celecoxib
• Etoricoxib
• Lumiracoxib
• Parecoxib

Antidepressants
• Venlafaxine
• Imipramine
• Duloxetine
• Bupropion

Anticonvulsants
• Gabapentin
• Pregabalin
• Lamotrigine

Cannabinoids
• Cannabis
• Δ9-THC/CBD
• Nabilone
• Dronabinol
• Ajulemic acid

Local anesthetics
• Bupivacaine
• Ropivacaine
• Levobupivacaine
• Lidocaine 5%

Others
• Acetaminophen (paracetamol)
• Ketamine
• Nefopam
• Clonidine
• Neostigmine
• Magnesium
Pain: Mechanisms and Measurement
PAIN MEASUREMENT IN HUMANS
The development of valid instruments for measuring pain in humans is critical for quantifying the intensity, quality, and duration of pain. Measurement of pain in humans is important for diagnosis, choice of therapy, and evaluation of the relative effectiveness of different therapies (Guarino & Myers, 2007; Melzack & Katz, 1999). The varieties of pain experience can be assessed by describing the qualities of pain within three dimensions: sensory-discriminative, motivational-affective, and cognitive-evaluative (Melzack, 2005). However, pain experience may be difficult to describe because the relevant words are not often used, and it can seem impossible to adequately capture such abstract sensations as the shooting pain felt by neuropathic patients (Melzack & Katz, 1999; Melzack & Wall, 1991). In the past, pain was measured with respect to a single quality: the variation in intensity (Beecher, 1959). Methods used included verbal rating, numerical rating, and visual analogue scales (Figure 33.4). Verbal rating scales consist of a series of verbal pain descriptors ordered from the least to the most intense sensation (no pain, mild, moderate, severe; Daoust, Beaulieu, Manzini, Chauny, & Lavigne, 2008; Jensen & Karoly, 1992). The patient reads the list and chooses the one word that most closely describes his or her pain at that moment; each word is scored by rank (a score of zero for the lowest rank descriptor, a score of one for the next lowest rank descriptor, and so on). Numerical rating scales consist of a series of numbers from 0 to 10 or 0 to 100, with endpoints intended to represent the extremes of the pain experience, such as no pain and worst possible pain, respectively (Jensen & Karoly, 1992; Molton, Jensen, Nielson, Cardenas, & Ehde, 2008). In this case, the patient chooses the number that best corresponds to the intensity of his or her current pain. These two methods are simple to administer, reliable, and valid.
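As a minimal sketch of how the verbal and numerical rating scales described above turn a patient's response into a number, the following Python fragment scores a verbal descriptor by its rank and rescales a numerical rating to a common 0-to-1 range. The function names and the rescaling step are assumptions made for illustration; they are not part of any published scoring procedure.

```python
# Illustrative sketch only: the function names are hypothetical, and the
# descriptor list is the four-word example quoted in the text.
VERBAL_DESCRIPTORS = ("no pain", "mild", "moderate", "severe")


def score_verbal_rating(descriptor: str) -> int:
    """Return the rank score of the chosen word (0 = lowest-ranked descriptor)."""
    word = descriptor.strip().lower()
    if word not in VERBAL_DESCRIPTORS:
        raise ValueError(f"unknown descriptor: {descriptor!r}")
    return VERBAL_DESCRIPTORS.index(word)


def normalize_numerical_rating(rating: float, maximum: int = 10) -> float:
    """Rescale a 0-to-10 or 0-to-100 rating to the range 0.0 to 1.0.

    The rescaling is an assumption added here so that ratings from the two
    scale lengths can be compared; the text only describes the raw ratings.
    """
    if maximum not in (10, 100):
        raise ValueError("the scale is assumed to run from 0 to 10 or from 0 to 100")
    if not 0 <= rating <= maximum:
        raise ValueError(f"rating must lie between 0 and {maximum}")
    return rating / maximum
```

For example, `score_verbal_rating("moderate")` yields the rank score 2, and a rating of 50 on a 0-to-100 scale rescales to 0.5.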
Visual analogue scales remain the instrument of choice when pain is assessed along a single dimension. The visual analogue scale consists of a 10-cm horizontal (Daoust et al., 2008; Huskisson, 1983; Joyce, Zutshi, Hrubes, & Mason, 1975) or vertical (Sriwatanakul et al., 1983) line with the two endpoints marked as no pain and worst pain ever (or other similar descriptors). The patient is asked to place a mark on the line at the point that best describes the level of the pain intensity experienced. The distance from the low end to the mark is used as a numerical index. This method is particularly valuable because it can be used to assess unpleasantness of the pain experience (Nielsen, Price, Vassend, Stubhaug, & Harris, 2005; Price, Harkins, & Baker, 1987; Price, Harkins, Rafii, & Price, 1986) separately from intensity, is sensitive to pharmacological and nonpharmacological procedures that alter pain experience (Bélanger, Melzack, & Lauzon, 1989; Choinière, Melzack, Girard, Rondeau, & Paquin, 1990; Price, Harkins, & Baker, 1987), and correlates well with verbal and numerical rating scales measuring pain (Daoust et al., 2008; Ekblom & Hansson, 1988; Kremer & Atkinson, 1983). This method was improved by Choinière and Amsel (1996) with their development of a visual analogue thermometer. This instrument consists of a rigid, 10-cm horizontal plastic scale with a band that slides from the left (no pain) to the right (worst pain ever) by means of a tab on the back of the thermometer; the patient places the cursor at the point corresponding to his or her pain intensity. However, in some elderly patients who lack manual dexterity, this method is not optimal and numerical or verbal rating scales are more reliable (Gagliese & Melzack, 1997). Although the visual analogue scale has been used to measure pain affect (or unpleasantness), any single rating captures only one dimension of pain experience, leaving the other two dimensions to be assessed separately.

The complexity of pain necessitates describing it in terms of three dimensions to adequately capture the pain experience. The McGill Pain Questionnaire (MPQ; Melzack, 1975) was developed to better specify the qualities of pain. The questionnaire was developed by selecting 102 words from the literature and categorizing them into three major classes related to the three dimensions of pain: (1) sensory descriptors that describe the sensory qualities of the pain experience in terms of temporal, spatial, pressure, thermal, or other such qualities; (2) affective descriptors that assess the affective qualities in terms of tension, fear, and autonomic properties that are part of the pain experience; and (3) evaluative descriptors that describe the subjective overall intensity of the total pain experience (Melzack, 2005; Melzack & Torgerson, 1971).
The words were divided into three major classes and further separated into 16 subclasses. Physicians, patients, and university graduates subsequently rated the intensity of each word on a numerical scale ranging from least to worst pain. Patients considered some key words to be missing, and a fourth supplementary class called miscellaneous (adding 4 subclasses) was added to the lists of pain descriptors (Melzack & Torgerson, 1971). The descriptor lists of the MPQ are read to the patient, who is instructed to choose only the words that describe the feelings and sensations at that precise moment (Melzack, 1975). The questionnaire gives clinicians three major indices: (1) the pain rating index, which corresponds to the sum of the rank values of the words chosen in all 20 subclasses (the word in each subclass implying the least pain is given a score of 1, the next word a score of 2, and so on); (2) the number of words that are chosen; and (3) the present pain intensity at the time the questionnaire is administered, from no pain (score 0) to excruciating pain (score 5; Melzack, 1975, 2005). The MPQ has been demonstrated to be a reliable, consistent, valid, and useful tool for clinicians to assess pain qualities in each patient (Chapman et al., 1985; Love, Leboeuf, & Crisp, 1989; Melzack, 1983; Wilkie, Savedra, Holzemier, Tesler, & Paul, 1990). The MPQ is sensitive enough to measure decreases in pain following analgesic treatments in patients with wound pain (Briggs, 1996) or breast cancer pain (Eija, Tiina, & Pertti, 1996), and it also measures decreases comparable to those detected using visual analogue or verbal rating scales following oral analgesic treatments in postoperative pain (Jenkinson et al., 1995). Since its introduction in 1975, the MPQ has been used in several hundred studies of pain and translated into several languages (Melzack & Katz, 1999). A short version of the MPQ was subsequently developed to better suit research projects or emergency situations where the time available to obtain information is limited and critical (Melzack, 1987). This short version consists of 15 words taken from the longer version (11 words from the sensory and 4 words from the affective categories), and is accompanied by a present pain intensity score and a visual analogue scale rating (Cook et al., 2004; Melzack, 1987). Although pain is a private and personal experience, patients suffering the same or similar pain syndromes can characterize their pain by a distinctive constellation of words. Patients with similar conditions but sometimes divergent backgrounds show remarkable consistency in this choice of words (Grushka & Sessle, 1984; Katz, 1992; Katz & Melzack, 1991; Melzack, 2005). Thus, the MPQ enables different pain syndromes to be reliably discriminated (Dubuisson & Melzack, 1976). However, high levels of anxiety in a patient can diminish the discriminative capacity of the instrument (Atkinson, Kremer, & Ignelzi, 1982; Bélanger et al., 1989; Melzack, Wall, & Ty, 1982).
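The three MPQ indices described above (pain rating index, number of words chosen, and present pain intensity) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the official scoring form: the names `MPQIndices` and `score_mpq` are hypothetical, and the sketch assumes that at most one word is recorded per subclass, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Mapping

N_SUBCLASSES = 20          # 16 original subclasses plus 4 miscellaneous, per the text
MAX_PRESENT_INTENSITY = 5  # 0 = no pain ... 5 = excruciating pain


@dataclass(frozen=True)
class MPQIndices:
    pain_rating_index: int
    number_of_words_chosen: int
    present_pain_intensity: int


def score_mpq(chosen_ranks: Mapping[int, int], present_pain_intensity: int) -> MPQIndices:
    """Compute the three indices from one administration of the questionnaire.

    chosen_ranks maps a subclass number (1-20) to the rank of the word chosen
    in that subclass (1 = the word implying the least pain). Subclasses in
    which no word was chosen are simply omitted. One word per subclass is an
    assumption of this sketch.
    """
    for subclass, rank in chosen_ranks.items():
        if not 1 <= subclass <= N_SUBCLASSES:
            raise ValueError(f"subclass {subclass} is outside 1-{N_SUBCLASSES}")
        if rank < 1:
            raise ValueError("ranks start at 1 for the least painful word")
    if not 0 <= present_pain_intensity <= MAX_PRESENT_INTENSITY:
        raise ValueError("present pain intensity must lie between 0 and 5")
    return MPQIndices(
        pain_rating_index=sum(chosen_ranks.values()),
        number_of_words_chosen=len(chosen_ranks),
        present_pain_intensity=present_pain_intensity,
    )
```

A patient who chooses the third word in subclass 1, the first word in subclass 5, and the second word in subclass 13, and who reports a present pain intensity of 2, would receive a pain rating index of 6 with 3 words chosen.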
Behavioral approaches such as those used to measure pain in animals are relied on to assess pain in infants and preverbal children, and also in adults with poor language capacity or mental confusion (Chapman et al., 1985; Marinov, Mandadjieva, & Kostianev, 2008; McGrath & Unruh, 2002; Ross & Ross, 1984). However, behavioral measures of pain should not replace self-rated measures if the patient is able to rate his or her subjective state of pain. Someone else’s pain cannot be described completely by anyone other than the afflicted individual (Melzack & Katz, 1999). Some patients may be stoic in response to pain and outwardly exhibit a calm demeanor even while experiencing excruciating pain accompanied by autonomic activation, whereas other patients may exaggerate pain behaviors but in fact experience less pain. Therefore, concordance in the assessment of pain may vary between the patient and health care professionals, an effect that may be attributed to the personal and private nature of the pain experience (Choinière et al., 1990). In some cases, physiological approaches (e.g., heart rate, blood pressure, electrodermal activity) can also be used to better correlate pain experience with its behavioral counterpart in patients and to elucidate mechanisms associated with the painful experience (al’Absi, Petersen, & Wittmers, 2002; Chapman et al., 1985; Price, 1988). Nonetheless, even though many physiological and endocrine events can be measured and occur concurrently with pain experience, many of these events are not exclusively related to pain and may appear in response to stress rather than pain per se (Christensen, Brandt, Rem, & Kehlet, 1982).
PAIN MEASUREMENT IN ANIMALS
Early studies on pain were performed in anaesthetized animals and used transient stimuli to produce pain in order to avoid tissue damage. In the past decade, many animal pain models have been developed to study mechanisms underlying persistent pain in awake, behaving animals (for a review, see Dubner & Ren, 1999). Animal subjects are necessary in pain research because they permit manipulation of experimental variables that can lead to a better understanding of pain mechanisms at cellular and subcellular levels and improve analgesic therapies. Moreover, animal models can be used to model certain human pathological conditions (Chapman et al., 1985). The main purpose of these studies is to further elucidate the pathophysiological aspects of pain conditions found in humans and to permit preclinical validation of novel analgesic targets. Experimental pain may be induced in animals with different modalities of noxious stimulation (e.g., heat, mechanical, electrical, and chemical stimulation; for reviews, see Chapman et al., 1985; Dubner & Ren, 1999; Hogan, 2002; Ren & Dubner, 1999; Whiteside, Adedoyin, & Leventhal, 2008). Although animals lack the ability to verbally communicate their pain, they exhibit the same motor behaviors and physiological responses demonstrated by humans following painful stimulation (Chapman et al., 1985; Dubner & Ren, 1999; Whiteside et al., 2008). However, for ethical reasons, animals should be exposed to the minimal pain necessary to carry out the experiment (Dubner, 1983), and they should not be exposed to pain greater than humans themselves would tolerate (Bowd, 1980). Finally, animal studies on pain employ behavioral measures of two types: simple reflex measures and more complex voluntary and intentional behaviors that can be either unlearned or learned (Chapman et al., 1985).
Simple Reflex Measures
Simple reflex measures include the tail-flick, the hot-plate, the mechanical withdrawal reflex, the Hargreaves, and the paw pressure tests (Table 33.3). In most cases, latency to withdraw/escape from noxious stimulation is assessed. The tail-flick test measures the latency for a rodent to withdraw its tail from a heat source (radiant heat or hot water immersion) focused on the tail (D’Amour & Smith, 1941). The hot-plate test measures the latency to an escape response (licking, lifting, or fluttering of the paws) when the animal is placed on a preheated plate (van Eick, 1967). In the mechanical withdrawal test, thermal, mechanical, or electrical stimuli may be employed (Chaplan, Bach, Pogrel, Chung, & Yaksh, 1994). The Hargreaves test measures the latency to paw withdrawal following application of radiant heat to the plantar surface of the paw through the floor of a glass platform (Hargreaves, Dubner, Brown, Flores, & Joris, 1988). One advantage of this latter method is that animals do not need to be restrained, but rather may be placed underneath an inverted plastic cage positioned on the glass platform so that they may move freely. The paw pressure test measures the latency to struggle or vocalize following application of a constant or, more frequently, an increasing mechanical pressure to the hind limb (Randall & Selitto, 1957). It is important to note that in all these simple reflex measures the animal has control over the intensity and duration of the stimulus, ensuring that the animal is not exposed to intolerable levels of pain (Dubner & Ren, 1999).

Organized Unlearned Behaviors
Complex organized unlearned behaviors may be assessed to measure pain because these behaviors require supraspinal sensory processing rather than relying exclusively on simple reflexes (Chapman et al., 1985). Commonly associated behaviors are used to assess nocifensive manifestations following inflammatory, neuropathic, visceral, cancer, or postoperative pain (Bennett & Xie, 1988; Brennan, Vandermeulen, & Gebhart, 1996; Hargreaves et al., 1988; Ness, 1999; Pogatzki, Niemeier, & Brennan, 2002; Schwei et al., 1999; Wacnik et al., 2003). In these models, the animal cannot control the pain that is produced. Therefore, it is important that investigators assess the level of pain in these animals and provide analgesics when doing so does not interfere with the experiment (Dubner & Ren, 1999).

Organized Learned Behaviors
Organized learned behaviors are considered as a separate category because pain is inferred from an animal’s learned or operant responses to escape noxious stimulation. Indeed, the most common method involves an animal escaping from a noxious stimulus by initiating a learned behavior such as crossing a barrier or pressing a bar. For example, electric shock can be delivered to a grid floor in a cage and the animal can learn to jump over a barrier partition to escape this stimulus. It is important to note that these learning procedures give the animal control over the painful stimulus of the experiment (Dubner & Ren, 1999).

Tissue Injury Models of Persistent Pain
Animal models of tissue injury and inflammation have been developed to reproduce features of clinical pain conditions (Whiteside et al., 2008). The formalin test (Dubuisson & Dennis, 1977) and the orofacial formalin test (Clavelou, Dallel, Orliaguet, Woda, & Raboisson, 1995; Clavelou, Pajot, Dallel, & Raboisson, 1989) are commonly employed examples. Other models of more persistent pain employ irritants (e.g., capsaicin, bee venom) or inflammatory agents (e.g., carrageenan or complete Freund’s adjuvant; Hargreaves et al., 1988; LaMotte, Shain, Simone, & Tsai, 1991; Stein, Millan, & Herz, 1988). Animal models of polyarthritis and arthritis have also been developed to attempt to mimic these human conditions (Coderre & Wall, 1987; De Castro Costa, DeSutter, Gybels, & Van Hees, 1981; Okuda, Nakahama, Miyakawa, & Shima, 1984; Schaible, Schmidt, & Willis, 1987; Table 33.4). Furthermore, visceral pain (Chernov, Wilson, Fowler, & Plummer, 1967; Ness & Gebhart, 1988; Table 33.5), cancer pain (Medhurst et al., 2002; Wacnik et al., 2001; Table 33.6), and postoperative pain (Brennan et al., 1996; Pogatzki et al., 2002; Table 33.7) models have been developed as well. These models are well characterized and remain sensitive to analgesics that are effective for suppressing similar pain states in humans.

Nerve Injury Models of Persistent Pain
Nerve injury models have been developed to better understand the mechanisms involved in the development of neuropathic pain in humans.
Several models of nerve ligation have been developed over the past 20 years (Bennett & Xie, 1988; Decosterd & Woolf, 2000; Kim & Chung, 1992; Seltzer, Dubner, & Shir, 1990) and have improved our understanding of mechanisms (e.g., central sensitization; Woolf & Thompson, 1991) that contribute to neuropathic pain states (Table 33.8). Bennett and Xie (1988) developed the first animal model of neuropathic pain, known as the chronic constriction injury (CCI) of the sciatic nerve. The development of animal models of neuropathic pain has had a monumental impact on the pain field by spurring research on the underlying mechanisms, by encouraging the development of additional animal models, and by enabling preclinical validation of novel analgesics that have shown efficacy
TABLE 33.3 Methods for measurement of acute pain in animals

Thermal

Test: Tail-flick test
Description:
• Radiant heat is focused on the tail
• The tail can also be immersed in hot water
Dependent measures:
• Measure the latency for the animal to remove its tail from the heat source
Commentary:
• Simple method that measures spinal nociceptive reflex
• Sensitive to analgesics
References: D’Amour & Smith, 1941

Test: Hot-plate test
Description:
• Animal is placed on a metal surface that is either preheated to a noxious temperature or progressively increases in temperature
Dependent measures:
• Measure the latency to the appearance of evoked responses from the animal (lick, bite, fluttering of hindpaws, and/or jump)
Commentary:
• Licking is a reflex
• Jump reflects an integrated response at the supraspinal level
• Sensitive to analgesics
References: van Eick, 1967

Test: Hargreaves test
Description:
• Heat is applied to the plantar surface of the hind paw through the floor of a glass platform
Dependent measures:
• Measure the latency for the animal to withdraw its paw from a heat source
Commentary:
• Animal is not manually restrained during assessment
• Sensitive to analgesics
References: Hargreaves et al., 1988

Mechanical

Test: Von Frey filament test
Description:
• Filaments of different calibers (forces) are typically applied to the plantar surface of the paw
• Filaments applied through the floor of a wire mesh platform
Dependent measures:
• The threshold of paw withdrawal is related to the force applied by the filament
• Frequency of paw withdrawal to a given filament (force)
Commentary:
• Animal is not manually restrained during assessment
• Sensitive to analgesics
References: Chaplan et al., 1994

Test: Paw pressure test
Description:
• A mechanical stimulator is used to apply a constant noxious pressure or, more often, an increasing pressure to the hind paw of the animal
Dependent measures:
• Behavioral assessments: freezing (no movement), withdrawal of the paw, and vocalization produced by the animal
Commentary:
• Animal must be manually restrained for assessment
• Sensitive to analgesics
References: Randall & Selitto, 1957
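The dependent measures listed in Table 33.3 reduce to two simple computations: a response latency for the thermal tests and a withdrawal frequency for repeated applications of a von Frey filament. The Python sketch below illustrates both. The function names are hypothetical, and the cut-off time (the maximum exposure after which the stimulus is stopped and the cut-off value is recorded) is an assumption added for illustration; the table itself does not specify one.

```python
from typing import Optional, Sequence


def withdrawal_latency(stimulus_onset_s: float,
                       response_time_s: Optional[float],
                       cutoff_s: float) -> float:
    """Latency (seconds) from stimulus onset to the withdrawal response.

    If the animal does not respond (response_time_s is None) or responds
    after the assumed cut-off, the cut-off value itself is returned.
    """
    if response_time_s is None:
        return cutoff_s
    latency = response_time_s - stimulus_onset_s
    if latency < 0:
        raise ValueError("response cannot precede stimulus onset")
    return min(latency, cutoff_s)


def withdrawal_frequency(responses: Sequence[bool]) -> float:
    """Percentage of filament applications that evoked a paw withdrawal."""
    if not responses:
        raise ValueError("at least one application is required")
    return 100.0 * sum(responses) / len(responses)
```

For instance, three withdrawals in five applications of a given filament correspond to a withdrawal frequency of 60%.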
TABLE 33.4 Animal models of tissue injury-induced persistent pain

Test: Formalin test
Description:
• Subcutaneous (dorsal surface) or intraplantar (plantar surface) injection of formalin in the paw
• Injection is unilateral
Dependent measures:
• Duration of licking/biting, lifting, and favoring the injected paw is measured
Commentary:
• Biphasic pain response, characterized by acute and inflammatory phases
• A weighted pain score may be calculated based on time spent in each category of behavior
• This model is commonly used in rats and in mice
References: Dubuisson & Dennis, 1977; Watson, Sufka, & Coderre, 1997

Test: Orofacial formalin test
Description:
• Injection of formalin into the upper lip of a rodent
Dependent measures:
• Duration of time spent rubbing the injected upper lip
Commentary:
• Biphasic response similar to that observed following formalin injection in the paw
• Reliable method to assess trigeminal pain
References: Clavelou et al., 1989; Clavelou et al., 1995

Test: Capsaicin
Description:
• Intraplantar injection of capsaicin in rodents
• Intradermal injection of capsaicin into forearm of humans
Dependent measures:
• Nocifensive behavior (licking, biting, and flinching injected paw)
• Mechanical and thermal hypersensitivity (assessed in rodents with von Frey filaments and Hargreaves testing)
Commentary:
• This model is used in rats, mice, and human and nonhuman primates
References: Simone, Ngeow, Putterman, & LaMotte, 1987; LaMotte et al., 1991

Test: Carrageenan
Description:
• Intraplantar injection of carrageenan
Dependent measures:
• Mechanical and thermal hypersensitivity
• Edema
Commentary:
• Behavioral changes maximal at 2 hr post-injection and typically persist for 24 to 96 hr
• Used in rats and mice
References: Hargreaves et al., 1988

Test: Complete Freund’s adjuvant
Description:
• Intraplantar injection of complete Freund’s adjuvant (CFA) in rodents
Dependent measures:
• Mechanical and thermal hypersensitivity
• Edema
Commentary:
• Inflammation appears 2 hr after injection and is maximal after 6–8 hr
References: Stein, Millan, & Herz, 1988; Iadarola et al., 1988

Test: Acute arthritis
Description:
• Intrajoint injection of sodium urate crystals, carrageenan, kaolin, or other irritants provokes acute inflammation of the joint
Dependent measures:
• Degree of flexion of the joint
• Circumference of the joint
• Mechanical and thermal hypersensitivity
Commentary:
• Sodium urate crystal-induced arthritis is fully developed within 24 hr
References: Okuda et al., 1984; Coderre & Wall, 1987; Schaible et al., 1987

Test: Polyarthritis
Description:
• CFA injected into the tail induces a delayed hypersensitivity
• Inflammation and hyperalgesia of multiple joints occurs after 10 days to 3 weeks
Dependent measures:
• Scratching behavior
• Reduction in motor activity
• Mechanical and thermal hypersensitivity
Commentary:
• Systemic disease develops and includes: skin lesions, destruction of bones and cartilage, liver impairment, and lymphadenopathy
References: De Castro Costa et al., 1981
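Table 33.4 notes that a weighted pain score may be calculated for the formalin test from the time spent in each category of behavior. One way to sketch such a score in Python is as a time-weighted average of category weights. The function name and the specific weights below (0 for normal posture, 1 for favoring, 2 for lifting, 3 for licking/biting) are assumptions chosen for illustration and should be checked against the scoring scheme of the original report before use.

```python
from typing import Mapping

# Assumed category weights, for illustration only.
ASSUMED_WEIGHTS = {
    "normal": 0,
    "favoring": 1,
    "lifting": 2,
    "licking_biting": 3,
}


def weighted_pain_score(seconds_by_behavior: Mapping[str, float],
                        weights: Mapping[str, int] = ASSUMED_WEIGHTS) -> float:
    """Time-weighted average of behavior weights over one observation period."""
    unknown = set(seconds_by_behavior) - set(weights)
    if unknown:
        raise ValueError(f"no weight defined for: {sorted(unknown)}")
    total_seconds = sum(seconds_by_behavior.values())
    if total_seconds <= 0:
        raise ValueError("observation period must have a positive duration")
    weighted_seconds = sum(weights[behavior] * seconds
                           for behavior, seconds in seconds_by_behavior.items())
    return weighted_seconds / total_seconds
```

With these assumed weights, a 300-second observation period split into 120 s of normal posture and 60 s each of favoring, lifting, and licking/biting gives (0 + 60 + 120 + 180) / 300 = 1.2.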
TABLE 33.5 Animal models of visceral pain

Test: Writhing test
Description:
• Injection (i.p.) of noxious irritants such as acetic acid produces a characteristic response (e.g., arched back, abdominal contractions, rolling on one side)
Dependent measures:
• Abdominal contractions and stretching of the body
Commentary:
• Inflammation of the visceral organs and abdominal wall
References: Chernov, Wilson, Fowler, & Plummer, 1967; Helfer & Jaques, 1970; Vyklicky, 1979

Test: Colorectal distension
Description:
• Colorectal distension induced by an inflatable device
• Additionally used on anaesthetized animals to measure tachycardia simultaneously
Dependent measures:
• Muscle contraction or behavioral reaction is measured
• Tachycardia grade related to increasing colorectal distension
Commentary:
• Used in mice and rats
• Stimulus more natural than injections of irritants
• Strong clinical relevance
• Attenuated by analgesics
• Observed cardiovascular or visceromotor responses are reliable and reproducible
References: Ness & Gebhart, 1988; Ness, Randich, & Gebhart, 1991; Ness, 1999

Test: Bladder distension
Description:
• Distension of the bladder in anaesthetized animals
Dependent measures:
• Physiological responses or activation of micturition reflex
Commentary:
• The stimulus is known to affect only one visceral organ, the bladder
References: Gosling, Dixon, & Dunn, 1977; Ness, 1999

Test: Cystitis
Description:
• Cystitis is induced in mice and rats by i.p. injection of cyclophosphamide (anticancer substance) with adverse effects affecting the bladder
• Mustard oil, acetic acid, and other compounds have been used to induce cystitis
Dependent measures:
• In rats, pain behaviors (licking, abdominal contraction, piloerection, arched back) are commonly evaluated
• In mice, mechanical allodynia and locomotion behavior are commonly measured
Commentary:
• Strong clinical relevance
References: McMahon & Abel, 1987; Lanteri-Minet, Don, de Pommery, Michiels, & Menetrey, 1995

Test: Ureteral calculosis
Description:
• Artificial stone (made using dental cement) is implanted in rat ureteral tract
• Animals show discomfort (abdominal contraction and hind limb extension) for 4 days following the implantation
Dependent measures:
• Muscular lumbar hyperalgesia is demonstrated by vocalization of the animal following electrical stimulation of lumbar muscle
Commentary:
• Strong clinical relevance
References: Giamberardino, Vecchiet, & Albe-Fessard, 1990; Giamberardino, Valente, de Bigontina, & Vecchiet, 1995

Test: Myocardial ischemia
Description:
• Temporary obstruction of coronary artery or application of algogenic substances to epicardium of anaesthetized animal
Dependent measures:
• Recordings of neuronal electrical activity
Commentary:
• This model is used to elucidate pain mechanisms related to angina
References: Kumar et al., 1970; Pan & Chen, 2002
TABLE 33.6 Animal models of cancer pain

Test: Carcinoma
Description:
• Neoplastic cells (from mammary gland) are injected into the tibia
• Bone destruction is observed between 10 and 14 days; bone integrity is compromised after 20 days
Dependent measures:
• Mechanical hypersensitivity
Commentary:
• This metastasis cancer model is proposed to study new therapeutic treatments for metastasis pain
• Metastatic bone pain is difficult to control due to fast progression and difficulty in predicting onset and severity
References: Medhurst et al., 2002; Walker et al., 2002

Test: Fibrosarcoma
Description:
• Malignant neoplastic cells are injected into the humerus of mice
• Hyperalgesia appears on the third day after injecting the cells and progresses with tumor development
• Malignant neoplastic cells may also be injected into the calcaneus bone
• Bone destruction appears after six days
Dependent measures:
• Hypersensitivity related to movement
• Mechanical and thermal hypersensitivity
Commentary:
• Humerus cancer model permits evaluation of hypersensitivity in movement
References: Wacnik et al., 2001; Wacnik et al., 2003

Test: Osteosarcoma
Description:
• Malignant neoplastic cells are injected inside the femur of mice
• Cancer lesions develop and destroy the bone
Dependent measures:
• Mechanical hypersensitivity
Commentary:
• Injecting cancer cells in the femur of mice was the first cancer pain model to be developed
• This model suggests that cancer pain is different from inflammatory or neuropathic pain
References: Schwei et al., 1999; Wang & Wang, 2003

Test: Neuropathic cancer pain
Description:
• Sarcoma cells are injected in the proximity of the sciatic nerve in mice
• Tumor growth compresses the nerve
Dependent measures:
• Mechanical and thermal hypersensitivity
Commentary:
• Nerve damage appears progressively by compression and better represents neuropathic cancer pain observed clinically
References: Shimoyama, Tanaka, Hasue, & Shimoyama, 2002
TABLE 33.7 Animal models of postoperative pain

Test: Plantar hindpaw incision
Description:
• 1 cm incision is made on the plantar surface of the hindpaw in rats
• Plantar muscle is cut parallel to muscle fiber
Dependent measures:
• Mechanical hypersensitivity
Commentary:
• Nociceptive behavior and mechanical hypersensitivity are similar to human reaction to wound
• Pain is elevated after surgery and decreases 7 to 10 days later
• Local morphine or anesthetics reduce pain behavior
References: Brennan et al., 1996; Brennan, 1999

Test: Gastrocnemius muscles paw incision
Description:
• Gastrocnemius muscles of the rat paw are sectioned under general anesthesia
Dependent measures:
• Thermal and mechanical hypersensitivity on the plantar face
Commentary:
• This model permits evaluation of mechanisms responsible for secondary hypersensitivity and its role in postoperative persistent pain
• Opiate administration reduces pain behavior
References: Pogatzki et al., 2002

Test: Ovariohysterectomy
Description:
• Removal of ovaries and uterus in rodents
Dependent measures:
• Mechanical and thermal hypersensitivity
• Pain behavior associated with abdominal contractions and stretching of the body
• Painful behaviors are evaluated with a rating scale
Commentary:
• Strong clinical relevance
• Visceral or peritoneal pain may increase painful behaviors
• Analgesics reduce these abnormal behaviors
References: Lascelles, Waterman, Cripps, Livingston, & Henderson, 1995; Gonzalez, Field, Bramwell, McCleary, & Singh, 2000
TABLE 33.8A Animal models of neuropathic pain: traumatic nerve injury

Test: Chronic constriction of sciatic nerve
Description:
• Three loosely constrictive ligatures are placed around the common sciatic nerve
• Animals develop nociceptive behaviors (protection and reduction of weight on the affected limb and lameness)
Dependent measures:
• Mechanical and thermal hypersensitivity
• Cold allodynia
• Spontaneous pain
Commentary:
• Hypersensitivity develops between 10 and 14 days and persists for 8 weeks
• Used in rats and mice
• Termed CCI model
References: Bennett & Xie, 1988

Test: Partial sciatic nerve injury
Description:
• The sciatic nerve is exposed at high-thigh level and 1/3–1/2 of the dorsal thickness of the nerve is trapped in a ligature
• Animals develop nociceptive behaviors (protection and licking of the affected limb)
Dependent measures:
• Mechanical and thermal hypersensitivity
• Ipsilateral and contralateral deficits
Commentary:
• Hypersensitivity can persist for many months
• Used in mice and rats
References: Seltzer et al., 1990

Test: Spinal nerve ligation
Description:
• L5 and L6 spinal nerves are ligated and sectioned distal to the dorsal root ganglion; L4 spinal nerve is intact
• Animals develop nociceptive behaviors (protection and reduction of weight on the affected limb and lameness)
Dependent measures:
• Mechanical and thermal hypersensitivity
• Spontaneous pain
Commentary:
• Termed SNL (Chung) model
• The SNL model, the CCI model, and partial ligation of the sciatic nerve constitute the three most widely used neuropathic pain models
References: Kim & Chung, 1992

Test: Spared nerve injury
Description:
• Two of the three terminal branches of the sciatic nerve (tibial and common peroneal nerves) are ligated and sectioned. The third branch, the sural nerve, is left intact
• Animals develop chronic nociceptive behaviors (modification of position and reduction of weight on the affected limb)
Dependent measures:
• Mechanical and thermal hypersensitivity
• Cold allodynia
• Spontaneous pain
Commentary:
• This model shows that sensitivity of intact small nerve is modified by section of an adjacent nerve fiber
• Rapid onset of cutaneous hypersensitivity
References: Decosterd & Woolf, 2000

Test: Neuritis-induced neuropathic pain
Description:
• Application of complete Freund’s adjuvant to the nerve (usually at the periphery of the sciatic nerve)
Dependent measures:
• Mechanical and thermal hypersensitivity
Commentary:
• This model assesses the contribution of inflammation to the development of neuropathic pain
References: Bennett et al., 2000; Bennett, Everhart, & Hulsebosch, 2000

Test: Saphenous nerve partial ligation
Description:
• 1/3–1/2 of the saphenous nerve is ligated (via three loose ligatures around the circumference of the saphenous nerve)
Dependent measures:
• Mechanical and thermal hypersensitivity
• Cold allodynia
Commentary:
• Neuropathic pain behaviors are observed 3 to 5 days after surgery
• This model is also used in mice; known as chronic constriction of the saphenous nerve
References: Walczak, Pichette, Leblond, Desbiens, & Beaulieu, 2005, 2006
TABLE 33.8B Animal models of neuropathic pain: metabolic and toxic neuropathies

Test: Diabetic neuropathy
Description:
• Injection (i.p.) of streptozotocin in rats provokes destruction of β cells in the pancreas and subsequent development of type 1 diabetes
• Rats develop reductions in locomotor activity
Dependent measures:
• Mechanical and thermal hypersensitivity
Commentary:
• This model is controversial because observations are difficult to attribute to neuropathy: rats show impaired health and obvious signs of discomfort
References: Rakieten, Rakieten, & Nadkarni, 1963; Wuarin Bierman, Zahnd, Kaufmann, Burcklen, & Adler, 1987

Test: Chemotherapy-evoked toxic neuropathy
Description:
• Repeated injections of antitumor agents (vincristine, paclitaxel, or cisplatin) change responsiveness to mechanical and sometimes thermal (cold or heat) stimulation
Dependent measures:
• Mechanical hypersensitivity always observed
• Thermal hyperalgesia or hypoalgesia may be absent or present depending on agent and dose used
Commentary:
• These models are useful for understanding neuropathy induced in patients by treatment with chemotherapeutic agents
References: Aley, Reichling, & Levine, 1996; Authier, Coudore, Eschalier, & Fialip, 1999; Polomano & Bennett, 2001
8/17/09 3:05:08 PM
Pain: Mechanisms and Measurement
for treating neuropathic pain in humans (e.g., gabapentin). The development of neuropathy following neuritis (Bennett, Chastain, & Hulsebosch, 2000), metabolic challenges (Rakieten, Rakieten, & Nadkarni, 1963; Wuarin Bierman, Zahnd, Kaufmann, Burcklen, & Adler, 1987), and chemotherapeutic treatment (Aley, Reichling, & Levine, 1996; Authier, Coudore, Eschalier, & Fialip, 1999; Polomano & Bennett, 2001) has also been studied (Table 33.8).
PHARMACOLOGICAL MANAGEMENT OF PAIN
The main concern of clinical health professionals is to improve the management of pain in their patients. However, available pharmacotherapies for the management of pain are inadequate, and multiple factors (e.g., unwanted side-effect profiles) must also be considered (Table 33.9) to optimize the efficacy of treatment. Pain is a multifactorial phenomenon requiring interdisciplinary approaches. Multimodal analgesia combines several analgesics, administered by the same or different routes, to achieve more effective relief than the same analgesics administered separately. For example, the combination of desipramine (an antidepressant) with morphine enhances the efficacy of the narcotic analgesic for the control of postoperative pain (Levine, Gordon, Smith, & McBryde, 1986). Furthermore, low back pain is effectively relieved by the combination of tramadol with acetaminophen (Perrot, Krause, Crozes, & Naïm, 2006). Musculoskeletal pain is similarly reduced by the combination of acetaminophen with hydrocodone (Hewitt et al., 2007). Different pharmacological approaches exist for the treatment of pain, such as the use of opioids, nonsteroidal anti-inflammatory drugs (NSAIDs), anticonvulsants, antidepressants, ketamine, and others (for review, see Guindon, Walczak, & Beaulieu, 2007), although adverse side-effects remain the main constraint (Table 33.10). Furthermore, along with standard routes of administration, novel drug delivery approaches have become available in the past few years, including transdermal patches, oral mucosal sprays, intranasal instillation, rectal suppositories, and others (Table 33.10). The literature indicates that multimodal approaches are associated with an increase in patient satisfaction and a reduction in side-effects compared with single analgesic techniques in pain management (Brodner et al., 2001; Mugabure Bujedo, Tranque Bizueta, Gonzalez Santos, & Adrian Garde, 2007; Pyati & Gan, 2007).

TABLE 33.9 Multiple factors influencing pharmacological treatment of pain
• Cultural and religious belief
• Personal experience
• Social/family environment
• Previous medical history
• Pain intensity
• Reduced work status
• Interference with leisure activity
• Other diseases interacting
• Interactions with other drugs
• Toxicity
• Cost
• Patient acceptance and compliance
• Patient expectations and beliefs about the cause of pain
TABLE 33.10 Different pharmacological treatments and administrations for pain relief

Drug: Opioids. Morphine (or alternative opioid: hydromorphone, fentanyl, remifentanil, alfentanil, sufentanil, meperidine, buprenorphine, butorphanol, etc.)
Indications: Treatment of pain such as: • Neuropathic • Inflammatory • Cancer • Acute • Postoperative
Route of Administration: • Oral • Intravenous • Transdermal (patch) • Sublingual spray • Intranasal spray • Oral transmucosal • Pulmonary • Microspheres
Adverse Effects: • Respiratory depression • Sedation • Nausea and vomiting • Constipation • Cognitive dysfunction • Pruritus • Tolerance/dependence • Euphoria
Contraindications: • Screen patients for alcohol/substance abuse; co-administer preemptive stool softeners and antiemetics

Drug: NSAIDs. Traditional: diclofenac, ketorolac, ketoprofen, ibuprofen, naproxen
Indications: • Analgesics • Anti-inflammatory agents
Route of Administration: • Oral • Topical
Adverse Effects: • Gastrointestinal disturbances • Renal • Skin reactions
Contraindications: • Patients with gastrointestinal and renal complications

Drug: NSAIDs. Coxibs: celecoxib, etoricoxib, lumiracoxib, parecoxib
Indications: Relief of: • Osteoarthritis • Rheumatoid arthritis • Acute or postoperative pain
Route of Administration: • Oral
Adverse Effects: • Cardiac (myocardial infarction and stroke) • Gastrointestinal associated with long-term use • Renal (acute renal failure)
Contraindications: • Patients with cardiovascular and cerebrovascular disease • Caution in patients with hypertension, hyperlipidaemia, diabetes, arterial disease, or smoking

Drug: Antidepressants. Tricyclic (imipramine); newer (venlafaxine, duloxetine, bupropion)
Indications: • Neuropathic pain
Route of Administration: • Oral
Adverse Effects: • Sedation • Constipation • Dry mouth • Orthostatic hypotension • Weight gain with tricyclics • Ataxia, nausea, and anorexia with newer antidepressants
Contraindications: • Patients with glaucoma and/or taking monoamine oxidase inhibitors • Duloxetine has been approved by the US FDA for use in diabetic neuropathy

Drug: Anticonvulsants. Gabapentin, pregabalin, lamotrigine
Indications: • Diabetic neuropathy • Postherpetic neuralgia • Trigeminal neuralgia
Route of Administration: • Oral
Adverse Effects: • Sedation • Ataxia • Diplopia • Edema • Weight gain
Contraindications: • Patients with renal dysfunction need a dose adjustment

Drug: Cannabinoids. Cannabis, nabilone, dronabinol, Δ9-THC/CBD
Indications: • Acute • Chronic pain
Route of Administration: • Oral • Sublingual spray • Inhalation
Adverse Effects: • Euphoria • Tolerance • Memory impairment • Tachycardia
Contraindications: • Patients with hypertension and ischemic heart disease

Drug: Local anesthetics. Lidocaine, bupivacaine, and others
Indications: • Blocking evoked pain
Route of Administration: • Local/regional • Transdermal (patch) • Intravenous • Neuraxial (spinal, epidural)
Adverse Effects: • Convulsions • Coma • Skin erythema • Rash • Cardiorespiratory depression with increasing doses
Contraindications: • Locoregional anesthesia problems: non-consenting patients, local infection, coagulation disorders, inadequate monitoring

NONPHARMACOLOGICAL MANAGEMENT OF PAIN
Less than 50 years ago, neurosurgery was a common approach employed to manage severe or chronic pain. Indeed, meticulous surgery that excised the injured area of a peripheral nerve and grafted in a new nerve section was performed frequently. However, this procedure failed to alleviate pain and in some cases exacerbated it (Noordenbos & Wall, 1981). In addition, the cordotomy procedure (cutting tracts of the spinal cord) was used on terminally ill cancer patients to reduce severe cancer pain (Ischia, Ischia, Luzzani, Toscano, & Steele, 1985; Ischia, Luzzani, Ischia, & Pacini, 1984). This procedure was recommended only for terminally ill cancer patients because the pain eventually returns and is frequently accompanied by unpleasant sensations and incontinence (Ischia et al., 1984, 1985). Overall, surgery rarely achieves long-term control of pain, and recurrence of pain is common. This finding is unsurprising because surgical section disrupts the normal patterns of input to the central nervous system (e.g., resulting in abnormal bursting activity in the deafferented central cells that persists long after the surgery; Melzack & Loeser, 1978). Moreover, the complexity of brain activity and its plasticity argue against a simple surgical solution for pain problems.
A more promising approach consists of continuous nerve blockade to reduce evoked pain. In patients receiving continuous peripheral nerve blocks, this technique reduces nausea and vomiting while increasing satisfaction; rehabilitation is improved and the incidence of postsurgical chronic pain syndromes is greatly decreased (Boezaart, 2006). At-home treatment is also possible, increasing patient satisfaction and comfort (Ilfeld & Enneking, 2005). Furthermore, nonpharmacological options such as massage, acupuncture, heat therapy, relaxation, and transcutaneous electrical nerve stimulation (TENS) can be considered as adjuvants to conventional analgesia and incorporated into an effective pain management regimen in some patients (Pyati & Gan, 2007). It is also possible to reduce clinical pain by using cognitive therapies, although these therapies are not sufficient on their own and need to be used in conjunction with the proper medication. For example, in patients suffering from posttraumatic stress syndrome (Muse, 1985, 1986) and ovarian cancer (Montazeri, McEwen, & Gillis, 1996), psychological counseling has been reported to improve quality of life and reduce pain. However, there are no perfect therapies, and the effectiveness of each can vary depending on the disease and the patient. Although each therapy has its own specific limitations related to the disease/patient context (see Table 33.10), it is noteworthy that two or more therapies given in combination can produce additive or synergistic effects.
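The distinction between additive and synergistic combinations can be made concrete with a small numerical sketch. The chapter does not specify how the distinction is quantified; the example below assumes one common convention, the Loewe additivity interaction index, in which the doses used in combination are expressed as fractions of the doses each drug would need alone to produce the same effect. The function names, the tolerance band, and all dose values are hypothetical and purely illustrative.

```python
def interaction_index(dose_a, dose_b, alone_a, alone_b):
    """Loewe interaction index for a two-drug combination.

    dose_a, dose_b: doses given together that reach the target effect.
    alone_a, alone_b: dose of each drug that reaches the same effect alone.
    """
    if alone_a <= 0 or alone_b <= 0:
        raise ValueError("equieffective single-drug doses must be positive")
    return dose_a / alone_a + dose_b / alone_b


def classify(index, tolerance=0.1):
    """Label a combination; the tolerance band is an arbitrary illustration."""
    if index < 1 - tolerance:
        return "synergistic"   # less total drug needed than additivity predicts
    if index > 1 + tolerance:
        return "antagonistic"  # more total drug needed than additivity predicts
    return "additive"


# Hypothetical doses (arbitrary units): each drug alone needs 8 and 40 units.
print(classify(interaction_index(2.0, 10.0, 8.0, 40.0)))
print(classify(interaction_index(4.0, 20.0, 8.0, 40.0)))
```

An index near 1 corresponds to the additive case, and an index clearly below 1 to the synergistic case in which lower doses of each agent, and potentially fewer dose-related side-effects, achieve the same relief. In practice the equieffective doses would come from measured dose-response curves with their own uncertainty, which this sketch ignores.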
SUMMARY
Great advances have been made in the past several decades in defining pain and understanding its underlying mechanisms. Further research is nonetheless necessary to reduce unnecessary suffering in chronic pain patients. Although many treatment modalities are being used to reduce and alleviate pain, many basic research and clinical questions remain unanswered. These gaps in our knowledge may be attributed to the complexity of the pain mechanisms involved. Filling these gaps may identify previously unrecognized therapeutic targets. In the coming years, advances in our preclinical and clinical understanding of pain mechanisms are expected, which should provide an impetus for improving pharmacotherapies for chronic pain. Treatment paradigms are shifting from single to multiple therapies that combine medications with distinct mechanisms of action and/or combine medications with nontraditional therapies. Targeting multiple analgesic mechanisms simultaneously holds promise for attaining a more complete attenuation of pain with a more limited spectrum of unwanted side-effects.
REFERENCES
al’Absi, M., Petersen, K. L., & Wittmers, L. E. (2002). Adrenocortical and hemodynamic predictors of pain perception in men and women. Pain, 96, 197–204.
Aley, K. O., Reichling, D. B., & Levine, J. D. (1996). Vincristine hyperalgesia in the rat: A model of painful vincristine neuropathy in humans. Neuroscience, 73, 259–265.
Atkinson, J. H., Kremer, E. F., & Ignelzi, R. J. (1982). Diffusion of pain language with affective disturbance confounds differential diagnosis. Pain, 12, 375–384.
Authier, N., Coudore, F., Eschalier, A., & Fialip, J. (1999). Pain related behavior during vincristine-induced neuropathy in rats. NeuroReport, 10, 965–968.
Basbaum, A. I., & Fields, H. L. (1978). Endogenous pain control mechanisms: Review and hypothesis. Annals of Neurology, 4, 451–462.
Basbaum, A. I., Marley, N. J. E., O’Keefe, J., & Clanton, C. H. (1977). Reversal of morphine and stimulus produced analgesia by subtotal spinal cord lesions. Pain, 3, 43–56.
Beecher, H. K. (1959). Measurement of subjective responses. New York: Oxford University Press.
Bélanger, E., Melzack, R., & Lauzon, P. (1989). Pain of first-trimester abortion: A study of psychosocial and medical predictors. Pain, 36, 339–350.
Bennett, A. D., Chastain, K. M., & Hulsebosch, C. E. (2000). Alleviation of mechanical and thermal allodynia by CGRP(8–37) in a rodent model of chronic central pain. Pain, 86, 163–175.
Bennett, A. D., Everhart, A. W., & Hulsebosch, C. E. (2000). Intrathecal administration of an NMDA or a non-NMDA receptor antagonist reduces mechanical but not thermal allodynia in a rodent model of chronic central pain after spinal cord injury. Brain Research, 859, 72–82.
Bennett, G. J., & Xie, Y. K. (1988). A peripheral mononeuropathy in rat that produces disorders of pain sensation like those seen in man. Pain, 33, 87–107.
Besson, J. M., & Chaouch, A. (1987). Peripheral and spinal mechanisms of nociception. Physiological Reviews, 67, 67–186.
Bie, B., & Pan, Z. Z. (2007). Trafficking of central opioid receptors and descending pain inhibition. Molecular Pain, 3, 37.
Bingel, U., Schoell, E., & Büchel, C. (2007). Imaging pain modulation in health and disease. Current Opinion in Neurology, 20, 424–431.
Bishop, G. H. (1946). Neural mechanisms of cutaneous sense. Physiological Reviews, 26, 77–102.
Bishop, G. H. (1959). The relation between nerve fiber size and sensory modality: Phylogenetic implications of the afferent innervations of the cortex. Journal of Nervous and Mental Disease, 128, 89–114.
Boezaart, A. P. (2006). Perineural infusion of local anesthetics. Anesthesiology, 104, 872–880.
Bonica, J. J. (1953). The management of pain. Philadelphia: Lea and Febiger.
Bonica, J. J. (1974). Organization and function of a pain clinic. In J. J. Bonica (Ed.), Advances in neurology (pp. 433–443). New York: Raven Press.
Boring, E. G. (1942). Sensation and perception in the history of experimental psychology. New York: Appleton-Century-Crofts.
Bowd, A. D. (1980). Ethics and animal experimentation. American Psychologist, 35, 224–225.
Brennan, F., Carr, D. B., & Cousins, M. (2007). Pain management: A fundamental human right. Anesthesia and Analgesia, 105, 205–221.
Brennan, T. J. (1999). Postoperative models of nociception. Animal models of pain. ILAR Journal, 40, 129–136.
Brennan, T. J., Vandermeulen, E. P., & Gebhart, G. F. (1996). Characterization of a rat model of incisional pain. Pain, 64, 493–501.
Briggs, M. (1996). Surgical wound pain: A trial of two treatments. Journal of Wound Care, 5, 456–460.
Brodner, G., Van Aken, H., Hertle, L., Fobker, M., Von Eckardstein, A., Goeters, C., et al. (2001). Multimodal perioperative management, combining thoracic epidural analgesia, forced mobilization, and oral nutrition, reduces hormonal and metabolic stress and improves convalescence after major urologic surgery. Anesthesia and Analgesia, 92, 1594–1600.
Chaplan, S. R., Bach, F. W., Pogrel, J. W., Chung, J. M., & Yaksh, T. L. (1994). Quantitative assessment of tactile allodynia in the rat paw. Journal of Neuroscience Methods, 53, 55–63.
Chapman, C. R., Casey, K. L., Dubner, R., Foley, K. M., Gracely, R. H., & Reading, A. E. (1985). Pain measurement: An overview. Pain, 22, 1–31.
Chenot, J. F., Leonhardt, C., Keller, S., Scherer, M., Donner-Banzhoff, N., Pfingsten, M., et al. (2008). The impact of specialist care for low back pain on health service utilization in primary care patients: A prospective cohort study. European Journal of Pain, 12, 275–283.
Chernov, H. I., Wilson, D. E., Fowler, W. F., & Plummer, A. J. (1967). Non-specificity of the mouse writhing test. Archives Internationales de Pharmacodynamie et de Thérapie, 167, 171–178.
Choinière, M., & Amsel, R. (1996). A visual analogue thermometer for measuring pain intensity. Journal of Pain and Symptom Management, 11, 299–311.
Choinière, M., Melzack, R., Girard, N., Rondeau, J., & Paquin, M. J. (1990). Comparisons between patients’ and nurses’ assessments of pain and medication efficacy in severe burn injuries. Pain, 40, 143–152.
Christensen, P., Brandt, M. R., Rem, J., & Kehlet, H. (1982). Influence of extradural morphine on the adrenocortical and hyperglycaemic response to surgery. British Journal of Anaesthesia, 54, 23–27.
Clavelou, P., Dallel, R., Orliaguet, T., Woda, A., & Raboisson, P. (1995). The orofacial formalin test in rats: Effects of different formalin concentrations. Pain, 62, 295–301.
Clavelou, P., Pajot, J., Dallel, R., & Raboisson, P. (1989). Application of the formalin test to the study of orofacial pain in the rat. Neuroscience Letters, 103, 349–353.
Coderre, T. J., & Wall, P. D. (1987). Ankle joint urate arthritis (AJUA) in rats: An alternative animal model of arthritis to that produced by Freund’s adjuvant. Pain, 28, 379–393.
Cohen, S. D., Patel, S. S., Khetpal, P., Peterson, R. A., & Kimmel, P. L. (2007). Pain, sleep disturbance, and quality of life in patients with chronic kidney disease. Clinical Journal of the American Society of Nephrology, 2, 919–925.
Comings, D. E., & Amromin, G. D. (1974). Autosomal dominant insensitivity to pain with hyperplastic myelinopathy and autosomal dominant indifference to pain. Neurology, 24, 838–848.
Cook, A. J., Roberts, D. A., Henderson, M. D., VanWinkle, L. C., Chastain, D. C., & Hamill-Ruth, R. J. (2004). Electronic pain questionnaires: A randomized, crossover comparison with paper questionnaires for chronic pain assessment. Pain, 110, 310–317.
Cox, J. J., Reimann, F., Nicholas, A. K., Thornton, G., Roberts, E., Springell, K., et al. (2006, December 14). An SCN9A channelopathy causes congenital inability to experience pain. Nature, 444, 894–898.
Craggs, J. G., Price, D. D., Verne, G. N., Perlstein, W. M., & Robinson, M. M. (2007). Functional brain interactions that serve cognitive-affective processing during pain and placebo analgesia. NeuroImage, 38, 720–729.
Craig, A. D., & Bushnell, M. C. (1994, July 8). The thermal grill illusion: Unmasking the burn of cold pain. Science, 265, 252–255.
Dallenbach, K. M. (1939). Pain: History and present status. American Journal of Psychology, 52, 331–347.
D’Amour, F. E., & Smith, D. L. (1941). A method for determining loss of pain sensation. Journal of Pharmacology and Experimental Therapeutics, 72, 74–79.
Daoust, R., Beaulieu, P., Manzini, C., Chauny, J. M., & Lavigne, G. (2008). Estimation of pain intensity in emergency medicine: A validation study. Pain, 138, 565–570.
De Castro Costa, M., DeSutter, P., Gybels, J., & Van Hees, J. (1981). Adjuvant-induced arthritis in rats: A possible animal model of chronic pain. Pain, 10, 173–186.
Decosterd, I., & Woolf, C. J. (2000). Spared nerve injury: An animal model of persistent peripheral neuropathic pain. Pain, 87, 149–158.
DeLeo, J. A., & Yezierski, R. P. (2001). The role of neuroinflammation and neuroimmune activation in persistent pain. Pain, 90, 1–6.
Descartes, R. (1664). L’homme translated by M. Foster in 1901. Lectures on the history of physiology during the 16th, 17th, and 18th centuries. Cambridge: Cambridge University Press.
Dubner, R. (1983). Pain research in animals. Annals of the New York Academy of Sciences, 406, 128–132.
Dubner, R., & Ren, K. (1999). Assessing transient and persistent pain in animals. In P. D. Wall & R. Melzack (Eds.), Textbook of pain (4th ed., pp. 359–369). London: Churchill Livingstone.
Dubuisson, D., & Dennis, S. G. (1977). The formalin test: A quantitative study of the analgesic effects of morphine, meperidine, and brain stem stimulation in rats and cats. Pain, 4, 161–174.
Dubuisson, D., & Melzack, R. (1976). Classification of clinical pain descriptors by multiple group discriminant analysis. Experimental Neurology, 51, 480–487.
Eija, K., Tiina, T., & Pertti, N. J. (1996). Amitriptyline effectively relieves neuropathic pain following treatment of breast cancer. Pain, 64, 293–302.
Ekblom, A., & Hansson, P. (1988). Pain intensity measurements in patients with acute pain receiving afferent stimulation. Journal of Neurology, Neurosurgery, and Psychiatry, 51, 481–486.
Fields, H. L. (1987). Pain. New York: McGraw-Hill.
Fields, H. L. (1999). Pain: An unpleasant topic. Pain (Suppl. 6), S61–S69.
Fields, H. L., Basbaum, A. I., & Heinricher, M. M. (2006). Central nervous system mechanisms of pain modulation. In S. B. McMahon & M. Koltzenburg (Eds.), Wall & Melzack’s textbook of pain (5th ed., pp. 125–142). Edinburgh: Elsevier.
Gagliese, L., & Melzack, R. (1997). Age differences in the quality of chronic pain: A preliminary study. Pain Research and Management, 2, 157–162.
Garcia-Larrea, L., & Magnin, M. (2008). Pathophysiology of neuropathic pain: Review of experimental models and proposed mechanisms. Presse Médicale, 37, 315–340.
Giamberardino, M. A., Valente, R., de Bigontina, P., & Vecchiet, L. (1995). Artificial ureteral calculosis in rats: Behavioral characterization of visceral pain episodes and their relationship with referred lumbar muscle hyperalgesia. Pain, 61, 459–469.
Giamberardino, M. A., Vecchiet, L., & Albe-Fessard, D. (1990). Comparison of the effects of ureteral calculosis and occlusion on muscular sensitivity to painful stimulation in rats. Pain, 43, 227–234.
Goldscheider, A. (1894). Über den Schmerz in physiologischer und klinischer Hinsicht. Berlin, Germany: Hirschwald.
Gonzalez, M. I., Field, M. J., Bramwell, S., McCleary, S., & Singh, L. (2000). Ovariohysterectomy in the rat: A model of surgical pain for evaluation of pre-emptive analgesia? Pain, 88, 79–88.
Gosling, J. A., Dixon, J. S., & Dunn, M. (1977). The structure of the rabbit urinary bladder after experimental distension. Investigative Urology, 14, 386–389.
Greenwald, H. P. (1991). Interethnic differences in pain perception. Pain, 44, 157–163.
Grushka, M., & Sessle, B. J. (1984). Applicability of the McGill Pain Questionnaire to the differentiation of toothache pain. Pain, 19, 49–57.
Guarino, A. H., & Myers, J. C. (2007). An assessment protocol to guide opioid prescriptions for patients with chronic pain. Missouri Medicine, 104, 513–516.
Guindon, J., Walczak, J. S., & Beaulieu, P. (2007). Recent advances in the pharmacological management of pain. Drugs, 67, 2121–2133.
Hagbarth, K. E., & Kerr, D. I. (1954). Central influences on spinal afferent conduction. Journal of Neurophysiology, 17, 295–307.
Hargreaves, K., Dubner, R., Brown, F., Flores, C., & Joris, J. (1988). A new and sensitive method for measuring thermal nociception in cutaneous hyperalgesia. Pain, 32, 77–88.
Helfer, H., & Jaques, R. (1970). The duration of action of antinociceptive agents as determined in mice using the arachidonic acid writhing test. Pharmacology, 4, 163–168.
Hewitt, D. J., Todd, K. H., Xiang, J., Jordan, D. M., Rosenthal, N. R., & CAPSS-216 Study Investigators. (2007). Tramadol/acetaminophen or hydrocodone/acetaminophen for the treatment of ankle sprain: A randomized, placebo-controlled trial. Annals of Emergency Medicine, 49, 468–480.
Hogan, Q. (2002). Animal pain models. Regional Anesthesia and Pain Medicine, 27, 385–401.
Huskisson, E. C. (1983). Visual analogue scales. In R. Melzack (Ed.), Pain measurement and assessment (pp. 33–37). New York: Raven Press.
Iadarola, M. J., Brady, L. S., Draisci, G., & Dubner, R. (1988). Enhancement of dynorphin gene expression in spinal cord following experimental inflammation: Stimulus specificity, behavioral parameters and opioid receptor binding. Pain, 35, 313–326.
Ilfeld, B. M., & Enneking, F. K. (2005). Continuous peripheral nerve blocks at home: A review. Anesthesia and Analgesia, 100, 1822–1833.
Ischia, S., Ischia, A., Luzzani, A., Toscano, D., & Steele, A. (1985). Results up to death in the treatment of persistent cervico-thoracic (Pancoast) and thoracic malignant pain by unilateral percutaneous cervical cordotomy. Pain, 21, 339–355.
Ischia, S., Luzzani, A., Ischia, A., & Pacini, L. (1984). Role of unilateral percutaneous cervical cordotomy in the treatment of neoplastic vertebral pain. Pain, 19, 123–131.
Jenkinson, C., Carroll, D., Egerton, M., Frankland, T., McQuay, H., & Nagle, C. (1995). Comparison of the sensitivity to change of long and short form pain measures. Quality of Life Research, 4, 353–357.
Jensen, M. P., & Karoly, P. (1992). Self-report scales and procedures for assessing pain in adults. In D. C. Turk & R. Melzack (Eds.), Handbook of pain assessment (pp. 135–151). New York: Guilford Press.
Joyce, C. R., Zutshi, D. W., Hrubes, V., & Mason, R. M. (1975). Comparison of fixed interval and visual analogue scales for rating chronic pain. European Journal of Clinical Pharmacology, 8, 415–420.
Julius, D., & Basbaum, A. I. (2001, September 13). Molecular mechanisms of nociception. Nature, 413, 203–210.
Kandel, E. R. (1985). Central representations of pain and analgesia. In E. R. Kandel & J. H. Schwartz (Eds.), Principles of neural science (pp. 331–343). New York: Elsevier.
Katz, J. (1992). Psychophysical correlates of phantom limb experience. Journal of Neurology, Neurosurgery, and Psychiatry, 55, 811–821.
Katz, J., & Melzack, R. (1991). Auricular TENS reduces phantom limb pain. Journal of Pain and Symptom Management, 6, 73–83.
Kim, S. H., & Chung, J. M. (1992). An experimental model for peripheral neuropathy produced by segmental spinal nerve ligation in the rat. Pain, 50, 355–363.
Kremer, E., & Atkinson, J. H. (1983). Pain language as a measure of affect in chronic pain patients. In R. Melzack (Ed.), Pain measurement and assessment (pp. 119–127). New York: Raven Press.
Kumar, B., Hood, W. B., Jr., Joison, J., Gilmour, D. P., Norman, J. C., & Abelmann, W. H. (1970). Experimental myocardial infarction: VI. Efficacy and toxicity of digitalis in acute and healing phase in intact conscious dogs. Journal of Clinical Investigation, 49, 358–364.
LaMotte, R. H., Shain, C. N., Simone, D. A., & Tsai, E. F. (1991). Neurogenic hyperalgesia: Psychophysical studies of underlying mechanisms. Journal of Neurophysiology, 66, 190–211.
Lanteri-Minet, M., Bon, K., dePommery, J., Michiels, J. F., & Menetrey, D. (1995). Cyclophosphamide cystitis as a model of visceral pain in rats: Model elaboration and spinal structures involved as revealed by the expression of c-Fos and Krox-24 proteins. Experimental Brain Research, 105, 220–232.
Lascelles, B. D., Waterman, A. E., Cripps, P. J., Livingston, A., & Henderson, G. (1995). Central sensitization as a result of surgical pain: Investigation of the pre-emptive value of pethidine for ovariohysterectomy in the rat. Pain, 62, 201–212.
Levine, J. D., Gordon, N. C., Smith, R., & McBryde, R. (1986). Desipramine enhances opiate postoperative analgesia. Pain, 27, 45–49.
Livingstone, W. K. (1943). Pain mechanisms. New York: Macmillan.
Livingstone, W. K. (1998). Pain and suffering. Seattle: IASP Press.
Loeser, J. D. (2001). Tic douloureux. Pain Research and Management, 6, 156–165.
Love, A., Leboeuf, D. C., & Crisp, T. C. (1989). Chiropractic chronic low back pain sufferers and self-report assessment methods: Pt. I. A reliability study of the Visual Analogue Scale, the pain drawing and the McGill Pain Questionnaire. Journal of Manipulative and Physiological Therapeutics, 12, 21–25.
Manfredi, M., Bini, G., Cruccu, G., Accornero, N., Berardelli, A., & Medolago, L. (1981). Congenital absence of pain. Archives of Neurology, 38, 507–511.
Marchand, S. (1998). Le phénomène de la douleur. Montreal, Quebec, Canada: Chenelière/McGraw-Hill.
Marinov, B., Mandadjieva, S., & Kostianev, S. (2008). Pictorial and verbal category-ratio scales for effort estimation in children. Child: Care, Health, and Development, 34, 35–43.
Marker, C. L., Lujan, R., Colon, J., & Wickman, K. (2006). Distinct populations of spinal cord lamina II interneurons expressing G-protein-gated potassium channels. Journal of Neuroscience, 26, 12251–12259.
Marshall, H. R. (1894). Pain, pleasure and aesthetics. London: Macmillan.
Mason, P. (2005). Ventromedial medulla: Pain modulation and beyond. Journal of Comparative Neurology, 493, 2–8.
McGrath, P. J., & Unruh, A. M. (2002). The social context of neonatal pain. Clinics in Perinatology, 29, 555–572.
McMahon, S. B., & Abel, C. (1987). A model for the study of visceral pain states: Chronic inflammation of the chronic decerebrate rat urinary bladder by irritant chemicals. Pain, 28, 109–127.
Medhurst, S. J., Walker, K., Bowes, M., Kidd, B. L., Glatt, M., Muller, M., et al. (2002). A rat model of bone cancer pain. Pain, 96, 129–140.
Melzack, R. (1975). The McGill Pain Questionnaire: Major properties and scoring methods. Pain, 1, 277–299.
Melzack, R. (1983). Pain measurement and assessment. New York: Raven Press.
Melzack, R. (1987). The short-form McGill Pain Questionnaire. Pain, 30, 191–197.
Melzack, R. (1990a). Phantom limbs and the concept of a neuromatrix. Trends in Neurosciences, 13, 88–92.
Melzack, R. (1990b). The tragedy of needless pain. Scientific American, 262, 27–33.
Melzack, R. (1999). From the gate to the neuromatrix. Pain (Suppl. 6), S121–S126.
Melzack, R. (2005). The McGill Pain Questionnaire: From description to measurement. Anesthesiology, 103, 199–202.
Melzack, R., & Casey, K. L. (1968). Sensory, motivational, and central control determinants of pain: A new conceptual model. In D. Kenshalo (Ed.), The skin senses (pp. 422–443). Springfield: Thomas.
Melzack, R., & Katz, J. (1999). Pain measurement in persons in pain. In P. D. Wall & R. Melzack (Eds.), Textbook of pain (4th ed., pp. 409–426). London: Churchill Livingstone.
Melzack, R., & Loeser, J. D. (1978). Phantom body pain in paraplegics: Evidence for a central “pattern generating mechanism” for pain. Pain, 4, 195–210.
Melzack, R., Stotler, W. A., & Livingston, W. K. (1958). Effects of discrete brainstem lesions in cats on perception of noxious stimulation. Journal of Neurophysiology, 21, 353–367.
Melzack, R., & Torgerson, W. S. (1971). On the language of pain. Anesthesiology, 34, 50–59.
Melzack, R., & Wall, P. D. (1965, November 19). Pain mechanisms: A new theory. Science, 150, 971–979.
Melzack, R., & Wall, P. D. (1983). The challenge of pain. New York: Basic Books.
Melzack, R., & Wall, P. D. (1991). The challenge of pain. London: Penguin Books.
Melzack, R., Wall, P. D., & Ty, T. C. (1982). Acute pain in an emergency clinic: Latency of onset and description patterns related to different injuries. Pain, 14, 33–43.
Merskey, H., & Bogduk, N. (1994). Part III: Pain terms, a current list with definitions and notes on usage. In H. Merskey & N. Bogduk (Eds.), Classification of chronic pain (2nd ed., pp. 209–214). Seattle, WA: IASP Press.
Millan, M. J. (1999). The induction of pain: An integrative review. Progress in Neurobiology, 57, 1–164.
Millan, M. J. (2002). Descending control of pain. Progress in Neurobiology, 66, 355–474.
Moisset, X., & Bouhassira, D. (2007). Brain imaging of neuropathic pain. NeuroImage, 37 (Suppl. 1), S80–S88.
Molton, I. R., Jensen, M. P., Nielson, W., Cardenas, D., & Ehde, D. M. (2008). A preliminary evaluation of the motivational model of pain self-management in persons with spinal cord injury-related pain. Journal of Pain, 9, 606–612.
Montazeri, A., McEwen, J., & Gillis, C. R. (1996). Quality of life in patients with ovarian cancer: Current state of research. Supportive Care in Cancer, 4, 169–179.
Moskowitz, M. A. (1992). Neurogenic versus vascular mechanisms of sumatriptan and ergot alkaloids in migraine. Trends in Pharmacological Sciences, 13, 307–311.
Müller, J. (1842). Elements of physiology. London: Taylor.
Mugabure Bujedo, B., Tranque Bizueta, I., Gonzalez Santos, S., & Adrian Garde, R. (2007). Multimodal approaches to postoperative pain management and convalescence. Revista Española de Anestesiología y Reanimación, 54, 29–40.
Muse, M. (1985). Stress-related, posttraumatic chronic pain syndrome: Criteria for diagnosis, and preliminary report on prevalence. Pain, 23, 295–300.
Muse, M. (1986). Stress-related, posttraumatic chronic pain syndrome: Behavioral treatment approach. Pain, 25, 389–394.
Ness, T. J. (1999). Models of visceral nociception. ILAR Journal, 40, 119–128.
Ness, T. J., & Gebhart, G. F. (1988). Colorectal distension as a noxious visceral stimulus: Physiologic and pharmacologic characterization of pseudaffective reflexes in the rat. Brain Research, 450, 153–169.
Ness, T. J., Randich, A., & Gebhart, G. F. (1991). Further behavioral evidence that colorectal distension is a “noxious” visceral stimulus in rats. Neuroscience Letters, 131, 113–116.
Nielsen, C. S., Price, D. D., Vassend, O., Stubhaug, A., & Harris, J. R. (2005). Characterizing individual differences in heat-pain sensitivity. Pain, 119, 65–74.
Noordenbos, W. (1959). Pain. Amsterdam: Elsevier Press.
Noordenbos, W., & Wall, P. D. (1981). The failure of nerve grafts to relieve pain following nerve injury. Journal of Neurology, Neurosurgery, and Psychiatry, 44, 1068–1073.
Okuda, K., Nakahama, H., Miyakawa, H., & Shima, K. (1984). Arthritis induced in cats by sodium urate: A possible animal model for chronic pain. Pain, 18, 287–297.
Pan, H. L., & Chen, S. R. (2002). Myocardial ischemia recruits mechanically insensitive cardiac sympathetic afferents in cats. Journal of Neurophysiology, 87, 660–668.
Pavlov, I. P. (1927). Conditioned reflexes. Oxford: Humphrey Milford.
Perrot, S., Krause, D., Crozes, P., Naïm, C., & GRTF-ZAL-1 Study Group. (2006). Efficacy and tolerability of paracetamol/tramadol (325 mg/37.5 mg) combination treatment compared with tramadol (50 mg) monotherapy in patients with subacute low back pain: A multicenter, randomized, double-blind, parallel-group, 10-day treatment study. Clinical Therapeutics, 28, 1592–1606.
Pogatzki, E. M., Niemeier, J. S., & Brennan, T. J. (2002). Persistent secondary hyperalgesia after gastrocnemius incision in the rat. European Journal of Pain, 6, 295–305.
Polomano, R. C., & Bennett, G. J. (2001). Chemotherapy-evoked painful peripheral neuropathy. Pain Medicine, 2, 8–14.
Price, D. D. (1988). Psychological and neural mechanisms of pain. New York: Raven Press.
Price, D. D. (1999). Psychological mechanisms of pain and analgesia: Progress in pain research and management. Seattle, WA: IASP Press.
Price, D. D., & Dubner, R. (1977). Neurons that subserve the sensory-discriminative aspects of pain. Pain, 3, 307–338.
Price, D. D., Harkins, S. W., & Baker, C. (1987). Sensory-affective relationships among different types of clinical and experimental pain. Pain, 28, 297–307.
Price, D. D., Harkins, S. W., Rafii, A., & Price, C. (1986). A simultaneous comparison of fentanyl’s analgesic effects on experimental and clinical pain. Pain, 24, 197–203.
Price, D. D., Hu, J. W., Dubner, R., & Gracely, R. H. (1977). Peripheral suppression of first pain and central summation of second pain evoked by noxious heat pulses. Pain, 3, 57–68.
Pruimboom, L., & vanDam, A. C. (2007). Chronic pain: A non-use disease. Medical Hypotheses, 68, 506–511.
Pyati, S., & Gan, T. J. (2007). Perioperative pain management. CNS Drugs, 21, 185–211.
Rainville, P., Duncan, G. H., Price, D. D., Carrier, B., & Bushnell, M. C. (1997, August 15). Pain affect encoded in human anterior cingulate but not somatosensory cortex. Science, 277, 968–971.
Rakieten, N., Rakieten, L., & Nadkarni, M. V. (1963). Studies on the diabetogenic action of streptozotocin. Cancer Chemotherapy Reports, 29, 91–98.
Randall, L. O., & Selitto, J. J. (1957). A method for measurement of analgesic activity on inflamed tissue. Archives Internationales de Pharmacodynamie et de Thérapie, 111, 409–419.
Ren, K., & Dubner, R. (1999). Inflammatory models of pain and hyperalgesia. ILAR Journal, 40, 111–118.
Rexed, B. (1952). The cytoarchitectonic organization of the spinal cord in the cat. Journal of Comparative Neurology, 96, 415–495.
Reynolds, D. V. (1969, April 25). Surgery in the rat during electrical analgesia induced by focal brain stimulation. Science, 164, 444–445.
Ross, D. M., & Ross, S. A. (1984). Childhood pain: The school-aged child’s viewpoint. Pain, 20, 179–191.
Schaible, H. G., Schmidt, R. F., & Willis, W. D. (1987). Enhancement of the responses of ascending tract cells in the cat spinal cord by acute inflammation of the knee joint. Experimental Brain Research, 66, 489–499.
Scholl, J., & Allen, P. J. (2007). A primary care approach to functional abdominal pain. Pediatric Nursing, 33, 247–254.
Schwei, M. J., Honore, P., Rogers, S. D., Salak-Johnson, J. L., Finke, M. P., Ramnaraine, M. L., et al. (1999). Neurochemical and cellular reorganization of the spinal cord in a murine model of bone cancer pain. Journal of Neuroscience, 19, 10886–10897.
8/17/09 3:05:10 PM
658
Pain: Mechanisms and Measurement
Seltzer, Z., Dubner, R., & Shir, Y. (1990). A novel behavioral model of neuropathic pain disorders produced in rats by partial sciatic nerve injury. Pain, 43, 205–218. Shimoyama, M., Tanaka, K., Hasue, F., & Shimoyama, N. (2002). A mouse model of neuropathic cancer pain. Pain, 99, 167–174. Simone, D. A., Ngeow, J. Y., Putterman, G. J., & LaMotte, R. H. (1987). Hyperalgesia to heat after intradermal injection of capsaicin. Brain Research, 418, 201–203. Sinclair, D. C. (1955). Cutaneous sensation and the doctrine of specific nerve energies. Brain, 78, 584–614. Sriwatanakul, K., Kelvie, W., Lasagna, L., Calimlim, J. F., Weis, O. F., & Mehta, G. (1983). Studies with different types of visual analog scales for measurement of pain. Clinical Pharmacology and Therapeutics, 34, 234–239. Stein, C., Millan, M. J., & Herz, A. (1988). Unilateral inflammation of the hindpaw in rats as a model of prolonged noxious stimulation: Alterations in behavior and nociceptive thresholds. Pharmacology, Biochemistry, and Behavior, 31, 455–461. van Eick, A. J. (1967). A change in the response of the mouse in the “hot plate” analgesia-test, owing to a central action of atropine and related compounds. Acta Physiologica et Pharmacologica Neerlandica, 14, 499–500. Villalon, C. M., Centurion, D., Valdivia, L. F., de Vries, P., & Saxena, P. R. (2003). Migraine: Pathophysiology, pharmacology, treatment and future trends. Current Vascular Pharmacology, 1, 71–84. Vyklicky, L. (1979). Techniques for the study of pain in animals. In J. J. Bonica, J. C. Liebeskind, & D. G. Albe-Fessard (Eds.), Advances in pain research and therapy (Vol. 3, pp. 727–745). New York: Raven Press. Wacnik, P. W., Eikmeier, L. J., Ruggles, T. R., Ramnaraine, M. L., Walcheck, B. K., Beitz, A. J., et al. (2001). Functional interactions between tumor and peripheral nerve: Morphology, algogen identification, and behavioral characterization of a new murine model of cancer pain. Journal of Neuroscience, 21, 9355–9366. Wacnik, P. 
W., Kehl, L. J., Trempe, T. M., Ramnaraine, M. L., Beitz, A. J., & Wilcox, G. L. (2003). Tumor implantation in mouse humerus evokes movement-related hyperalgesia exceeding that evoked by intramuscular carrageenan. Pain, 101, 175–186. Walczak, J. S., Pichette, V., Leblond, F., Desbiens, K., & Beaulieu, P. (2005). Behavioral, pharmacological and molecular characterization of the saphenous nerve partial ligation: A new model of neuropathic pain. Neuroscience, 132, 1093–1102.
c33.indd 658
Walczak, J. S., Pichette, V., Leblond, F., Desbiens, K., & Beaulieu, P. (2006). Characterization of chronic constriction of the saphenous nerve, a model of neuropathic pain in mice showing rapid molecular and electrophysiological changes. Journal of Neuroscience Research, 83, 1310–1322. Walker, K., Medhurst, S. J., Kidd, B. L., Glatt, M., Bowes, M., Patel, S., et al. (2002). Disease modifying and anti-nociceptive effects of the bisphosphonate, zoledronic acid in a model of bone cancer pain. Pain, 100, 219–229. Wall, P. D. (1979). On the relation of injury to pain [The first John J. Bonica Lecture]. Pain, 6, 253–264. Wang, L. X., & Wang, Z. J. (2003). Animal and cellular models of chronic pain. Advanced Drug Delivery Reviews, 55, 949–965. Watson, G. S., Sufka, K. J., & Coderre, T. J. (1997). Optimal scoring strategies and weights for the formalin test in rats. Pain, 70, 53–58. Waxman, S. G. (2007). Nav1.7, its mutations, and the syndromes that they cause. Neurology, 69, 505–507. Weddell, G. (1955). Somesthesis and the chemical senses. Annual Review of Psychology, 6, 119–136. Whiteside, G. T., Adedoyin, A., & Leventhal, L. (2008). Predictive validity of animal pain models? A comparison of the pharmacokineticpharmacodynamic relationship for pain drugs in rats and humans. Neuropharmacology, 54, 767–775. Wilkie, D. J., Savedra, M. C., Holzemier, W. L., Tesler, M. D., & Paul, S. M. (1990). Use of the McGill Pain Questionnaire to measure pain: A metaanalysis. Nursing Research, 39, 36–41. Woolf, C. J., & Thompson, S. W. N. (1991). The induction and maintenance of central sensitization is dependent on N-methyl-D-aspartic acid receptor activation; implications for the treatment of post-injury pain insensitivity states. Pain, 44, 293–299. Wuarin Bierman, L., Zahnd, G. R., Kaufmann, F., Burcklen, L., & Adler, J. (1987). Hyperalgesia in spontaneous and experimental animal models of diabetic neuropathy. Diabetologia, 30, 653–658. Wynn Parry, C. B. (1980). 
Pain in avulsion lesions of the brachial plexus. Pain, 9, 41–53. You, H. J., Dahl Morch, C., Chen, J., & Arendt-Nielsen, L. (2003). Simultaneous recordings of wind up of paired spinal dorsal horn nociceptive neuron and nociceptive flexion reflex in rats. Brain Research, 960, 235–245.
8/17/09 3:05:11 PM
Chapter 34
Hunger
TERRY L. POWLEY
The neural basis of ingestive behavior is a central topic of behavioral neuroscience. Brain mechanisms of feeding have been discussed in clinical neurology for well over a century, and they have been the subject of intensive experimental scrutiny for more than six decades. Much has been learned. Crucial hypothalamic circuits have been delineated. Key brain stem mechanisms have been identified. Important limbic and telencephalic networks have been described. In addition, extensive autonomic control loops connecting the brain with the periphery have been defined, and batteries of gut-brain peptides, hormones, cytokines, and other signals reflecting energy balance have been identified and shown to affect feeding. Adipose tissue and other energy stores have also been shown to be innervated, to be actively regulated, and to generate feedback signals influencing feeding. But much remains to be explained. Perhaps the most sobering gauge of the adequacy of current models of central nervous system (CNS) mechanisms of ingestion is the fact that these accounts, to date, have not produced effective therapies for any of the major eating disorders, including anorexia, bulimia, and obesity, that occur in epidemic proportions. Obesity, because of its prevalence, is the most widely targeted disorder. And, ironically, the most successful interventions for obesity yet devised are radical bariatric surgeries (Thomas, 1995). Such treatments do not draw on what is known about the brain mechanisms of feeding; rather, they revise gastrointestinal (GI) physiology and feedback signals. The irony that these peripheral interventions were not formulated from an understanding of brain feeding circuits is accentuated by the likelihood that, as advances are made in understanding the neurobiology of ingestion, radical bariatric surgery will one day be considered gastroenterology’s frontal lobotomy.
A survey of the research on the neural basis of energy intake underscores the conclusion that different neuroscience methods have generated different (in some cases conflicting, in some cases complementary) views of the neural mechanisms of feeding. These disparate perspectives produced with different methods have yet to be synthesized into a coherent account of hunger and satiety. Until there is better integration, effective treatments for eating disorders may continue to elude both researchers and those suffering from the disorders.
FOOD INTAKE: THE TERMINOLOGY AND CONSTRUCTS OF HUNGER AND SATIETY
The behavioral neuroscience of ingestion employs, for the most part, a terminology of eating drawn from the popular vernacular (e.g., hunger, satiety, appetite, anorexia). With this lay vocabulary comes excess baggage: the terminology carries multiple connotations and incorporates assumptions, a “folk psychology” of feeding. These implications embedded in the language can have ramifications for investigations designed to produce a neuroscience of feeding. Early investigations of the neural bases of feeding focused on ingestion as a motivated behavior and used food intake as a prototype for motivations associated with homeostasis. With this traditional emphasis on explaining intake in terms of motivation, many experiments measured behavior (feeding or feeding cessation) while making assumptions about internal motivational processes (hunger or satiety, respectively) and, in turn, then making inferences about how those imputed motivations might be organized in terms of their neural circuitry. Both the concept of hunger, the motivation to seek and ingest food that occurs when an individual is in negative energy balance, and the concept of satiety, the motivation to refrain from ingesting nutrients in the face of either a net positive energy balance or a substantial energy load in the GI tract, are constructs or intervening variables (cf. Blundell, 1980). For research purposes, since hunger and satiety cannot typically, if ever, be directly observed, investigators rely on operational definitions. In the behavioral neuroscience of ingestion, hunger is usually gauged by the amount a subject will spontaneously consume (or the amount of time the subject is deprived); satiety is used synonymously with the cessation of consumption (or an absence of deprivation). These conventions are used provisionally in this chapter. It should be stressed, though, that factors other than what most investigators would recognize as the experiential elements of hunger and satiety can also affect food intake. If an experimenter holds the energy balance of an animal or human subject constant while varying the subject’s stress, fatigue, distractions, hydrational status, arousal, nociceptive stimulation, social contingencies, or any number of other altered states, the investigator can affect the subject’s intake of nutrients and thus confound operational definitions of hunger and satiety. When a researcher does a feeding experiment, he or she tries, of course, to control the environment and hold extraneous variables constant. When an experimenter manipulates an animal’s brain or physiology, however, it is far more difficult to establish that the manipulation does not indirectly affect feeding by altering one or more of the myriad conditions that can circuitously influence energy balance or its effects on feeding.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
DIFFERENT METHODS, DIFFERENT MODELS OF NEURAL FEEDING MECHANISMS
This chapter emphasizes two aspects of the literature on the neural models of feeding, or hunger and satiety. The first aspect relates to methods; the second to the evolution of the problem. Both are useful in terms of understanding the sources of current ideas and appreciating the limitations of these ideas. It is an axiom of science that experimental answers reflect the techniques that are used in the experiment. Techniques limit what can be measured and hence what results are obtained, but they also mold the interpretations of those results. The point is as true for the behavioral neuroscience of ingestion as for any science. Assessing or rethinking a particular observation and its common interpretation is often aided by a reconsideration of the selectivities and biases inherent in the methods used to generate the observation. It is also axiomatic that the scientific understanding of a problem does not develop against a static background. Techniques evolve as do experimental and conceptual paradigms. Experimental questions reflect these changes. Understanding the context or state of the science at the time a particular model of feeding was developed is another way of appreciating both the limitations and strengths of that model. As neuroscience has grown exponentially, its ideas about neural organization and function have changed, and with the transformations, the models of behavioral neuroscience have also changed. Questions about the neural basis of behavior, for example, are not framed the same way they were decades ago, and newer results (e.g., Holstege, Bandler, & Saper, 1996; Janig, 2006; Swanson, 2000) are not interpreted the same way they would have been. Thinking in terms of CNS “centers” dominated the early work on the behavioral neuroscience of ingestion; “distributed circuits” and “networks” are more consonant with neuroscience’s present view of the CNS (e.g., Berthoud, 2002; Holstege et al., 1996; Sawchenko, 1998; van den Pol, 2003). These trends are illustrated in our survey of the neuroscience of feeding according to a rough chronology reflecting the introduction of different techniques.
HYPOTHALAMUS: FEEDING CENTER OR NETWORK NODE?
The neuroscience of ingestive behavior has long focused on the hypothalamus as the hub of CNS feeding circuitry. Beginning with early clinical observations on Froehlich’s syndrome over a century ago and extending through the past six or more decades of experimental analysis, it is clear that damage to the basomedial hypothalamus can produce the classic “ventromedial hypothalamic syndrome” distinguished by hyperphagia and obesity. The syndrome is also characterized by a sensitivity of the hyperphagia to particular diets (e.g., high fat diets), a “finickiness,” as well as by other symptoms (e.g., disruptions of reproductive functions, changes in temperament; Corbit & Stellar, 1964; Hetherington & Ranson, 1942; King & Cox, 1973) that are usually presumed to be incidental to the ingestive disorder. In contrast, bilateral damage to the lateral hypothalamus often leads to an aphagia and/or anorexia with an associated reduction in body weight (Teitelbaum & Epstein, 1962). This lateral hypothalamic syndrome encompasses other symptoms including an exaggerated dependence of ingestion on the palatability of the diet and other consequences that are generally presumed to be incidental to the dramatic feeding effects (e.g., an adipsia). In a number of ways, these two classic hypothalamic syndromes appear to be mirror images or reciprocals of each other. And soon after the patterns of symptoms were initially characterized, investigators concluded that the intact ventromedial hypothalamus apparently comprises a “satiety center” and the lateral hypothalamus a “hunger center” (e.g., Anand & Brobeck, 1951). These observations and the center analysis were made not long after Sherrington’s (1906) seminal delineation of the reciprocal and antagonist operations of spinal motor neuron pools innervating flexors and extensors had been incorporated into physiological and behavioral thinking, and the center models were cast in terms analogous to cross-linked and opposing Sherringtonian reflexes. These postulated reciprocally acting feeding centers were taken as a prototype for Stellar’s (1954) classical and highly influential description of hypothalamic mechanisms of motivated behaviors, and his schematic summary (Figure 34.1) captures the general outline articulated for feeding as well as other motivations. The signals most often postulated to supply the feedback to the hypothalamus from metabolic or energy balance were blood glucose (the “glucostatic” hypothesis formalized by Mayer, 1953) or lipids (the “lipostatic” hypothesis articulated by Kennedy, 1953). Another alternative was that the thermic effects of metabolism influenced hypothalamic functions not only to regulate body temperature, but also energy balance (the “thermostatic” hypothesis considered by Brobeck, 1957), but much of the evidence initially obtained [based on parabiosis experiments (discussed later in this chapter; see Figure 34.7), infusions of metabolites, glucoprivation experiments, gold thioglucose lesion selectivity, etc.] was used to support one or another variant of the glucostatic or lipostatic hypotheses. Though the initial analyses of the syndromes resulting from basomedial hypothalamic damage and lateral hypothalamic lesions emphasized the feeding alterations and interpreted the associated changes in body weight that occurred (obesity or excessive leanness, respectively) as secondary to changes in appetite or satiety and hunger, subsequent experimentation challenged the conclusions that the distorted motivation to feed was a primary effect of the lesion and the corresponding change in body weight was secondary.
In the case of both the ventromedial hypothalamic and lateral hypothalamic syndromes, experimentally displacing an animal’s body weight to the plateau that would be achieved by the animal after it sustained hypothalamic damage, but doing so prior to the production of the lesions, was found to eliminate the dramatic hyperphagia or aphagia, respectively, that typically occurred after the hypothalamic damage (Hoebel & Teitelbaum, 1966; Powley & Keesey, 1970). The observations suggested that the affected hypothalamic areas might play a role in the long-term regulation of energy balance and body energy stores and that the areas might modulate, directly or indirectly, feeding behavior so as to regulate body weight or adiposity. The idea that hunger and satiety, the motivational substrates of ingestion, were organized within the hypothalamus in a manner similar to Sherringtonian motor neuron pools also frayed as additional experimentation on the feeding syndromes appeared. The ventromedial syndrome was often far from dramatic and robust, depending heavily on animal strain, gender, diet conditions, and other factors (Corbit & Stellar, 1964; Teitelbaum, 1955). Many of the motivational and behavioral changes associated with hypothalamic damage also appeared to be secondary consequences of more proximal autonomic and endocrine adjustments occasioned by the hypothalamic manipulation (Powley, 1977). The lateral hypothalamic syndrome in some cases seemed produced by or reinforced by either sensory neglect (Marshall, Turner, & Teitelbaum, 1971) or dyskinesias or akinesias resulting from disruptions of nigrostriatal circuitry in addition to hypothalamic circuitry (Marshall, Richardson, & Teitelbaum, 1974; Ungerstedt, 1970). Furthermore, the nature of the hypothalamic role in feeding was also questioned by the realization that hypothalamic outflows are only one or two synapses upstream of both pituitary endocrine effectors and autonomic preganglionic motor neurons, whereas hypothalamic efferents are far less directly linked to somatic or skeletal motor neuron pools. As discussed in more detail later in this chapter, subsequent neuroanatomical developments failed to delineate obvious structural counterparts of the “final common path” to behavior posited by Stellar and others (see Figure 34.1). The patterns of connectivity discovered raised the possibility that the hypothalamus might operate to effect autonomic and endocrine control of energy handling, that is, “physiological energy balance,” and that the influences on ingestion, that is, “behavioral energy balance,” might either be coordinated in parallel or might be secondary effects that occurred as energy partitioning changed (Figure 34.2). Any path to behavior was not a “final common path” but rather an output relayed circuitously through polysynaptic networks with myriad opportunities for further modulation or neural editing (Figure 34.3).
Consistent with the idea that much of the hypothalamic role in feeding was secondary to endocrine or autonomic adjustments, pair-feeding experiments indicated that animals with basomedial hypothalamic lesions fattened, even when energy intake was tightly clamped at control levels and even when both the amount of food and the pattern of meal taking were controlled (Walgren & Powley, 1985). Similarly, animals with lateral hypothalamic damage defend their altered body weight levels with physiological energy balance responses when caloric intake is experimentally controlled (Hirvonen & Keesey, 1996; Keesey, Powley, & Kemnitz, 1976). Such assessments also make the point that feeding represents one side of the energy balance equation. Energy homeostasis is the product of both intake and expenditure, and expenditure is affected by behaviors other than feeding (e.g., activity, nursing young) and by various anabolic processes that promote energy conservation and storage (e.g., slowing metabolic rate and growth) and catabolic processes that stimulate energy expenditure (e.g., thermogenesis).
Figure 34.1 Stellar’s classical formulation (1954) of the hypothalamic control of motivated behaviors, including feeding.
[Diagram labels: cortex and thalamus (serial organization of pattern; arousal of pattern); hypothalamus (INHIB, EXCIT); internal chemical and physical factors (hormones, blood temperature, osmotic pressure, drugs); sensory stimuli (unlearned and learned); final common path for behavior; regulation of internal balance; feedback from consummatory behavior.]
Note: Eliot Stellar developed his pivotal theory to account for motivated behaviors generally (hunger, sexual behavior, sleep, thirst, etc.), but he drew on early work concluding that the hypothalamus contained hunger and satiety centers. He also drew on the Sherringtonian physiology of spinal motor neurons, including the “final common path” concept. In the case specifically of feeding, the lateral hypothalamus was posited to issue an excitatory (EXCIT) command to feed by way of a “final common path for behavior.” In this explanation of feeding, the basomedial hypothalamus was postulated to inhibit (INHIB) feeding by acting as a brake on the excitatory outflow from the lateral hypothalamus. In this hypothalamic model, which dominated early experimentation on the neural basis of feeding, internal humoral signals, sensory information from the environment, and arousal and patterning inputs from the forebrain were all envisioned to converge on the hypothalamic centers. These centers in turn were assumed to integrate these inputs and adjust feeding according to need. From “The Physiology of Motivation,” by E. Stellar, 1954, Psychological Review, 61, p. 6. Reprinted with permission.
Two corollaries of the insight about energy balance are particularly relevant to attempts to understand feeding. First, not only can feeding not be fully appreciated without reference to the other factors in the equation, but, in terms of experimental analyses, it is important to recognize that alterations in feeding can in some cases be secondary to changes in other anabolic or catabolic adjustments. Second, the extensive participation of hypothalamic mechanisms in both endocrine and autonomic systems suggests that hypothalamic feeding effects will almost unavoidably be nested within more global neural programs of energy homeostasis broadly defined, that is, energy homeostasis integrating the demands of growth, reproduction, thermogenesis, activity, and so on. This last point is particularly important in terms of the ongoing searches for therapeutic pharmacological interventions to manage feeding disorders because interventions targeted to the hypothalamic control of feeding may ramify to interact with the physiologies of growth, reproduction, arousal, and so on.
Other sets of observations based on more recently introduced methodologies (see discussions that follow) have further reshaped our understanding of the role of the hypothalamus, as well as other brain mechanisms, in controlling food intake. Before surveying some of these contributions, however, it is useful to reconsider the way in which methods in behavioral neuroscience shape our explanations of behavior. Lesions, localized and nonspecific tissue damage, and macro stimulation, involving relatively large electrodes in the case of electrical stimulation or large cannulas in the case of chemical delivery, were the most widely used techniques in behavioral neuroscience during the era of the hypothalamic feeding center analyses. These techniques, however, in no small measure were responsible for generating the models. Just as to a hammer every problem is a nail, so to a lesion (or conventional stimulation procedure) every problem is a neural center. If not a center, then a fiber bundle that can be interrupted by a focal manipulation. The conventional lesion and stimulation techniques employed in early investigations of the hypothalamus were biased to locate concentrations or nexuses of neural tissue. At the same time, though, by their focal nature, they were also biased to overlook or miss diffusely distributed and redundantly organized neural systems. Both because of the dramatic nature of the symptoms that occurred following hypothalamic manipulations and because the hypothalamus was easily accessible and a convenient size for the neuroscience techniques of the time, it was particularly easy to ignore the limitations and complexities of interpreting lesions and to fall into the trap that every lesion “syndrome” uncovered a “center” (Glassman, 1978). And, additionally, because of the prevalence of the faculty psychology of the era, a psychology that posited centers for behavioral faculties or constructs such as hunger, the behavioral neuroscience of feeding was preoccupied with the hypothalamus until roughly three decades ago.
FOOD INTAKE WITHOUT THE HYPOTHALAMUS: CAUDAL BRAIN STEM CIRCUITS The early monopoly of the hypothalamus in the behavioral neuroscience of ingestion was challenged by a key series of experiments that demonstrated the capacity of other CNS sites, specifically regions of the caudal brain stem, to organize feeding behavior without hypothalamic influence. In this series, to explore what controls of ingestion were still functional in long-term decerebrate rats, Grill and Norgren (1978) capitalized on the fact that the behavior of
8/19/09 3:36:39 PM
Food Intake without the Hypothalamus: Caudal Brain Stem Circuits
Catabolic path way s LHA ⴙ Response to satiety signals ⴚ A n ab olic pathways POMC
PFA
PVN
NPY
Fat mass
663
NTS
ARC Leptin
Adiposity signals
Insulin
GI tract Vagus nerve
Satiety signals
Liver Mechanical Chemical Energy metabolism
CCK release
Note: This neuroaxis coordinates endocrine and autonomic responses participating in overall energy balance and thus, through its modulation of anabolic and catabolic conditions and the resulting energy homeostasis, generates many of the signals that determine feeding decisions. These signals include not only the exemplars of leptin, insulin, and CCK noted in the figure, but also an extensive battery of peptides released in both gut and brain. Furthermore, other signals such as chemical and mechanical signals from the GI tract are generated in the periphery and affect central integration by way of centripetal neural projections (e.g., afferents of the vagus nerve) through the visceral neuroaxis. Receptors for
gut-brain peptides are expressed throughout this visceral neuroaxis, thus establishing a distributed network involved in processing and integrating energy balance signals. This extended visceral neuroaxis in turn is reciprocally and extensively interconnected rostrally with the limbic and telencephalic sites (not illustrated) that have been implicated in feeding and energy balance. ARC ⫽ Arcuate nucleus; LHA ⫽ Lateral hypothalamic area; NPY ⫽ Neuropeptide Y; NTS ⫽ Nucleus of the solitary tract; PFA ⫽ Perifornical area; POMC ⫽ Proopiomelanocortin; PVN ⫽ Paraventricular nucleus ⫽ SNS ⫽ Sympathetic nervous system. From “Central Nervous System Control of Food Intake,” by M. W. Schartz, S. C. Woods, D. Porte Jr., R. J. Seeley, and Baskin, D. G., 2000, Nature, 404 pp. 668. Reprinted with permission.
chronically decerebrate animals cannot be organized exclusively by the hypothalamus (or any other diencephalic or telencephalic regions) and must reflect the potential of the caudal brain stem. Though such animals do not seem to evidence long-term regulation of body weight (a function that can potentially still be ascribed to the hypothalamus), they do display the capacity to increase and decrease their ingestion in response to a variety of the short-term signals that control intake in intact or control animals. Grill and Norgren, in their reassessment of what they described as the “hypothalamic hegemony” over the neuroscience of feeding, argued persuasively for a more hierarchical view of feeding circuitry, one more consistent with a Jacksonian hierarchy in which functions are redundantly organized or “re-represented” at multiple levels of the neuroaxis, than with a “center” organization.
Thus, the caudal brain stem possesses many of the capacities to generate feedings responses and affect energy homeostasis that had long been attributed exclusively to the hypothalamus. Though many more recent research efforts still—or again—focus on the ventral diencephalic networks and even though the hypothalamic melanocortin system and the rest of the ventral diencephalic ingestive circuitry is clearly critically implicated in feeding (see later discussion), it has also been established that that the caudal brain stem independently possesses many of the same capacities attributed to hypothalamic circuits. Stated differently, hypothalamic circuitry must not be necessary or uniquely organized for many ingestive functions since a decerebrate animal can perform the functions even when the hypothalamus (and indeed the forebrain) is no longer connected to the brain stem.
Figure 34.2 The endocrine and autonomic core of the visceral neuroaxis.
Figure 34.3 Neural networks responsible for feeding and energy balance. Note: The schematic summarizes numerous projections and inputs that have been identified and characterized with modern neuroscience mapping tools. Dotted lines with arrows are used to designate signals from the environment and/or the internal milieu that converge on the central nervous system. Solid lines with arrows identify centrifugal projections to effector organs or sites. Stippled lines with arrows designate motor pathways. ACB = Nucleus accumbens; AIC = Agranular insular cortex; AMY = Amygdala; AP = Area postrema; ARC = Arcuate nucleus; dmnX = Dorsal motor nucleus of the vagus; HIP = Hippocampus; LH = Lateral hypothalamus; MoN = Motor nuclei for oromotor control; NTS = Nucleus of the solitary tract; OLF = Olfactory bulb; PFC = Prefrontal cortex; PIR = Piriform cortex; PIT = Pituitary gland; PRL = Prelimbic cortex; PVN = Paraventricular nucleus of the hypothalamus; RF = Medullary reticular formation; RVLM = Rostroventrolateral medulla; SNS = Sympathetic nervous system; V1/V4 = Visual processing areas 1 and 4; VII = Facial nerve; V = Trigeminal nerve; IX = Glossopharyngeal nerve. From “Mind versus Metabolism in the Control of Food Intake and Energy Balance,” by H.-R. Berthoud, 2004, Physiology and Behavior, 81, p. 785. Reprinted with permission.
A second type of observation, one based on receptor mapping studies, reinforces the point that the caudal brain stem contains the circuitry necessary to control feeding. As more recent research (see discussion that follows) has focused on key roles of cholecystokinin (CCK), leptin, insulin, ghrelin, and other peripheral hormones in feeding, the initial tendency has naturally been to focus on the hypothalamus, given its well-established involvement in ingestive behavior. The lack of a complete blood-brain barrier in the basomedial hypothalamus, the early demonstrations of receptors for the metabolic hormones in arcuate and other hypothalamic nuclei, and the delineation of the melanocortin circuitry in the hypothalamus all converged on a hypothalamic account of the feeding effects elicited by humoral signals. In spite of these several observations tending to reinforce what has been characterized as a “hypothalamocentric” model of feeding, parallel observations on both the blood-brain barrier of the dorsal vagal complex and the key hormones signaling energy conditions indicated that this brain stem vagal trigone area possessed the same features as the basomedial hypothalamus. Like the basomedial hypothalamus, the nucleus of the solitary tract/area postrema region possesses a leaky blood-brain barrier that gives circulating humoral factors access to the parenchyma of the dorsal vagal complex. In addition, like the basomedial hypothalamus, the dorsal vagal complex densely expresses receptors for CCK (Zarbin, Innis, Wamsley, Snyder, & Kuhar, 1983), leptin (Funahashi, Yada, Suzuki, & Shioda, 2003), ghrelin (Zigman, Jones, Lee, Saper, & Elmquist, 2006), insulin (Hill, Lesniak, Pert, & Roth, 1986; Kar, Chabot, & Quirion, 1993), melanocortin-4 (Kishi et al., 2003), and many of the other gut or gut-brain hormones and neuropeptides that also modulate central feeding systems. A complementary set of experiments, based on yet a different experimental strategy, also indicates that the receptors for gut-brain neuropeptides and metabolic hormones are not only present in the caudal brain stem, but that these receptors do apparently mediate many of the same ingestive responses previously ascribed to the basomedial
hypothalamus. These experiments involve infusing the different neuropeptides and hormones directly into the brain stem or floor of the fourth ventricle and measuring ingestive responses. Indeed, in a series of experiments that have infused candidate signals directly into the fourth ventricle (since CSF flows caudally within the ventricular system, fourth ventricle infusions should produce negligible rostral effects) and compared the efficacies of this route of administration with those of more rostral infusions, Grill and colleagues (Grill & Kaplan, 1990) as well as others (reviewed in Blessing, 1997) have found that fourth ventricular stimulation is as effective or even, in some cases, more effective than third ventricular stimulation at mobilizing appropriate feeding responses. Such experiments find a foundation in the general point that many earlier infusion experiments designed to probe the role of the hypothalamus in organizing feeding in response to different humoral signals (and interpreted in terms of a hypothalamo-centric model) did not necessarily limit the administration of the signals to the hypothalamus. The most commonly used protocol for probing the humoral sensitivity of the hypothalamus has been to cannulate a lateral ventricle or the third ventricle and to infuse the signal, say leptin, into the ventricle. If we adopt a hypothalamo-centric model and assume the target receptors and target tissues are in the arcuate nucleus or infundibular hypothalamus, then such ventricular infusions will apparently have their effects at these diencephalic sites and diffusion or spillage to other sites will be inconsequential. Alternatively (if the hypothalamocentric assumption about limited distribution of receptors is wrong, as it has proven to be), since the flow of CSF is caudal toward the fourth ventricle, any receptors in dorsal vagal complex or other periventricular sites might also be activated. 
Two key perspectives emerge from the observations establishing that the lower brain stem contains the neural circuitry sufficient to control feeding: First, observations on the ingestive behavior evidenced by decerebrate animals also suggest that metabolic signals are detected not exclusively by the hypothalamus, but also, in parallel, by other regions of the visceral neuroaxis including the caudal brain stem and even the primary afferents that innervate the viscera (see Figure 34.2; also see Figure 34.5). The evidence indicates that there is considerable redundancy in the CNS in terms of the metabolic signals that influence feeding: the role of the hypothalamus (and that of the caudal brain stem as well) is more one of a cooperative part of a distributed network than of a monolithic control center. Second, much of the neural apparatus controlling feeding must be organized in the caudal brain stem, thus suggesting
that these particular elements of ingestive behavior are not uniquely and/or exclusively organized in the hypothalamus. By extension, the point suggests, as discussed previously, that the hypothalamic circuits implicated in feeding may be committed more to monitoring humoral and endocrine signals and integrating them into coordinated autonomic or neuroendocrine adjustments associated with energy balance.
LESSONS FROM ANATOMICAL MAPPING AND TRACING TECHNOLOGIES: DISTRIBUTED NEURAL NETWORKS COOPERATE TO CONTROL FEEDING AND ENERGY REGULATION
Anatomical analyses have implications for functional interpretations and define boundary conditions within which behavioral systems presumably operate. It is possible, in broad terms, to reverse engineer the types of operations that are likely to occur from the structural organization that exists, that is, to infer function from form. If, for example, there are no neural connections between two regions, then any interactions or communications must either (1) be non-neural (e.g., hormonal or humoral), or (2) possibly not occur. Strong projections between a site and a target, on the other hand, suggest substantial neural interactions. Or, yet again, sites or circuits that express a particular receptor presumably respond to the corresponding ligand. Often such constraints of structure influence behavioral neuroscience analyses and models without much explicit discussion. For example, the limited anatomical information about hypothalamic projections and interconnections that was available at the middle of the past century made it reasonable at the time to consider hypothalamic sites implicated in feeding behavior as executive centers that operated more or less autonomously. Similarly, with relatively few energy-balance feedback signals recognized at the time, and with those known (e.g., the glucostatic and lipostatic mechanisms) seemingly effective in the hypothalamus, it was reasonable for investigators to conclude that the few humoral signals and an equally limited number of visceral afferent inputs converged on the hypothalamus, which then controlled feeding and the physiology of energy balance in a top-down executive program. By comparison with today’s information, knowledge about brain circuitry was quite limited when the original hypothalamic feeding center models were developed.
Much of what was then known about neural circuits was inferred from staining procedures that delineated Nissl patterns of conspicuous clusters of neurons, nuclei, and the
more conspicuous and coherent fiber tracts, either in relief or in myelin staining. These Nissl-and-myelin maps were supplemented by information from notoriously capricious silver methods for degeneration and limited tracing methods, mainly retrograde degeneration, which is often elusive and rarely works in polysynaptic chains. Thus, the limited anatomical appreciation of the extent of the interconnectivity of the hypothalamus (and, for that matter, the caudal brain stem) with other brain sites made it reasonable, and even necessary, to consider the hypothalamus in terms of overly simplified assumptions of inputs and outputs. These unrealistic expectations were then incorporated into many models and schematics (see, for example, Figure 34.1). The past decades, however, have seen the development of an enormous battery of tools for delineating the details of neural circuits. Hundreds of neural tracing techniques have been developed (e.g., Zaborszky, Wouterlood, & Lanciego, 2006) and used to specify myriad interconnections and projections that were unknown when lesion studies initially concentrated on the hypothalamus. Similarly, an equal or larger number of histochemical and immunocytochemical protocols have been developed and used to recognize pathways expressing particular neurotransmitters, peptides, or receptors (e.g., Bjorklund, Hokfelt, & Owman, 1988). As these mapping tools have been applied, much has been learned about just how extensive the interconnections of the different sites implicated in the control of feeding actually are. The developments include the recognition that the hypothalamus is reciprocally and multiply interconnected with the caudal brain stem sites implicated in feeding (Broberger & Hokfelt, 2001; Sawchenko, 1998).
Furthermore, the mapping experiments indicate that both of these hubs of feeding circuitry are embedded in extensive, often parallel and redundant, circuitry that welds the two complex stations into a massively cross-linked system of re-entrant connections involving most, and probably all, of the brain (e.g., Berthoud, 2002, 2004; Holstege et al., 1996; see also Figure 34.3). Parallel discoveries with the different mapping techniques have also produced a corresponding rethinking of the peripheral nervous system and have indicated both the complexity of autonomic circuitry in the viscera and the extensive afferent and efferent projections by which the CNS is linked to that peripheral circuitry. The enteric nervous system of the gut is now widely recognized as being complexly interconnected and containing so many neurons (equivalent to the total number in the spinal cord) that it is often considered a “second brain” or a “little brain” organized in a distributed fashion throughout the gut. Correspondingly, the extrinsic pathways that connect the CNS to the enteric nervous system or “brain” in the
gut are now similarly realized to be both more numerous and more highly organized than previously presumed. Autonomic efferents project densely to the GI tract (Holst, Kelly, & Powley, 1997), and, within the organs of digestion, visceral afferents supply a profuse network formed of a number of different specialized endings (Powley & Phillips, 2002). The autonomic efferents and afferents interconnecting the brain of the CNS with the little brain of the GI tract are so extensive that, from the functional perspective, it is in all likelihood misleading to compartmentalize and separate the CNS and the peripheral nervous system (PNS). As structural experiments revealed the ubiquitous cross-linkages throughout both the central and peripheral components of the visceral neuroaxis, two other aspects of the changing views of energy-balance signaling have reinforced the conclusion that the visceral neural network is distributed and decentralized with cross-linked control loops controlling energy balance and feeding. One of the developments was the recognition that, in contrast to the assumptions of early models of feeding, the peripheral organs that participate in energy metabolism and energy regulation also have much richer local control networks of not only neural, but also endocrine and paracrine, coordination that effect regulation of the energy economy in the periphery, presumably without hypothalamic executive intervention. The second development was the realization of how extensive a battery of hormones and cytokines the organs of digestion and metabolism release in the course of executing the local regulation of the different phases of metabolism. In the middle of the twentieth century, when the hypothalamic feeding center model was proposed, only two or three GI hormones had been identified and characterized.
In contrast, it is now appreciated that the gut is the largest endocrine organ in the body and that it orchestrates much of energy balance with the releases of over 30 peptide hormones that serve as endocrine signals reflecting anabolic and catabolic events (e.g., Rehfeld, 1998; also see Figures 34.4 and 34.5). The GI tract, for example, elaborates, among others, CCK, leptin, ghrelin, glucagon-like peptide-1 (GLP1), peptide YY (PYY), gastrin, secretin, obestatin, numerous other hormones, and a number of cytokines. These hormones commonly serve as paracrine and neurocrine factors influencing local physiology and thus, indirectly, produce additional feedback and feed forward to neural circuits. Similarly, adipose tissue is now realized to be an active, dynamic system with its own regulatory loops as well as to be innervated (Badman & Flier, 2007; Powell, 2007). In its decentralized orchestration of metabolism, adipose tissue synthesizes and releases leptin,
adiponectin, resistin, estradiol, angiotensin, and cytokines such as interleukin-6 (IL-6) and tumor necrosis factor-alpha (TNF-α). Concomitantly, with the recognition that there is considerable local integration and control and that a substantial number of potential signals is generated in the process of local control of the viscera, came recognition (a) that many peptide hormones produced by the gut were actually gut-brain hormones (Figure 34.4) elaborated by both the viscera and the brain and (b) that the receptors for many of these signals could be found throughout the visceral neuroaxis (Figure 34.5). As the enormously wide distributions of receptors for the multiplicity of endocrine and humoral factors influencing energy balance were recognized (Funahashi et al., 2003; Hill, Lesniak, Pert, & Roth, 1986; Kar et al., 1993; Kishi et al., 2003; Zarbin et al., 1983; Zigman et al., 2006), the simplifying proposition that any one hormone might code for or signal a particular function (e.g., ghrelin for hunger or CCK for satiety) became untenable. The hormones affecting energy balance also operate in many physiological and behavioral systems, not merely ingestion (see Figure 34.6). Leptin, for example, does not simply modulate feeding; it participates in, inter alia, immune function, inflammation, learning processes, cardiovascular function, reproduction, and
bone metabolism as well (Harvey & Ashford, 2003). That signaling associated with a single hormone cuts across so much physiology tends, of course, to confound the search for function-specific pharmacological treatments. With all of the new observations, the idea that the hypothalamus, among neural sites, might have near-exclusive access to the putative humoral signals and that it must therefore act as a top-down controller failed to square with (a) complex regulatory loops discovered in the periphery, (b) the rich flux of potential signals elaborated by these regulatory loops, (c) the fact that receptors for the putative signals are widely distributed throughout the visceral neuroaxis, and (d) the evidence that the signals set and bias the gains of the regulatory circuitry. Indeed, the amount of integration and local regulation that is now known to occur in the GI tract and other viscera makes it possible to assert persuasively that the control of feeding involves as much bottom-up regulation and integration (e.g., Cummings & Overduin, 2007) as it does top-down programming by hypothalamic circuits. What modern neuroanatomical methods have not revealed is also instructive. As mentioned, many early versions (and even recent versions) of the hypothalamic feeding center model implicitly or even explicitly (see Figure 34.1; also, for comparison, see Figure 34.8) considered the
Figure 34.4 Principal peripheral sites of synthesis of gut-brain peptides or gastrointestinal peptides that influence feeding. Note: Signals are depicted in terms of the main gut location of production, though many of the peptides are produced at multiple sites. Importantly, most of these peptides (i.e., CCK, APO AIV, GLP1, oxyntomodulin, PYY, enterostatin, ghrelin, gastrin-releasing peptide [GRP], neuromedin B [NMB], and possibly pancreatic polypeptide [PP]) are synthesized within the brain as well as within the gut. Gut peptides that influence feeding, but that do not appear to be synthesized in CNS include leptin, insulin, glucagon, and amylin. Additional abbreviations are given in Figure 34.6. APO AIV = Apolipoprotein A-IV; CCK = Cholecystokinin; GLP1 = Glucagon-like peptide-1; GRP = Gastrin-releasing peptide; NMB = Neuromedin B; PP = Pancreatic polypeptide; PYY = Peptide YY. From “Gastrointestinal Regulation of Food Intake,” by D. E. Cummings and J. Overduin, 2007, Journal of Clinical Investigation, 117, p. 14. Reprinted with permission.
Sites and peptides labeled in the figure (the esophagus is shown without peptides):
Stomach: ghrelin, leptin, GRP, NMB
Small intestine, duodenum: CCK
Small intestine, jejunum: APO AIV
Small intestine, ileum: GLP1, oxyntomodulin, PYY
Pancreas: amylin, enterostatin, glucagon, insulin, PP
Colon: GLP1, oxyntomodulin, PYY
Selected GI and pancreatic peptides that regulate food intake
Peptide | Main site of synthesis | Receptors mediating feeding effects
CCK | Proximal intestinal I cells | CCK1R
GLP1 | Distal-intestinal L cells | GLP1R
Oxyntomodulin | Distal-intestinal L cells | GLP1R and other
PYY3-36 | Distal-intestinal L cells | Y2R
Enterostatin | Exocrine pancreas | F1-ATPase β subunit
APO AIV | Intestinal epithelial cells | Unknown
PP | Pancreatic F cells | Y4R, Y5R
Amylin | Pancreatic β cells | CTRs, RAMPs
GRP and NMB | Gastric myenteric neurons | GRPR
Gastric leptin | Gastric chief and P cells | Leptin receptor
Ghrelin | Gastric X/A-like cells | Ghrelin receptor
Additional columns in the original table: Sites of action of peripheral peptides germane to feeding (Hypothalamus, Hindbrain, Vagus nerve), marked with X or ?; Effect on food intake.
Figure 34.5 A partial list of selected gastrointestinal and pancreatic gut-brain peptides that influence food intake. Note: The primary site of synthesis of the peptide and its receptor mediating its effects on ingestion are summarized, as is the orexic or anorexic influence that the peptide has on intake. In addition, known nervous system sites of action of the peptides are indicated with “Xs.” Even with an absence of evidence, as of yet, for some peptides at some sites (the blank spaces), it is clear the majority of the peptides bind with their receptors throughout the visceral neuroaxis, in the vagus nerve and brain stem as well as in the hypothalamus. APO AIV = Apolipoprotein A-IV; CCK = Cholecystokinin; CTRs = Calcitonin receptors; GLP-1 = Glucagon-like peptide 1; GRP = Gastrin-releasing peptide; GRPR = GRP receptor; NMB = Neuromedin B; RAMPs = Receptor activity-modifying proteins. From “Gastrointestinal Regulation of Food Intake,” by D. E. Cummings and J. Overduin, 2007, Journal of Clinical Investigation, 117, p. 15. Reprinted with permission.
hypothalamic mechanisms an integrative center that generates executive motor decisions to feed or not to feed. These models, in the spirit of a Sherringtonian final common path or a command neuron output, often hypothesize a key output pathway to brain stem motor centers that would ultimately organize the behaviors. In contrast to the direct efferent access to the pituitary and autonomic preganglionics, such posited pathways from hypothalamic circuitry to brain stem and spinal cord motor neuron pools have not, however, been verified in the extensive tracing and mapping analyses that have now been done. Overall, the new insights to have emerged from modern mapping strategies have necessitated a rethinking of the neural basis of feeding. Such a reframing is still very much ongoing, though. For example, by one construction, each of the multiple sites associated with feeding could be considered a specialized processor or module, that is, multiple specialized processors, each contributing unique analyses or syntheses to the enterprise of energy balance. Such a view has been adopted or discussed in recent reviews (Berthoud, 2002, 2004; Saper, Chou, & Elmquist, 2002; Williams et al., 2001). In contrast, however, another construction of the decentralized network delineated by the newer mapping analyses would be that there is an extensive amount of redundancy and overlap of functional capacity among the distributed sites. Presently available observations on the neural substrate of feeding neither firmly reject either of the constructions nor unequivocally establish the validity of either perspective.
NEUROPHYSIOLOGY OF SINGLE CELLS: ANOTHER DISTRIBUTED-NETWORK VIEW OF BRAIN FEEDING MECHANISMS
Single-unit electrophysiology has provided another important window on how the nervous system integrates the signals of energy balance and organizes feeding responses. The contributions of single-cell recording experiments to the neurobiology of feeding might be viewed as paralleling chronologically the development and application of the neural tracing technologies just discussed. Like anatomical mapping techniques, electrophysiological analyses have evolved considerably in terms of their sensitivity and scope. For purposes of this brief survey, the evolution might be considered to encompass three stages: an initial period in which recording needed to be performed in animals that were extensively restrained or, even more often, anesthetized; a later phase in which unit recording was practical in awake, freely behaving animals; and a final stage in which multiple units of circuits or ensembles could be simultaneously recorded during behavior. In the earliest electrophysiological work, recording experiments were perhaps most commonly designed to corroborate the hypothalamic feeding center model. Recording typically did not take place during behavior, and the animal subject was anesthetized and/or paralyzed. Many of the experiments were focused on the hypothalamus, with little or no sampling of neurons in other regions, and often explored the types of signals (glucose, other metabolites,
Selected appetite-modifying peptides, illustrating their central effects on energy balance and other physiological activities
Peptide | Examples of Other Physiological Actions
NPY | Blood pressure regulation, circadian rhythmicity, and memory processing
MCH | Locomotion and regulation of skin colour
Orexin A | Wakefulness and alertness
Galanin | Reproduction
Opioids | Locomotion and reproductive behavior
α-MSH | Grooming and blood pressure regulation
5-HT | Mood regulation and behavioral responses
GLP-1 | Regulation of blood glucose and gut motility
CCK | Grooming and blood pressure regulation
Additional columns in the original table: Effects on Energy Balance (Feeding, Thermogenesis, Body Weight), shown as arrows.
Figure 34.6 Some of the most common neuropeptides implicated in the control of food intake. Note: These neuropeptides are widely expressed in the hypothalamus and rest of the CNS, as well as in the periphery. Arrows summarize dominant directions of effect exerted by the neuropeptides (decreases, increases, no clear increase or decrease). As the table organization illustrates, many of the neuropeptides influence both feeding (as well as body weight) and energy expenditure or thermogenesis. In addition, notably, as summarized in the right-hand column, all of the neuropeptides also influence other physiological systems, not merely energy homeostasis. From “The Hypothalamus and the Control of Energy Homeostasis: Different Circuits, Different Purposes” by G. Williams et al., 2004, Physiology and Behavior, 81, p. 212. Reprinted with permission.
gustatory and visceral afferent inputs, etc.) that would affect hypothalamic unit activity. In this phase, most of the electrophysiology was designed to confirm and to elucidate the hypothalamic center model, and the resulting observations were generally consistent with the proposition that the hypothalamus received (and therefore, at least in principle could integrate) humoral, gustatory, and visceral inputs (presumably in the service of decisions to feed or not to feed). As single-unit techniques evolved and could be used practically to monitor neuronal traffic in awake, behaving subjects, however, a picture of a more extensive circuitry of feeding emerged. In part perhaps because the power of the newer electrophysiological paradigms permitted recording from awake, behaving animals and in part because the proposition that feeding was virtually a hypothalamic prerogative had already begun to wane, unit recording began to describe a much more distributed network of sites participating in the control of feeding. Individual units throughout much of the limbic system, and particularly in the orbitofrontal cortex, were found (Rolls, 2005) to have firing patterns associated with deprivation, repletion, food choice, palatability, and other conditions classically ascribed to the hypothalamic feeding centers. Similarly, neurons throughout the gustatory and visceral neuroaxes,
including even first-, but particularly second- and all higher-order neurons of the neuroaxes displayed evidence that their respective activities were modulated by signals (e.g., energy infusions) or conditions (e.g., deprivation or hunger, refeeding or satiety) that often had been attributed to hypothalamic processing (Scott, Yan, & Rolls, 1995). The organizational pattern suggested by the results was more consonant overall with the distributed network ideas that were emerging in the neural tracing developments (see previous discussion) occurring in parallel. Even more recently, in what might be considered the third stage of development, the further evolution of techniques for the recording of cells in behaving animals, the introduction of “multi-trode” or multiple electrode recording simultaneously from large numbers or ensembles of individual neurons combined with the increased availability of hardware and software for massive computation has made it feasible to monitor systems or networks of neurons in behaving animals. For example, Araujo and co-workers (2006), based on concomitant recordings from the lateral hypothalamus, amygdala, insular cortex, and orbitofrontal cortex through a complete feeding-satiety-feeding cycle, argued that hunger may be organized in terms of a “distributed population code.”
Too few electrophysiological observations (certainly too few cases examining neurons in multiple sites for long intervals) are yet available to yield a complete perspective on the prospect of ensemble coding of hunger and satiety. Furthermore, the exquisitely high temporal and spatial resolution that unit recording achieves often comes at the price of limited windows of time for observation (gauged by the length of a feeding bout or the duration of an inter-meal interval). Because of the temporal constraints and sampling limitations, only relatively phasic and potentially unrepresentative subsets of neurons from the entire population of the visceral neuroaxis can be readily characterized. Nonetheless, it is the case that as single-neuron electrophysiology has developed, it has come to portray an extensive and distributed network of neurons that are active during the execution of feeding behavior.
ECOLOGICAL OBSERVATIONS: ENVIRONMENTAL CONTINGENCIES HAVE SHAPED NEURAL MECHANISMS OF FEEDING
Applications of newly developed neuroscience technologies have driven most of the evolution in ideas about the neurobiology of feeding. Nonetheless, advances have come from other biological fields as well. Two ecological analyses have been particularly instructive. The first analysis deals with rethinking regulatory mechanisms that determine hunger and satiety; the second addresses the issue of brain networks implicated in feeding behavior. These ecological perspectives developed initially independently of, and in parallel with, the neuroscience of feeding. More recently, though, the ecological and neural perspectives have begun to merge.
Thrifty Gene Hypothesis
The first ecological analysis challenges traditional views about the operations of the regulatory mechanisms controlling feeding. This viewpoint is associated with the thrifty gene hypothesis introduced by Neel (1962). The concept can be appreciated by contrasting it with the ideas about the control of energy balance that were common prior to Neel’s articulating the hypothesis. Early feeding models assumed, in effect, that hunger and satiety mechanisms are organized in a symmetrical manner, much like Sherrington’s agonists and antagonists or flexors and extensors, around a point of energy balance. Specifically, the models often assumed that the control function gains of the neural mechanisms translating energy perturbations into responses that correct deficits and surpluses, respectively, are comparable. Neel and other investigators who explored the thrifty gene concept noted, however, that a symmetry assumption
is not consistent with observations on the biological adaptations of most species (Bjorntorp, 2001; Coleman, 1979; Schwartz et al., 2003). Behavioral and physiological mechanisms that redress energy deficits and their counterpart mechanisms that correct energy surpluses are not controls of comparable efficiency organized in mirror-image symmetry around an equilibrium point at which intake is matched to expenditure. While there presumably have been many imperative selection pressures to avoid energy deficits, analyses suggest that there have not been equally strong pressures to avoid positive energy balances. Even more specifically, Neel noted that evolution in demanding environments would have selected for genes that are “thrifty” and promote efficient storage of calories that may mitigate times of intermittent food availability or famine. The more unpredictable and/or hostile the environment, the more advantage in having such thrifty genes promoting energy storage. Procreation requires not starving to death before you can pass on your genes. Falling into energy deficit can easily be fatal and thus thwart propagating one’s genes, while having plenty—perhaps even excesses—of calories stored may well see members of the species through their reproductive age. Even if energy surpluses are not optimal for avoiding diseases of old age (Neel focused on Type II diabetes and obesity) or for longevity, they make good reproductive sense. Indeed, too effective a set of defenses mobilized against any stored energy excess would be maladaptive, and reserves that were so tightly regulated that they could not fluctuate would be something of an oxymoron. Hence, from a thrifty gene vantage point, animals benefit from stringent hunger mechanisms and defenses against energy deficits, while they also benefit (at least reproductively) from elastic, flexible, and more limited satiety mechanisms and defenses against positive energy balance and storage. 
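The contrast between symmetric and asymmetric control gains can be made concrete with a toy calculation. The sketch below is purely illustrative and is not drawn from Neel or from any model cited in this chapter: the proportional update rule, the function names, and the gain values (0.9 and 0.2) are all assumptions chosen only to show the qualitative point that a strong defense against deficits paired with a weak defense against surpluses lets stores settle above the balance point.

```python
# Illustrative toy model only; the gains and the update rule are assumptions,
# not parameters or equations taken from the chapter or its sources.

def corrective_response(deviation, deficit_gain, surplus_gain):
    """Correction applied to a deviation of energy stores from balance.

    deviation < 0 is a deficit, deviation > 0 is a surplus.
    """
    gain = deficit_gain if deviation < 0 else surplus_gain
    return -gain * deviation


def simulate(perturbations, deficit_gain, surplus_gain):
    """Apply each perturbation, then one corrective step; return the trajectory."""
    stores = 0.0
    trajectory = []
    for p in perturbations:
        stores += p
        stores += corrective_response(stores, deficit_gain, surplus_gain)
        trajectory.append(stores)
    return trajectory


def mean(values):
    return sum(values) / len(values)


# Equal-sized gains and losses of energy, alternating.
perturbations = [1.0, -1.0] * 50

# Symmetric control: deficits and surpluses corrected with the same gain.
symmetric = simulate(perturbations, deficit_gain=0.9, surplus_gain=0.9)
# "Thrifty" control: deficits corrected strongly, surpluses only weakly.
thrifty = simulate(perturbations, deficit_gain=0.9, surplus_gain=0.2)

print(f"mean stores, symmetric gains: {mean(symmetric):.3f}")
print(f"mean stores, thrifty gains:   {mean(thrifty):.3f}")
```

Under these assumed values, the same balanced sequence of perturbations leaves average stores near the balance point with symmetric gains and clearly above it with the thrifty gains, while deficits remain small in both cases.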
These functional asymmetries can elucidate how control mechanisms are structured. They can also partially explain why numerous challenges, from dietary manipulations to even subtle metabolic disorders, easily produce obesity and excessive energy storage but less readily yield anorexias and wasting disorders.
Feeding Complexities, Brain Mass, and Brain Circuits
The second perspective, which emerges from neuroecology, reinforces the inferences about distributed networks that have emerged from anatomical mapping experiments (see earlier section), from electrophysiological mapping experiments (see earlier discussion), and from functional scanning studies (discussed in the next section). The perspective grows out of the widely accepted evidence that the brain sizes of different
species are proportional to the variety, complexity, or unpredictability in space and time of their respective food supplies. Omnivores, active predators, and generalist species living in problematic environments, all have relatively larger brains (even once the contribution of factors such as body mass that also predict a portion of brain mass are accounted for). Herbivores with simple diets, animals adapted to predictable environments with ready sources of nutrients, and monophagous species have relatively smaller brains for their respective body sizes. The brain-environmental-demand correlation can be decomposed into two more particular relationships, each with implications for a neurobiology of feeding behavior. First, there is a correlation between the behavioral specializations species use in feeding and the relative size of the different sensory, motor, and “cognitive” neural systems that hypertrophy in those species. Raptors and other predators that rely on sight, for example, have more extensive visual systems; caching species that must remember storage sites have larger hippocampi; species that devise novel feeding solutions have larger forebrains and cortical association areas. Such observations are numerous, and they have been documented for a variety of different taxa and families, including fish, birds, rodents, and primates (see, e.g., Iwaniuk & Hurd, 2005; Lefebvre, Reader, & Sol, 2004; Nicolakakis & Lefebvre, 2000; Timmermans, Lefebvre, Boire, & Basu, 2000). Conspicuously—and tellingly—in its absence, hypothalamic hypertrophy has not been correlated with demanding environments. 
The second, and more general, correlation between brain mass and feeding complexity is that, even after the increases associated with general factors such as body size and with the specific factors of sensory or motor or cognitive systems have all been partialled out, there is still an underlying residual positive correlation between overall brain mass and the complexity of the ecological niche for feeding. Bigger brains are found in species with complex feeding patterns adapted to challenging environments. This more general relationship also appears to hold for a variety of taxa and families (e.g., Aboitiz, 1996; Bernard & Nurton, 1993; but also see Healy & Rowe, 2007). Though there are a number of interpretations we might apply to these observations, this general correlation is consistent with the implication suggested by the neuroanatomical mapping, electrophysiological, and functional mapping (see discussion that follows) literatures implicating many distributed CNS sites as a neural network active and involved in hunger and satiety. The two neuroecological correlations taken together point to a complex neural network involved in ingestive behavior. They also serve as a reminder to neurobiological analyses of ingestion of just how pervasive feeding behavior is in most species’ lives and just how much of the CNS is preoccupied with ingestive behavior. Though
animal or human subjects and nutritional neuroscientists who study them are, typically, buffered from the exigencies of their hunter-gatherer roots, most species expend most of their energy most of the time making feeding decisions in challenging environments where it is necessary to obtain nutrients while evading predation, conserving calories, balancing multiple homeostatic needs, maintaining physiological vigilance for microbes and toxins that often occur in potential food sources, and juggling resource unpredictabilities. A consideration of these environmental and physiological contingencies (for instructive discussions, see Collier & Johnson, 2004; Garcia, Hankins, & Coil, 1977; Harris & Ross, 1987; Rozin, 1976; Woods, 1991) and the multidimensional demands they place on sensory, motor, memory, and planning operations explains why a multiplicity of brain sites increase in mass in challenging environments.
Neurobiology of Ingestive Mechanisms from the Environmental Perspective
Ecology, in stepping back from a short-sighted focus on neural machinery and in considering the environment to which the circuitry is adapted, offers other lessons as well. Such examinations have forced a more general recognition that the mechanisms of energy homeostasis did not evolve in vacuums. Neural controls carry the stamp of the environment and the diets that species have evolved to exploit. This realization has spotlighted the fact that feeding strategies and the neural control mechanisms that were shaped for the Paleolithic era or before may be inefficient, or even pathological, in negotiating the dietary, nutritional and energetic contingencies of the twenty-first century.
Just as the physiological disturbances that astronauts experience in zero gravity have emphasized that mechanisms selected for an earth-bound gravitational environment do not perform optimally in all environments, so the obesity epidemic and other modern feeding disorders would seem to suggest there are limits of energy-balance mechanisms that were selected for the environmental challenges faced by man’s hunter-gatherer ancestors (Eaton, Eaton, & Konner, 1999; Pollan, 2006). Certainly the environments in which feeding mechanisms evolved did not include ready surpluses of energy-dense, highly-processed sources of nutrients. Finally, ecological observations also add an instructive methodological footnote that applies to many feeding experiments in behavioral neuroscience. Laboratory experiments on feeding typically employ highly simplified, rigidly predictable environments and testing regimens. Such is the nature of good experimental control. Paradoxically, though, in establishing experimental control, laboratory research removes or clamps many of the challenges that the nervous system has evolved to negotiate when the individual
feeds. Thus, experiments often remove many of the sensory, motor, memory, and planning demands the animal more typically faces. In essence, to study, say, the hypothalamus, we remove the challenges for the association cortex, frontal lobe, parietal lobe, cingulate cortex, and cerebellum from the contingencies the animal encounters. Not surprisingly (though seldom actually discussed), such a paradigm accentuates, or even exaggerates, the apparent role of the hypothalamus by deemphasizing the operations of the other stations of the neural network. Simplify the environment enough, engineer most of the normal environmental contingencies out of the task, and employ tests selected to tap hypothalamic processing, and it is likely that the hypothalamus will appear to operate like an executive center controlling feeding.
HYPOTHALAMIC CIRCUITS OF INGESTION AND THEIR SIGNALS CHARACTERIZED WITH MOLECULAR BIOLOGY
As previously outlined, early behavioral neuroscience explained the control of food intake and energy balance in terms of hypothalamic hunger and satiety centers. These ideas, though, seemed to lose much of their explanatory validity in the research of the last three decades of the twentieth century. The unique and executive preeminence originally assigned to the hypothalamus became unsupportable as different technologies such as decerebration techniques, anatomical mapping, and single-unit electrophysiology were applied. Simultaneously, the hypothalamic center idea also seemed less accurate as neuroscientific explanations of a variety of motivated behaviors (e.g., reproductive behavior; see Chapters 5, 6, 24, and 35), including feeding, evolved. As better understanding of motivational mechanisms accumulated (see also Chapter 36), there was a growing recognition that the uncritical invocation of hunger and satiety mechanisms, as frequently had occurred, amounted to invoking a folk or faculty psychology to account for feeding. Behavioral “faculties” were simply mapped, particularly in some of the earlier brain-behavior analyses, onto regions of the brain in a one-to-one phrenological fashion. But unless the operation of a neural region can be specified at the neuronal level, treating the region as a proverbial black box and invoking the idea of a hunger center to explain hunger is tautological, and positing a satiety center to explain satiety is hollow. In the past decade or so, in terms of experimental focus, the research pendulum has swung back to a strongly renewed interest in the hypothalamus (Elmquist, Elias, & Saper, 1999; King, 2006; Marx, 2003; Schwartz, Woods,
Porte, Seeley, & Baskin, 2000; Woods, Schwartz, Baskin, & Seeley, 2000). Though the reversal is by no means complete, the hypothalamus has been implicated as a crucial hub in the control of feeding by new methodologies that provide some of the previously unavailable neuronal-level specification of mechanism, thus offering critical information to escape the circularity just discussed. At least three complementary types of investigations, emerging in large part from the revolution in molecular biology and applications of new genetic tools to the epidemic of obesity and other eating disorders, have again refocused considerable experimental effort on the hypothalamic mechanisms that affect ingestive behavior. Understanding of the circulating signals, of the circuits, and of the neuropeptides expressed in the hypothalamic network has developed synergistically. Earlier studies, and in particular experiments that combined genetically obese mutant ob/ob and db/db mice in parabiotic pairs (see Figure 34.7; see also Coleman, 1973; Coleman & Hummel, 1969), had implicated a lipostatic signal in the adiposity disorders, but the exact mechanism remained obscure. When the ob gene was eventually cloned and determined to encode leptin and the db gene was found to code for the leptin receptor (Tartaglia et al., 1995; Zhang et al., 1994), it rapidly emerged that the hormone leptin, produced primarily by adipocytes, served as a lipostatic signal, that receptors for the signal were expressed in the hypothalamus, that leptin was transported across the weak blood-brain barrier of the basomedial hypothalamus, and that appropriate manipulations of the hormone and receptor could variously produce phenocopies of the ob/ob mice or correct the disturbances caused by the mutation (Friedman & Halaas, 1998).
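The inferential logic of the parabiosis pairings (Figure 34.7) can be written out as a small rule-based sketch. It is an illustration under simplifying assumptions, not a model from the Coleman experiments. Each mouse is reduced to three hypothetical flags (makes the signal, has the receptor, is obese), signal output is coded on an arbitrary 0/1/2 scale, and the shared circulation is assumed to expose each partner to the higher of the two outputs. The names `Mouse`, `signal_output`, and `predict` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mouse:
    name: str
    makes_signal: bool   # can produce the lipostatic hormone (leptin)
    has_receptor: bool   # can detect the hormone
    obese: bool          # adiposity at the time of surgery


def signal_output(mouse: Mouse) -> int:
    """Arbitrary scale: 0 = none, 1 = normal, 2 = high (obese producer)."""
    if not mouse.makes_signal:
        return 0
    return 2 if mouse.obese else 1


def predict(mouse: Mouse, partner: Mouse) -> str:
    """Predict the change in `mouse` after it shares blood with `partner`."""
    if not mouse.has_receptor:
        return "unchanged"  # cannot detect the signal at any level
    before = signal_output(mouse)
    # Assumption: the shared circulation carries the higher of the two outputs.
    after = max(before, signal_output(partner))
    return "reduces intake" if after > before else "unchanged"


CONTROL = Mouse("control", makes_signal=True, has_receptor=True, obese=False)
OB = Mouse("ob/ob", makes_signal=False, has_receptor=True, obese=True)
DB = Mouse("db/db", makes_signal=True, has_receptor=False, obese=True)

for left, right in [(OB, CONTROL), (DB, CONTROL), (OB, DB)]:
    print(f"{left.name} + {right.name}: "
          f"{left.name} {predict(left, right)}; "
          f"{right.name} {predict(right, left)}")
```

Under these assumptions the rules reproduce the three reported outcomes: the ob/ob mouse loses weight when paired with either partner, the control mouse loses weight only when paired with db/db, and the db/db mouse never changes.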
With a key role for leptin established, receptor mapping studies of the hypothalamus indicated that leptin receptors were found through the tuberal hypothalamic nuclei implicated in feeding (including the ventromedial nucleus and lateral hypothalamus as well as the paraventricular, dorsomedial, and arcuate nuclei), with particularly heavy expression found in the arcuate nucleus (Barsh & Schwartz, 2002; Elmquist et al., 1999; Sawchenko, 1998; van den Pol, 2003). As investigations focused on the arcuate nucleus, the outlines of the intrahypothalamic circuitry implicated in energy balance, a network now recognized as a melanocortin system, emerged (see Figure 34.8). The arcuate nucleus contains two distinct types of neurons, both expressing the leptin receptor and both releasing the inhibitory transmitter gamma-aminobutyric acid (GABA), which through the neuropeptide modulators they produce and also release, have opposite effects (through the melanocortin system) on feeding. One class of arcuate neurons expresses both proopiomelanocortin (POMC) and cocaine-amphetamine
regulated transcript (CART). POMC is cleaved into a number of products including melanocyte-stimulating hormones, adrenocorticotropic hormone, and β-endorphin. When these POMC/CART neurons are activated, they affect melanocortin receptors (particularly subtypes 3 and 4, or MC3R and MC4R) expressed throughout the paraventricular nucleus, lateral hypothalamus, dorsal hypothalamic nucleus, and arcuate. Such activation, whether elicited by leptin or other stimulation or achieved by the appropriate pharmacological challenges, reduces food intake and body weight while simultaneously increasing energy expenditure. The second class of arcuate neurons, intermingled with the first, has reciprocal effects. This second class also releases GABA, but produces and secretes the neuropeptides neuropeptide Y (NPY) and agouti-related peptide (AgRP). AgRP is an endogenous antagonist of the MC3R and MC4R, thereby blocking the activation of the melanocortin receptors by α-MSH and other products of the POMC/CART neurons; NPY acts through its own receptors (e.g., Y1r in Figure 34.8). NPY and AgRP release, with its blockade of the MC3/4R system, leads to increased food intake and weight gain as well as a corresponding energy conservation. The orexigenic NPY/AgRP neurons of the arcuate nucleus also project onto local anorexigenic POMC/CART neurons, where their GABA release effectively inhibits the anorexigenic neurons. With their somata in the median eminence and with the tuberal region’s fenestrated capillaries and weak blood-brain barrier, the reciprocally organized or push-pull orexigenic NPY/AgRP neurons and the anorexigenic POMC/CART neurons are viewed as first-order neurons that transduce circulating hormonal and humoral signals reflecting energy balance conditions. The neurons express receptors not only for leptin, but also for ghrelin and insulin and numerous other metabolic hormones. Thus, the neuropeptidergic arcuate neurons are situated to transduce and integrate hormones that reflect the energy regulation at the level of the fat pad (e.g., leptin), stomach (e.g., ghrelin), and pancreas (e.g., insulin).
Figure 34.7 The parabiosis method and a classical illustration of the technique used to demonstrate a lipostatic signal (i.e., leptin).
Note: Parabiosis is the condition in which two animals are conjoined surgically, typically side-to-side. The surgical union is commonly performed so that the animals exchange blood through vascular anastomoses. This experimental analogue of Siamese twins thus provides a preparation in which hormonal and other blood-borne signals pass between the pair of animals (but, of course, their neural pathways remain separate). The demanding surgical and maintenance requirements have limited the use of the technique, but the method can, in some cases, provide particularly definitive tests. In classical experiments performed 25 years before the ob and db genes were cloned and determined to code for leptin and the leptin receptor, respectively, Coleman and his colleagues (Coleman, 1973; Coleman & Hummel, 1969) were able to predict the existence of the adipocyte hormone and its receptor and partially describe the unknown hormone’s physiology through experiments using parabiosis. Three of the surgical pairings that the Coleman laboratory employed are illustrated in the top row of this figure; the experimental outcomes of the different unions are illustrated in the bottom row. When (in the left column) an obese ob/ob mouse (dark gray) was joined parabiotically to a normal control mouse (medium gray), the ob/ob mouse reduced its food intake and dieted down, suggesting that the normal-weight control animal was producing a lipostatic signal (now known to be leptin) that the ob/ob mouse could detect, but not produce. When (in the middle column) a fat db/db mouse (light gray) was joined to a control mouse (medium gray), the control mouse reduced its food intake below normal and dieted down, suggesting that the obese db/db mouse was producing high levels of a lipostatic signal (leptin, as it turns out), which it did not detect but which the normal control mouse interpreted as excess adiposity. When (in the right-hand column) an obese ob/ob mouse (dark gray) was parabiosed with a similarly obese db/db mouse (light gray), the ob/ob mouse reduced its intake and dieted down while the db/db mouse remained fat, suggesting again that the ob/ob mouse was sensitive to, or had the receptor for, a circulating lipostatic factor produced by the db/db mouse. The db/db mouse of the pair remained obese, consistent with the conclusion that this animal lacked the receptor for the lipostatic hormone. From “Genetics of food intake, body weight and obesity,” by R. Bowen (2001). Web publication: http://www.vivo.colostate.edu/hbooks/pathphys/digestion/pregastric/fatgenes.html. Reprinted with permission.
As the Coleman experiment illustrates, the parabiosis method can yield compelling analyses of humoral factors (see also Martin, White, & Hulsey, 1991, for a review of additional demonstrations). Variants of the surgical protocol can also be employed to particular effect. In one such variant, it is possible to cross not only the blood supplies of parabiotic twins, but also segments of their gastrointestinal tracts. Koopmans, McDonald, and DiGirolamo (1997), for example, have done a number of experiments with parabiotic “intestines-crossed” rats in which food ingested and then partially digested by one animal can be diverted from its proximal intestines into the distal intestines of its parabiotic partner (and vice versa).
Figure 34.8 Hypothalamic circuitry controlling food intake and energy balance, as delineated with molecular biological and other modern neuroscience tools.
Note: The model includes two sets of neurons in the arcuate nucleus, Agrp/Npy and Pomc/Cart neurons, that are regulated by circulating anabolic and catabolic hormones. Ghsr = Growth hormone secretagogue receptor; Lepr = Leptin receptor; Mc3r/Mc4r = Melanocortin 3/4 receptor; Y1r = Neuropeptide Y1 receptor. From “Genetic Approaches to Studying Energy Balance: Perception and Integration,” by G. S. Barsh and M. W. Schwartz, 2002, Nature Reviews: Genetics, 3, p. 592. Reprinted with permission.
The first-order arcuate NPY/AgRP and POMC/CART neurons project to second-order melanocortin system neurons distributed within the paraventricular nucleus (PVN) and perifornical and lateral hypothalamic regions of the
hypothalamus. In the case of the lateral hypothalamus, two subpopulations of neurons have been implicated in feeding. One group expresses the neuropeptide hypocretin (or orexin); the other group expresses melanin-concentrating hormone (MCH). Both subpopulations have wide projection fields throughout the brain, suggesting that they modulate arousal, motivation, emotion, and motor systems. In terms of their projections and effects, the two subpopulations seem to operate independently, perhaps coordinating different responses that synergize in complementary responses appropriate to energy balance status. The ventromedial nucleus of the hypothalamus also receives inputs from, and reciprocally projects to, both NPY/AgRP and POMC/CART arcuate neurons. In their strategic location to monitor signals of metabolic status and with their demonstrated effects on feeding and body weight, the arcuate nucleus neurons and the hypothalamic neurons of the melanocortin system have become the subject of impressive and intense research efforts. Neurons of the local hypothalamic circuitry bind not only leptin, insulin, and ghrelin, they also bind many, if not most, hormones that impact catabolic or anabolic processes. A partial list includes glucocorticoids, estrogen,
prolactin, and interleukins. In addition, neurons within the circuitry, through their intracellular utilization, also appear to monitor circulating fuels including glucose and fatty acids. The intracellular signaling cascades initiated by leptin and ghrelin receptor binding in the hypothalamic neurons have been particularly thoroughly delineated, because of the strategic position for transducing the many signals reflecting energy status. It is appropriate to consider these observations on hypothalamic circuitry and the signals involved in the context of other observations, some already surveyed and others discussed next. Like the dramatic symptoms elicited from the hypothalamus by lesions during the early feeding center analyses, the striking feeding effects that can be elicited by genetic and molecular manipulations (gene knockouts, peptide infusions, etc.) at a first look seem to substantiate the early feeding center description of hypothalamic function. Rather than seeing the hypothalamus as merely one important station among others in a multiplicity of complex circuits operating cooperatively to organize energy balance, it is tempting to return to the idea that it is the critical, executive node in the neural feeding apparatus. As mentioned, a number of reports have suggested that there is a “renaissance” of the hypothalamic feeding model (Elmquist et al., 1999; King, 2006). Similarly, many of the schematic summaries of feeding emphasize the hypothalamic circuitry while they reduce the contributions of the rest of the nervous system to a few vectors or arrows (e.g., see schematics in Figures 34.1, 34.3, and 34.8). Although there is a widespread reluctance to speak in terms of “center,” some analyses appear to circumvent the negative connotations of the term not by moving to a noncenter analysis but by stripping terms such as “circuit” or “network” of much of their meaning and using them as circumlocutions, in effect, for centers. 
Such views are at risk of being myopic. As discussed earlier, many of the receptors manipulated and discussed in terms of their hypothalamic expression are actually broadly distributed throughout the nervous system, many of the manipulations (e.g., peptide infusions) directed at the hypothalamus are not limited to the hypothalamus, and many of the feeding or energy balance effects that have been ascribed to the hypothalamus can be elicited from the caudal brain stem when infusions or other manipulations of the peptide systems are confined to the medulla. In this regard, it should be noted that a number of syntheses in which the hypothalamic circuitry is more explicitly integrated into or with the caudal brain stem and other CNS circuitries implicated in feeding have been suggested (e.g., Berthoud, 2002, 2004; Broberger & Hokfelt, 2001; Williams et al., 2001). The risk of short-sightedness is also underscored by the recent realization that the hypothalamic mechanisms are not highly stable hardwired circuits, but
rather neural pathways capable of synaptic plasticity and reorganization in response to different demands (Horvath & Diano, 2004). Another methodological footnote is appropriate here: Gene knockouts and mutations are fundamentally lesions. They are molecular lesions, but they are lesions just as surely as are ablative lesions and space-occupying lesions. Though they are often interpreted—as are other types of lesions—as revealing the function of the disabled or destroyed element, or in this case protein product, the effects that these molecular lesions cause may, of course, result from any of the convoluted, potentially distorted, and compensatory adjustments they occasion (Glassman, 1978). Knocking out or mutating, say the ob gene for leptin, does not illustrate by any simple subtractive logic the normal function of leptin. Instead, it illustrates how the organism is able to develop, adapt, compensate, and adjust in the absence of that gene. There have been repeated reminders that inferences from mutation effects back to normal function can be problematic. Loss-of-function lesions of the gene for leptin lead to dramatic obesity in rodents and humans, and this observation was initially used to support the conclusion that leptin operates as a critical negative feedback signal to inhibit excess positive energy balance, obesity, and overconsumption. Ironically, though, leptin administration to obese rodents or humans typically produces relatively subtle—or no—effects on the positive energy balance conditions (except for, of course, those few rodents and humans with loss-of-function mutations of the ob gene). And conversely, there are many other examples where a given neuropeptide or gene product has one effect when administered to intact individuals, but loss of the gene product produces very different and asymmetrical effects. 
NPY, for example, administered into the hypothalamus produces dramatic feeding, an effect that led to the conclusion that the peptide is a transmitter coding for feeding, but loss-of-function mutations or knockouts of the NPY gene have marginal to no effects on feeding (Qian et al., 2002). Or, for a final example, the orexigenic gut peptide hormone ghrelin is secreted by the stomach in a pattern that tracks hunger (by increasing) and satiety (by decreasing), and administration of the peptide elicits food intake, yet molecular “lesions” that eliminate ghrelin have little to no effect on feeding (Sun, Ahmed, & Smith, 2003). Finally, the risk of a myopia or a tunnel vision can be appreciated by comparison with other neural systems. The successes of modern molecular techniques in unraveling the neural connectivities and neurochemistries of the hypothalamic sites implicated in feeding have been dramatic. Nonetheless, most of the recent scrutiny of the circuitry of feeding focuses on first-order arcuate neurons that detect
circulating hormones and metabolites and second-order hypothalamic neurons (the paraventricular, ventromedial, dorsomedial, etc. neurons) and treats them as an integrative system, or as a simple system—a simulacrum standing in for all the extended neural machinery of feeding. All the rest of the nervous system and other peripheral signals tend to be subsumed into schematic input and output vectors (see, for example, Figure 34.8). Clearly, though, treating a two- or three-neuron chain of afferents as an executive site responsible for all the integration and analysis of the body’s energy economy is unrealistic. Relegating the rest of the neural trafficking to flowchart vectors begs questions of how feeding is orchestrated by the nervous system. It is hard to imagine anyone attempting to explain visual perception or visually guided behavior in terms of only a simplified circuitry consisting of retinal amacrine and bipolar cells, with the rest of the visual system reduced to schematic arrows.
NEUROIMAGING IDENTIFIES DISTRIBUTED CORTICAL AND DIENCEPHALIC SITES PARTICIPATING IN INGESTIVE BEHAVIOR
The recent revolution in noninvasive neuroimaging techniques provides yet another perspective on the neural circuitry of feeding. As subjects (typically human subjects in this case) are presented with food cues or feeding opportunities while they are under either fasting or sated conditions, patterns of neural activity can be assessed. Within-subject comparisons can be made of the different neural signatures that characterize hunger and satiety by comparing the subjects’ fasted trials with their sated trials. Alternatively, between-subject comparisons (obese vs. normal-weight subjects, anorexics vs. controls, etc.) can also be made. With the high temporal resolution permitted by current scanning techniques, the patterns of brain activation can be examined essentially in real time. Deprivation that presumably makes subjects hungry generally increases regional cerebral blood flow and hence regional activity in a variety of limbic, paralimbic, and cortical sites. Details of the pattern of activation vary to some extent between subject populations and between laboratories, but activation is commonly observed in the orbitofrontal and anterior cingulate cortex, as well as in visceral sensory cortex, including most particularly insular and piriform cortex (Tataranni et al., 1999; Wang et al., 2004). Notably, activity in the orbitofrontal cortex tends to correlate particularly well with self-reports and ratings of hunger (Wang et al., 2004). Additionally, during fasting, increased activation is also often seen in the amygdala, hippocampus, and parahippocampal regions, the striatum, and the
cerebellum, among other sites (Arana et al., 2003; Morris & Dolan, 2001; Tataranni et al., 1999). Increases in activation are also occasionally, but not always, seen in the region of the hypothalamus as well (Tataranni et al., 1999). Repletion or satiety or, even more operationally, the consumption of a test meal or nutrient load tends to be associated with increases in activation in prefrontal cortex and the inferior parietal lobe. Neuroimaging experiments also commonly assay the CNS response to external stimuli, such as taste stimuli or food-related stimuli, under different conditions such as the fasted or fed state. Food stimuli routinely activate the insula and neighboring superior temporal cortex as well as the orbitofrontal cortex. Amygdaloid, temporal lobe, and parahippocampal lobe activity appears to be particularly sensitive to stimulus properties, including the attractiveness or salience of food stimuli (Morris & Dolan, 2001; Wang et al., 2004). Scanning methods have also been employed to evaluate how brain activity varies as a function of internal visceral and hormonal signals. Visceral inputs such as gastric stimulation also tend to activate the insula, amygdala, and hippocampus (Wang et al., 2006). As another means of appreciating the neural mechanisms of feeding and disturbances in these mechanisms that must cause and/or reflect common eating disorders, neuroimaging studies have also begun to characterize how patterns of regional blood flow vary between normal-weight and overweight populations or between healthy control populations and those with anorexia or bulimia or other feeding disorders (Kaye et al., 2005; Liu & Gold, 2003).
Whereas most other neuroscience techniques have focused, or been focused, on the hypothalamus and the caudal brain stem, the functional scanning literature implicates limbic regions of the diencephalon and telencephalon in hunger, satiety, and food selection. Many of these same limbic regions are also implicated in a variety of emotional and motivational behaviors (see Chapters 36 and 38). The pattern suggests that the processing associated with feeding shares common circuitry with other motivated behavior: that, in essence, feeding programs run on the same processors as other functional motivational systems, not on a dedicated feeding processor. Something of an apparent paradox in the scanning literature is the fact that the hypothalamus appears to have a much less conspicuous and prominent pattern of activation than might be inferred from the lesion and hormone binding analyses on the structure. Several factors, some methodological but some perhaps substantive, may explain the apparent paradox. Such a lack of a hypothalamic signature may, in some cases, merely reflect the limits of spatial resolution of current scanning protocols and equipment. Furthermore, as Liu, Gao, Liu, and Fox (2000) have
suggested, activation patterns within the distributed network that seems to organize ingestive behavior may have complex phasic temporal and spatial patterns, and only analyses that examine the spatial and temporal parcellations will be able to capture all the local transients. Alternatively, the lack of a conspicuous involvement may also reflect the possibility that the hypothalamus is more heavily involved in longer-term regulatory adjustments of energy balance that eventually modulate feeding, and less involved in the real-time organization of feeding behavior. Finally, scanning experiments may be suggesting, perhaps correctly, that the hypothalamic role in hunger and satiety has historically been overstated relative to the roles of the various limbic, cortical, diencephalic, mesencephalic, and rhombencephalic circuits that recent research has implicated in ingestive behavior.
In summary, sensitive neuroimaging techniques that have recently become available have begun to delineate a picture of the neural substrate of ingestion that is quite different from the earlier picture centered on the hypothalamus. Scanning work describes dynamic and distributed patterns of activation more adequately characterized by Sherrington’s “enchanted loom” weaving “dissolving pattern(s),” always “meaningful,” never “abiding,” than by executive centers in the hypothalamus.
SUMMARY
Behavioral neuroscience has yet to produce, as measured by its unsatisfactory record in predicting effective therapies for eating disorders, a completely coherent account of food intake. The different methods employed in the neurosciences generate distinctly different, sometimes even conflicting, views of the neural basis of food intake. These disparate views serve as reminders that, while our models affect our choice of methods, our methods also shape our models.
Since the era of the early experimental formulations that accounted for feeding and energy balance in terms of hypothalamic centers, neuroscience has discovered a much more extensive and distributed network of sites participating in feeding. This network or visceral neuroaxis includes a multiplicity of CNS sites from the caudal brain stem to the frontal cortex. This visceral network also includes the enteric nervous system or “little brain” in the gut, and the autonomic efferents and visceral afferents linking the enteric network to the CNS, all extensively interconnected by multiple pathways. In addition, a diverse battery of gut-brain and adipocyte hormones, paracrine factors, neurocrine factors, cytokines, and other signals modulate or set the gains on neurons throughout the entire span of the visceral neuroaxis. Experimental protocols that employ limited environments, paradigms with tight experimental control, and techniques
biased toward localization outcomes (e.g., lesions, focal stimulation) have tended to support the inference that the brain is organized with compartmentalized centers specialized for the control of feeding or body weight. And this center paradigm has provided a convenient, accessible, and simplifying model of ingestive behavior. Though experimentally and conceptually tractable, the model appears, in some cases, to beg the questions it purports to answer and, in other cases, to be inaccurate and invalid. In contrast, tests that employ more complex environmental situations or stimuli, experimental paradigms designed to provide subjects with more opportunities or options, and techniques adapted to characterizing distributed networks (e.g., nervous-system-wide mapping of receptors or neuronal connections, functional scanning techniques) have tended to support the conclusion that feeding and body weight regulation are organized by an extensive network of decentralized sites throughout the central—as well as peripheral and enteric—nervous systems, with substantial interconnections and parallel architectures. Additionally, these open-architecture techniques challenge the idea that the brain “wetware” can be compartmentalized with any one area dedicated to, or specialized for, a single type of behavior such as feeding. In the immediate future, a major—perhaps the major— goal for the neuroscience of feeding behavior may be to better reconcile and synthesize the multiple dissimilar views of the neural circuitry of ingestion suggested by the disparate techniques now in use. To achieve this end, it will be particularly useful to weigh the influences—and recognize the biases—of the different methodologies on both the data collected and the interpretations generated.
REFERENCES
Aboitiz, F. (1996). Does bigger mean better? Evolutionary determinants of brain size and structure. Brain, Behavior and Evolution, 47, 225–245.
Anand, B. K., & Brobeck, J. R. (1951). Localization of a “feeding center” in the hypothalamus of the rat. Proceedings of the Society of Experimental Biology and Medicine, 77, 323–324.
Arana, F. S., Parkinson, J. A., Hinton, E., Holland, A. J., Owen, A. M., & Roberts, A. C. (2003). Dissociable contributions of the human amygdala and orbitofrontal cortex to incentive motivation and goal selection. Journal of Neuroscience, 23, 9632–9638.
Araujo, I. E., Gutierrez, R., Oliveira-Maia, A. J., Pereira, A., Jr., Nicolelis, M. A. L., & Simon, S. A. (2006). Neural ensemble coding of satiety states. Neuron, 51, 483–494.
Badman, M. K., & Flier, J. S. (2007). The adipocyte as an active participant in energy balance and metabolism. Gastroenterology, 132, 2103–2115.
Barsh, G. S., & Schwartz, M. W. (2002). Genetic approaches to studying energy balance: Perception and integration. Nature Reviews: Genetics, 3, 589–600.
Bernard, R. T. F., & Nurton, J. (1993). Ecological correlates of relative brain size in some South-African rodents. South African Journal of Zoology, 28, 95–98. Berthoud, H.-R. (2002). Multiple neural systems controlling food intake and body weight. Neuroscience and Biobehavioral Reviews, 26, 393–428. Berthoud, H.-R. (2004). Mind versus metabolism in the control of food intake and energy balance. Physiology and Behavior, 81, 781–793. Bjorklund, A., Hokfelt, T., & Owman, C. (Eds.). (1988). Handbook of chemical neuroanatomy: Vol. 6. The peripheral nervous system. Amsterdam: Elsevier. Bjorntorp, P. (2001). Thrifty genes and human obesity: Are we chasing ghosts? Lancet, 358, 1006–1008. Blessing, W. W. (1997). The lower brainstem and bodily homeostasis. New York: Oxford University Press. Blundell, J. E. (1980). Hunger, appetite and satiety: Constructs in search of identities. In M. Turner (Ed.), Nutrition and lifestyles (pp. 21–42). London: Applied Science. Bowen, R. (2001). Genetics of food intake, body weight and obesity. Retrieved (April 23, 2008) from www.vivo.colostate.edu/hbooks/ pathphys/digestion/pregastric/fatgenes.html. Brobeck, J. R. (1957). Neural control of hunger, appetite, and satiety. Yale Journal of Biology and Medicine, 29, 566–574. Broberger, C., & Hokfelt, T. (2001). Hypothalamic and vagal neuropeptide circuitries regulating food intake. Physiology and Behavior, 74, 669–682. Coleman, D. L. (1973). Effects of parabiosis of obese with diabetes and normal mice. Diabetologia, 9, 294–298. Coleman, D. L. (1979, February 16). Obesity genes: Beneficial effects in heterozygous mice. Science, 203, 663–665. Coleman, D. L., & Hummel, K. P. (1969). Effects of parabiosis of normal with genetically diabetic mice. American Journal of Physiology, 217, 1298–1304. Collier, G., & Johnson, D. F. (2004). The paradox of satiation. Physiology and Behavior, 82, 149–153. Corbit, J. D., & Stellar, E. (1964). 
Palatability, food intake, and obesity in normal and hyperphagic rats. Journal of Comparative and Physiological Psychology, 58, 63–67.
Harris, M., & Ross, E. B. (Eds.). (1987). Food and evolution: Toward a theory of human food habits. Philadelphia: Temple University Press. Harvey, J., & Ashford, M. L. J. (2003). Leptin in the CNS: Much more than a satiety signal. Neuropharmacology, 44, 845–854. Healy, S. D., & Rowe, C. (2007). A critique of comparative studies of brain size. Proceedings of the Royal Society, B, 274, 453–464. Hetherington, A. W., & Ranson, S. W. (1942). The relation of various hypothalamic lesions to adiposity in the rat. Journal of Comparative Neurology, 76, 475–499. Hill, J. M., Lesniak, M. A., Pert, C. B., & Roth, J. (1986). Autoradiographic localization of insulin receptors in rat brain: Prominence in olfactory and limbic areas. Neuroscience, 17, 1127–1138. Hirvonen, M. D., & Keesey, R. E. (1996). Chronically altered body protein levels following lateral hypothalamic lesions in rats. American Journal of Physiology-Regulatory, Integrative and Comparative Physiology, 270, R738–R743. Hoebel, B. G., & Teitelbaum, P. (1966). Weight regulation in normal and hypothalamic hyperphagic rats. Journal of Comparative and Physiological Psychology, 61, 189–193. Holst, M.-C., Kelly, J. B., & Powley, T. L. (1997). Vagal preganglionic projections to the enteric nervous system characterized with PHA-L. Journal of Comparative Neurology, 381, 81–100. Holstege, G., Bandler, R., & Saper, C. B. (Eds.). (1996). The emotional motor system. Progress in Brain Research, 107, (pp. 1–627). Amsterdam: Elsevier Press. Horvath, T. L., & Diano, S. (2004). The floating blueprint of hypothalamic feeding circuits. Nature Reviews: Neuroscience, 5, 662–667. Iwaniuk, A. N., & Hurd, P. L. (2005). The evolution of cerebrotypes in birds. Brain, Behavior and Evolution, 65, 215–230. Janig, W. (2006). The integrative action of the autonomic nervous system: Neurobiology of homeostasis. Cambridge, MA: Cambridge University Press. Kar, S., Chabot, J. G., & Quirion, R. (1993). 
Quantitative autoradiographic localization of [I-125] insulin-like growth factor-I, [I-125] insulin-like growth factor-II, and [I-125] insulin-receptor binding sites in developing and adult rat brain. Journal of Comparative Neurology, 333, 375–397.
Cummings, D. E., & Overduin, J. (2007). Gastrointestinal regulation of food intake. Journal of Clinical Investigation, 117, 13–23.
Kaye, W. H., Frank, G. K., Bailer, U. F., Henry, S. E., Meltzer, C. C., Price, J. C., et al. (2005). Serotonin alterations in anorexia and bulimia nervosa: New insights from imaging studies. Physiology and Behavior, 85, 73–81.
Eaton, S. B., Eaton, S. B. III, & Konner, M. J. (1999). Paleolithic nutrition revisited. In W. R. Trevathan, E. O. Smith, & J. J. McKenna (Eds.), Evolutionary medicine (pp. 313–332). New York: Oxford University Press.
Keesey, R. E., Powley, T. L., & Kemnitz, J. W. (1976). Prolonging lateral hypothalamic anorexia by tube-feeding. Physiology and Behavior, 17, 367–371.
Elmquist, J. K., Elias, C. F., & Saper, C. B. (1999). From lesions to leptin: Hypothalamic control of food intake and body weight. Neuron, 22, 221–232.
Kennedy, G. C. (1953). The role of depot fat in the hypothalamic control of food intake in the rat. Proceedings of the Royal Society, B., 140, 578–592.
Friedman, J. M., & Halaas, J. L. (1998, October 22). Leptin and the regulation of body weight in mammals. Nature, 395, 763–770.
King, B. M. (2006). The rise, fall, and resurrection of the ventromedial hypothalamus in the regulation of feeding behavior and body weight. Physiology and Behavior, 87, 221–244.
Funahashi, H., Yada, T., Suzuki, R., & Shioda, S. (2003). Distribution, function, and properties of leptin receptors in the brain. International Review of Cytology, 224, 1–27.
King, J. M., & Cox, V. C. (1973). The effects of estrogens on food intake and body weight following ventromedial hypothalamic lesions. Physiological Psychology, 1, 261–264.
Garcia, J., Hankins, W. G., & Coil, J. D. (1977). Koalas, men, and other conditioned gastronomes. In N. W. Milgram, L. Krames, & T. M. Alloway (Eds.), Food aversion learning (pp. 196–218). New York: Plenum Press.
Kishi, T., Aschkenasi, C. J., Lee, C. E., Mountjoy, K. G., Saper, C. B., & Elmquist, J. K. (2003). Expression of melanocortin 4 receptor mRNA in the central nervous system of the rat. Journal of Comparative Neurology, 457, 213–235.
Glassman, R. B. (1978). The logic of the lesion experiment and its role in the neural sciences. In S. Finger (Ed.), Recovery from brain damage: Research and theory (pp. 3–31). New York: Plenum Press. Grill, H. J., & Kaplan, J. M. (1990). Caudal brainstem participates in the distributed neural control of feeding. In E. M. Stricker (Ed.), Handbook of behavioral neurobiology: Vol. 10, Neurobiology of Food and Fluid Intake, (pp. 125–149). New York: Plenum Press. Grill, H. J., & Norgren, R. (1978, July 21). Chronically decerebrate rats demonstrate satiation but not bait shyness. Science, 201, 267–269.
Koopmans, H. S., McDonald, T. J., & DiGirolamo, M. (1997). Morphological and metabolic changes associated with large differences in daily food intake in crossed-intestines rats. Physiology and Behavior, 62, 129–136. Lefebvre, L., Reader, S. M., & Sol, D. (2004). Brains, innovations and evolution in birds and primates. Brain, Behavior and Evolution, 63, 233–246. Liu, Y., Gao, J. H., Liu, H. L., & Fox, P. T. (2000, June 29). The temporal responses of the brain after eating revealed by functional MRI. Nature, 405, 1058–1062.
Liu, Y., & Gold, M. S. (2003). Human functional magnetic resonance imaging of eating and satiety in eating disorders and obesity. Psychiatric Annals, 33, 127–132.
Sun, Y., Ahmed, S., & Smith, R. G. (2003). Deletion of ghrelin impairs neither growth nor appetite. Molecular and Cellular Biology, 23, 7973–7981.
Marshall, J. F., Richardson, J. S., & Teitelbaum, P. (1974). Nigrostriatal bundle damage and the lateral hypothalamic syndrome. Journal of Comparative and Physiological Psychology, 87, 808–830.
Swanson, L. W. (2000). Cerebral hemisphere regulation of motivated behavior. Brain Research, 886, 113–164.
Marshall, J. F., Turner, B. H., & Teitelbaum, P. (1971, October 29). Sensory neglect produced by lateral hypothalamic damage. Science, 174, 523–525. Martin, R. J., White, B. D., & Hulsey, M. G. (1991). The regulation of body weight. American Scientist, 79, 528–541. Marx, J. (2003, February 7). Cellular warriors at the battle of the bulge. Science, 299, 846–849. Mayer, J. (1953). Glucostatic mechanisms of regulation of food intake. New England Journal of Medicine, 249, 13–16. Morris, J. S., & Dolan, R. J. (2001). Involvement of human amygdala and orbitofrontal cortex in hunger-enhanced memory for food stimuli. Journal of Neuroscience, 21, 5304–5310. Neel, J. V. (1962). Diabetes mellitus: A “thrifty” genotype rendered detrimental by “progress”? American Journal of Human Genetics, 14, 353–362. Nicolakakis, N., & Lefebvre, L. (2000). Forebrain size and innovation rate in European birds: Feeding, nesting and confounding variables. Behaviour, 137, 1415–1429. Pollan, M. (2006). The omnivore’s dilemma. New York: Penguin Press. Powell, K. (2007, May 31). The two faces of fat. Nature, 447, 525–527.
Tataranni, P. A., Gautier, J.-F., Chen, K., Uecker, A., Bandy, D., Salbe, A. D., et al. (1999). Neuroanatomical correlates of hunger and satiation in humans using positron emission tomography. Proceedings of the National Academy of Sciences, USA, 96, 4569–4574.
Teitelbaum, P. (1955). Sensory control of hypothalamic hyperphagia. Journal of Comparative and Physiological Psychology, 48, 156–163.
Teitelbaum, P., & Epstein, A. N. (1962). The lateral hypothalamic syndrome: Recovery of feeding and drinking after lateral hypothalamic lesions. Psychological Review, 69, 74–90.
Thomas, P. R. (Ed.). (1995). Weighing the options: Criteria for evaluating weight-management programs. Washington, DC: National Academy Press.
Timmermans, S., Lefebvre, L., Boire, D., & Basu, P. (2000). Relative size of the hyperstriatum ventrale is the best predictor of feeding innovation rate in birds. Brain, Behavior and Evolution, 56, 196–203.
Powley, T. L. (1977). The ventromedial hypothalamic syndrome, satiety, and a cephalic phase hypothesis. Psychological Review, 84, 89–126.
Ungerstedt, U. (1970). Is interruption of the nigro-striatal dopamine system producing the “lateral hypothalamus syndrome”? Acta Physiologica Scandinavica, 80, A35–A36.
Powley, T. L., & Keesey, R. E. (1970). Relationship of body weight to the lateral hypothalamic syndrome. Journal of Comparative and Physiological Psychology, 70, 25–36.
van den Pol, A. N. (2003). Weighing the role of hypothalamic feeding centers. Neuron, 40, 1059–1061.
Powley, T. L., & Phillips, R. J. (2002). Musings on the wanderer: What’s new in our understanding of vago-vagal reflexes? Pt. I. Morphology and topography of vagal afferents innervating the GI tract. American Journal of Physiology, 283, G1217–G1225.
Qian, S., Chen, H., Weingarth, D., Trumbauer, M. E., Novi, D. E., Guan, X., et al. (2002). Neither agouti-related protein nor neuropeptide Y is critically required for the regulation of energy homeostasis in mice. Molecular and Cellular Biology, 22, 5027–5035.
Rehfeld, J. F. (1998). The new biology of gastrointestinal hormones. Physiological Reviews, 78, 1087–1108.
Rolls, E. T. (2005). Taste, olfactory, and food texture processing in the brain, and the control of food intake. Physiology and Behavior, 85, 45–56.
Rozin, P. (1976). The selection of foods by rats, humans, and other animals. In J. Rosenblatt, R. A. Hinde, C. Beer, & E. Shaw (Eds.), Advances in the study of behavior, Volume 6 (pp. 21–76). New York: Academic Press.
Walgren, M. C., & Powley, T. L. (1985). Effects of intragastric hyperalimentation on pair-fed rats with ventromedial hypothalamic lesions. American Journal of Physiology, 248, R172–R180.
Wang, G.-J., Volkow, N. D., Telang, F., Jayne, M., Ma, J., Rao, M., et al. (2004). Exposure to appetitive food stimuli markedly activates the human brain. NeuroImage, 21, 1790–1797.
Wang, G.-J., Yang, J., Volkow, N. D., Telang, F., Ma, Y., Zhu, W., et al. (2006). Gastric stimulation in obese subjects activates the hippocampus and other regions involved in brain reward circuitry. Proceedings of the National Academy of Sciences, USA, 103, 15641–15645.
Williams, G., Bing, C., Cai, X. J., Harrold, J. A., King, P. J., & Liu, X. H. (2001). The hypothalamus and the control of energy homeostasis: Different circuits, different purposes. Physiology and Behavior, 74, 683–701.
Williams, G., Cai, X. J., Elliott, J. C., & Harrold, J. A. (2004). Anabolic neuropeptides. Physiology and Behavior, 81, 211–222.
Saper, C. B., Chou, T. C., & Elmquist, J. K. (2002). The need to feed: Homeostatic and hedonic control of eating. Neuron, 36, 199–211.
Woods, S. C. (1991). The eating paradox: How we tolerate food. Psychological Review, 98, 488–505.
Sawchenko, P. E. (1998). Toward a new neurobiology of energy balance, appetite, and obesity: The anatomist weighs in. Journal of Comparative Neurology, 402, 435–441.
Woods, S. C., Schwartz, M. W., Baskin, D. G., & Seeley, R. J. (2000). Food intake and the regulation of body weight. Annual Review of Psychology, 51, 255–277.
Schwartz, M. W., Woods, S. C., Porte, D., Jr., Seeley, R. J., & Baskin, D. G. (2000, April 6). Central nervous system control of food intake. Nature, 404, 661–671.
Zaborszky, L., Wouterlood, F. G., & Lanciego, J. L. (Eds.). (2006). Neuroanatomical tract-tracing 3: Molecules, neurons, and systems. New York: Springer.
Schwartz, M. W., Woods, S. C., Seeley, R. J., Barsh, G. S., Baskin, D. G., & Leibel, R. L. (2003). Is the energy homeostasis system inherently biased toward weight gain? Diabetes, 52, 232–238.
Zarbin, M. A., Innis, R. B., Wamsley, J. K., Snyder, S. H., & Kuhar, M. J. (1983). Autoradiographic localization of cholecystokinin receptors in rodent brain. Journal of Neuroscience, 3, 877–906.
Scott, T. R., Yan, J., & Rolls, E. T. (1995). Brain mechanisms of satiety and taste in macaques. Neurobiology, 3, 281–292.
Zhang, Y., Proenca, R., Maffei, M., Barone, M., Leopold, L., & Friedman, J. M. (1994, December 1). Positional cloning of the mouse obese gene and its human homologue. Nature, 372, 425–432.
Sherrington, C. S. (1906). The integrative action of the nervous system. New York: Charles Scribner’s Sons.
Stellar, E. (1954). The physiology of motivation. Psychological Review, 61, 5–22.
Tartaglia, L. A., Dembski, M., Weng, X., Deng, N., Culpepper, J., Devos, R., et al. (1995). Identification and expression cloning of a leptin receptor, OB-R. Cell, 83, 1263–1271.
Zigman, J. M., Jones, J. E., Lee, C. E., Saper, C. B., & Elmquist, J. K. (2006). Expression of ghrelin receptor mRNA in the rat and mouse brain. Journal of Comparative Neurology, 494, 528–548.
Chapter 35
Thirst
MICHAEL J. MCKINLEY
THE NATURE OF THIRST
A Homeostatic Emotion
Thirst is an impelling urge to drink water or aqueous fluids. As a private, subjective state, thirst is difficult to define. Yet few who read this chapter have not experienced thirst. It can be classified with the urge to inhale air, the desire for sleep, feeling hot or cold, pain, hunger, fatigue, full bladder or bowel, and nausea as essential motivating emotions arising from interoceptive signals that lead subsequently to appropriate behavior to correct bodily deficits or surfeits, thereby restoring normal physiological set points. Thirst is an essential homeostatic emotion (Craig, 2003).
Some have defined thirst as a disposition to drink (Booth, 1991), but such a definition includes the motivation to drink resulting from habit, advice, ritual, or social, cultural, and psychotic drives. Such dispositions to drink almost certainly do not reflect the same motivational emotion that arises from bodily dehydration, and they should be differentiated from the homeostatic emotion of thirst that is the subject of this chapter. An essential feature of thirst is a degree of discomfort or craving, so that as thirst intensifies, it becomes more distressing, tormenting, and ultimately agonizing and overwhelming. Fortunately, most people will not experience thirst of this severity.
It is not surprising that as fluid deficits increase, thirst and the motivation to ingest water increase concomitantly. Water is by far the most abundant molecule in the body, accounting for 60% of its weight but approximately 98% of all the molecules in the body. Adequate intracellular water is essential for maintaining intracellular concentrations of dissolved enzymes, substrates, and ions that allow maximal functioning of cellular activity. Adequate extracellular water is essential for maintaining the integrity of the cardiovascular system and for thermoregulation. Obligatory losses of water from the body occur continually from the respiratory and gastrointestinal tracts, kidney, and skin. To replace these losses, some water is ingested in the form of food and some is obtained from metabolic reactions; however, the drinking of aqueous fluid is the main source for replacing fluid deficits. While much fluid intake is of a social, habitual, and prandial nature, rather than a response to thirst, if these sources of fluid are inadequate, thirst provides the fail-safe system to ensure that fluid deficits are replaced. No matter how effectively the kidney can concentrate urine and reduce the water lost in it, this mechanism does not replace the obligatory fluid deficits mentioned. Analogous to the hunger for air ensuring sufficient oxygen intake and survival in the short term, thirst is a homeostatic emotion essential for survival in the longer term (Cannon, 1919).
Concepts of Thirst
The development of ideas on how thirst is generated has been described in some detail by Fitzsimons (1979). At the beginning of the twentieth century, Mayer (1900) proposed that thirst was a sensation that arose essentially as a result of an increase in the osmotic pressure of the body fluids and tissues. In attempting to define thirst, Cannon (1919) proposed a somatosensory explanation, identifying dryness in the pharynx and mouth as the essential feature of thirst. However, if a dry mouth and throat are produced experimentally by pharmacological means, subjects are not rendered thirsty, which argues against this explanation. Thus, the idea of thirst as a specific somatic sensation in the mouth and throat has lost favor. Following the demonstration that water drinking could be evoked by chemical or electrical stimulation of the hypothalamus (Andersson, 1953;
The author was supported by an Australian NHMRC Fellowship (ID 454369), and grants from the NHMRC (Project Grant ID 350437), Australian Research Council, the Robert J. Jr. & Helen C. Kleberg Foundation, and the G. Harold and Leila Y. Mathers Charitable Foundation. Thanks to Dr. Robin McAllen for comments and Julianna McKinley for artwork.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
Figure 35.1 (Figure C.34 in color section) A guide to the location within the brain of regions implicated in the generation and regulation of thirst. Note: Top left panel: specific regions are projected onto a longitudinal magnetic resonance (MR) image of the midline of the human brain. The other panels show several of these regions in transverse MR images of the same brain at three rostro-caudal levels (1, 2, 3) that are indicated by the white vertical lines in panel A. AC = Anterior cingulate cortex; INS = Insula; LH = Lateral hypothalamic area; LP = Lateral preoptic area; MP = Median preoptic nucleus; NTS = Nucleus of the solitary tract; OF = Orbito-frontal cortex; OV = Organum vasculosum of the lamina terminalis; P = Parabrachial nucleus; R = Midbrain raphé (dorsal and medial nuclei); S = Septal region; SFO = Subfornical organ.
Andersson & McCann, 1955; Greer, 1955) and that hypothalamic lesions caused adipsia (Teitelbaum & Epstein, 1962), a hypothalamic thirst center became a popular theme, although the idea of a brain center mediating thirst had been advocated earlier by Nothnagel (1881) and Mayer (1900). In the past half century, several relevant brain regions, hormonal stimuli, and neural pathways linking sensor and integrative regions have been identified. The prevailing concept today is that thirst is an emotion generated centrally: neural, osmotic, and hormonal signals, transmitted via multiple neural pathways, are integrated within the brain stem, hypothalamus, and preoptic area to produce neural outputs that are distributed to cortical effector sites. A guide to the cerebral locations of a number of brain regions implicated in the generation and regulation of thirst is provided in Figure 35.1.
REGULATORY THIRST
The Dual Depletion Theory
Reduction in the volume of either the intracellular or extracellular fluid generates an urge to ingest water. Satisfaction of this urge by drinking fluid can replenish both of these fluid compartments. When a mammal is deprived of water
and becomes dehydrated, water is usually lost from both intracellular and extracellular compartments. Separate bodily sensors detect changes in intracellular and extracellular volumes, and signals from these sensors subsequently initiate homeostatic responses that include thirst. Therefore, a dual depletion mechanism to drive thirst mechanisms in dehydrated animals has been advanced. The sensor and signaling mechanisms that respond to such “dual depletion” are discussed in the following section.
Osmoregulatory Thirst and Intracellular Dehydration
Under normal physiological conditions, the osmolality (i.e., total concentration of dissolved solutes in a liquid) of plasma is maintained within a narrow range in most mammals (280 to 290 mosmol/kg). Plasma osmolality will increase if ingested solutes such as sodium chloride are absorbed into the bloodstream, or if evaporative dehydration occurs. In both conditions, as the effective osmotic pressure of the circulating and interstitial fluids (i.e., the extracellular fluid within tissues) increases, water will move by osmosis from the interior of cells to the extracellular space, causing depletion of the intracellular fluid.
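The water shift described above can be illustrated with a rough two-compartment calculation. The Python sketch below is only an illustration under stated assumptions, not a physiological model from this chapter: the compartment volumes (28 L intracellular, 14 L extracellular), the 300 mosmol solute load, and the 1 L water loss are hypothetical round numbers, the starting osmolality of 285 mosmol/kg is simply a value inside the 280 to 290 range quoted above, and the class name `BodyWater` and its method `after` are invented for this example. The sketch assumes that the added solute stays in the extracellular fluid, that intracellular solute content is fixed, and that water moves until both compartments have the same osmolality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BodyWater:
    """Two-compartment sketch of body water (illustrative values only)."""
    icf_litres: float   # intracellular fluid volume
    ecf_litres: float   # extracellular fluid volume (plasma + interstitial)
    osmolality: float   # mosmol/kg, assumed equal in both compartments

    def after(self, ecf_solute_mosmol: float = 0.0,
              water_lost_litres: float = 0.0) -> "BodyWater":
        """Return the new equilibrium after adding non-permeating solute to
        the ECF and/or losing pure water from the body."""
        # Solute content of each compartment (1 litre of water taken as 1 kg).
        icf_solute = self.icf_litres * self.osmolality
        ecf_solute = self.ecf_litres * self.osmolality + ecf_solute_mosmol
        total_water = self.icf_litres + self.ecf_litres - water_lost_litres
        # Water crosses cell membranes freely, so one osmolality results.
        new_osmolality = (icf_solute + ecf_solute) / total_water
        # Intracellular solute is assumed fixed, so cell volume follows from it.
        new_icf = icf_solute / new_osmolality
        return BodyWater(new_icf, total_water - new_icf, new_osmolality)


# Hypothetical starting state and two challenges (all numbers are assumptions).
baseline = BodyWater(icf_litres=28.0, ecf_litres=14.0, osmolality=285.0)
salt_load = baseline.after(ecf_solute_mosmol=300.0)
water_loss = baseline.after(water_lost_litres=1.0)

for label, state in [("baseline", baseline),
                     ("salt load", salt_load),
                     ("water loss", water_loss)]:
    print(f"{label:>10}: ICF {state.icf_litres:.2f} L, "
          f"ECF {state.ecf_litres:.2f} L, {state.osmolality:.1f} mosmol/kg")
```

Under these assumptions, both challenges raise osmolality and shrink the intracellular volume, which corresponds to the intracellular depletion described in the text. The salt load expands the extracellular volume at the expense of the cells, whereas pure water loss reduces both compartments.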
There is considerable evidence to show that depletion of the intracellular compartment is associated with thirst, and that specific brain sensors (osmoreceptors) detect intracellular dehydration and initiate compensatory mechanisms such as water drinking, vasopressin secretion, and natriuresis, all of which act to reduce the hypertonicity (McKinley et al., 1987a; Verney, 1947; Wolf, 1950). Investigation of the thirst-stimulating effects of intravenous infusions of hypertonic solutions led to the concept of osmoregulatory thirst. Intravenous infusion of concentrated solutions of sodium chloride, sucrose, fructose, or mannitol increases the tonicity of plasma and stimulates water drinking in species as diverse as dogs, rats, goats, sheep, iguanas, pigs, and pigeons (Fitzsimons, 1979). Because it is not possible to truly know the subjective perception experienced by animals that leads them to drink water, it is assumed that such drinking is the result of thirst being generated in these species. However, in the case of human studies, thirst ratings can be obtained from experimental subjects, and the results are consistent with the animal investigations. Increasing the effective osmotic pressure of plasma by means of intravenous infusion of hypertonic sodium chloride (Figure 35.2), sorbitol, or mannitol stimulates thirst in humans (Wolf, 1950; Zerbe & Robertson, 1983). Not all types of systemically infused hyperosmolar solutions stimulate drinking or strong thirst. When concentrated solutions of urea, glycerol, glucose, or isomannide are administered systemically, thirst ratings or the volume of water drunk are considerably less than those observed with infusion of equivalent amounts of hypertonic
Figure 35.2 Increase of plasma sodium concentration, plasma osmolality, and thirst rating during an intravenous infusion of 0.51 mol/l saline for 25 or 50 minutes in 10 healthy adult human subjects, then subsequently after rinsing the mouth with water, and drinking water to satiate thirst. [Figure panels: plasma sodium (mEq/L), plasma osmolality (mOsm/kg H2O), and thirst score, plotted at control, end of infusion, maximum thirst, wet mouth, and 3, 14, and 60 min after drinking.]
saline, sorbitol, fructose, mannitol, or sucrose solutions (Fitzsimons, 1963; Gilman, 1937; Holmes & Gregerson, 1950; McKinley, Denton, & Weisinger, 1978; Olsson, 1972; Zerbe & Robertson, 1983). The solutes that are effective dipsogenic agents are those that do not readily permeate into cells (sodium chloride, sucrose, fructose, mannitol), so that an osmotic gradient is established across the semi-permeable cell membranes. As a consequence of this gradient, movement of water out of cells by osmosis results in cellular dehydration. Smaller solutes move into cells relatively rapidly (urea and glycerol via specific urea and glycerol channels, glucose via a glucose transporter), so that an osmotic gradient from outside to inside of the cell is not maintained, and significant cellular dehydration does not occur. Therefore, thirst is stimulated acutely by increases in the effective osmotic pressure (the tonicity) of plasma, a condition that causes cellular dehydration. As expressed initially by Gilman (1937), “The logical conclusion to draw from the above results is that cellular dehydration rather than an increase in cellular osmotic pressure per se is the stimulus of true thirst.” Olsson (1972) also showed that infusions of hypertonic sodium chloride or fructose into the carotid artery were far more effective than equivalent infusions into the jugular vein, indicating that the osmosensors were probably located in the brain. Contemporaneous with the investigations of Gilman, Wolf, and others on osmoregulatory thirst was the discovery of osmoreceptors that regulate the secretion of the antidiuretic hormone, more commonly termed vasopressin (Verney, 1947). Cellular dehydration
End of Infusion
Max. Wet Drink Drink Drink Thirst Mouth ⫹3 ⫹14 ⫹60 min min min
Note: The subjects were asked to rate thirst on a scale of 0 to 10 with 0 being no thirst, and 10 the thirstiest they had ever experienced. These data were obtained while subjects underwent positron emission topography scans indicated by the arrowheads. Asterisks at points indicate significant difference from control, and those between points indicate significant difference between those two observations. From “The Correlation of Regional Blood Flow (rCBF) and Plasma Sodium Concentration during Genesis and Satiation of Thirst,” by D. A. Denton et al., 1999, Proceedings of the National Academy of Sciences, USA, 96, p. 2533, Fig. 1. Reprinted with permission.
8/17/09 3:06:06 PM
Regulatory Thirst
was shown to stimulate vasopressin release from the posterior pituitary, and in addition, the location of the relevant osmoreceptors was shown to be within the hypothalamic region of the brain (Jewell & Verney, 1957). Renal water retention under the regulation of vasopressin is the complementary arm of body fluid homeostasis, in that although fluid losses from the body are not restored by vasopressin action on the kidney, further obligatory fluid losses in urine are minimized. The integrated neural and endocrine regulation of body fluids is summarized in Figure 35.3. Location of Osmoreceptors for Thirst Following the discovery of a cerebral location for osmoreceptors that regulate vasopressin secretion (Verney, 1947), interest focused on the hypothalamus as a probable site of osmoreceptors that drive thirst. Consistent with this idea was the observation that injection of a small amount of hypertonic sodium chloride into the hypothalamus in the region of the mammillothalamic tract stimulated copious water drinking in water-replete goats (Andersson, 1953). Relative to the physiological concentration of sodium
chloride in the extracellular fluid of the brain (0.15 M), the high concentrations of sodium chloride that were injected in these experiments may have nonspecifically stimulated neurons or fibers subserving drinking, or the stimulus may have spread to an adjacent brain region. Therefore, Andersson was conservative in the interpretation of these experiments, having reservations that the injection regions were the sites of thirst osmoreceptors.

Subsequently, investigators in Sweden (Andersson, Leksell, & Lishajko, 1975; Rundgren & Fyhrquist, 1978) showed that ablation of a region rostral to these hypothalamic sites, the anterior wall of the third ventricle, resulted in adipsia in goats. They also observed that infusions of hypertonic saline but not hypertonic saccharide solutions into the third cerebral ventricle stimulated drinking. As a result, they proposed that cerebral sodium receptors located in the anterior wall of the third ventricle of the brain mediated osmoregulatory thirst, rather than hypothalamic osmoreceptors. However, injection of hypertonic sucrose into the third ventricle does stimulate drinking (in sheep) if the sucrose is prepared in an artificial cerebrospinal fluid (CSF), although the response was less than that with hypertonic saline injection (McKinley, Blaine, & Denton, 1974). Thus, while central osmoreceptors were indicated by this result, a specific effect of sodium chloride was also evident. Observations that ablation of the medial preoptic region including lamina terminalis or anteroventral wall of the third ventricle (AV3V) region disrupted osmoregulatory thirst in rats (Black, 1976; Johnson & Buggy, 1978) were consistent with the evidence in goats of a role of the anterior wall of the third ventricle in osmoregulatory thirst.

[Figure 35.3 diagram. Labeled elements: reduced arterial pressure, blood volume, and central venous pressure; sympathetic nerves; renal sympathetic nerves; reduced renal perfusion pressure; Na depletion, low kidney Na; renin release (kidney); enzyme action; angiotensinogen (circulating protein from the liver); angiotensin I (inactive circulating decapeptide); ACE (lungs); angiotensin II (circulating octapeptide); aldosterone secretion (adrenal cortex); salt appetite (brain); restored body NaCl; thirst (brain); water intake; osmoreceptor stimulation; vasopressin release (brain/posterior pituitary); reduced urinary water loss; vasoconstriction (arterioles/veins); increased arterial pressure; tubular Na and water reabsorption (kidney); reduced urinary Na losses; atrial natriuretic peptide (released by the heart in response to increased blood volume); glomerular filtration (kidney); increased urine Na and water excretion; blood pressure, volume, and osmolality normalised. Arrows denote stimulatory or inhibitory influences.]
Figure 35.3 Diagrammatic summary of the main hormonal regulators of body fluid homeostasis in mammals. Note: ACE = Angiotensin converting enzyme.

Osmoreceptors in the Lamina Terminalis

To test the hypothesis that brain sodium receptors are responsible for thirst (Andersson, 1978), studies were made in conscious sheep of the effects of infusions of various hypertonic solutions into the carotid artery on water intake and on CSF sodium concentration (McKinley et al., 1978). Infusions of hyperosmolar sodium chloride, sucrose, or urea into the carotid arterial blood supply to the brain all increased CSF sodium concentration, even though plasma sodium concentration increased only with hypertonic saline. The increased CSF sodium levels occurring with all three solutes were attributed to the exclusion of urea as well as sodium chloride and sucrose from the brain interstitium by the blood-brain barrier (Oldendorf, 1971), creating an osmotic gradient that results in osmotic movement of water from brain interstitium to the bloodstream. The increased CSF sodium concentration indicated that the brain had been osmotically dehydrated by all three infusions, but only two of the infusions, hypertonic saline and sucrose, rapidly stimulated drinking. Hyperosmolar urea was much less effective as a dipsogen. While these results did not support a role for sodium sensors in osmoregulatory thirst, there was a paradox: why do only hypertonic sucrose and sodium chloride stimulate drinking, while all three solutions dehydrate the brain? To resolve it, it was proposed that at least some osmoreceptors must be located in regions of the brain devoid of a blood-brain barrier, specifically circumventricular organs such as the organum vasculosum of the lamina terminalis (OVLT) or subfornical organ (see Figure 35.1 for locations). Within these sites, it should be possible for osmoreceptors to distinguish the different solutes (McKinley et al., 1978).

Subsequent studies in sheep (McKinley et al., 1982) and dogs (Thrasher, Keil, & Ramsay, 1982b) showed that ablation of the OVLT reduced drinking responses to systemic infusion of hypertonic saline, consistent with the presence of thirst osmoreceptors in the OVLT. Complete ablation of the OVLT did not totally abolish osmoregulatory drinking. Injection of hypertonic 0.2 M sodium chloride into the subfornical organ stimulates drinking (Camargo, Menani, Saad, & Saad, 1984), and ablation of the subfornical organ in the rat and sheep (but not dog) has been shown to reduce osmotically stimulated drinking also (Hosutt, Rowland, & Stricker, 1978; Lind, Thunhorst, & Johnson, 1984; Thrasher, Simpson, & Ramsay, 1982). Combined ablation of the subfornical organ and OVLT severely reduces osmoregulatory drinking (Figure 35.4) in sheep, but does not abolish it (McKinley, Mathai, Pennington, Rundgren, & Vivas, 1999). Only total or near total ablation of the lamina terminalis (i.e., subfornical organ, median preoptic nucleus, and OVLT) prevents drinking responses to acute intravenous infusion of hypertonic saline in sheep (Figure 35.4), and it was suggested that there may be considerable redundancy within the lamina terminalis for osmoreceptor function (McKinley et al., 1999). It is unlikely that the other sensory circumventricular organ, the area postrema (located in the hindbrain immediately dorsal to the nucleus of the solitary tract; see Figure 35.1), is a site of thirst osmoreceptors, because ablation of this structure had no inhibitory effect on drinking in response to intravenous hypertonic saline in sheep (Slavin & McKinley, 1989, unpublished observations). However, ablation of the area postrema does alter hypothalamic and drinking responses to intragastric or ingested hypertonic saline (Carlson, Collister, & Osborn, 1998; Curtis, Huang, Sved, Verbalis, & Stricker, 1999), suggesting it may relay signals from visceral osmoreceptors that could influence thirst.

[Figure 35.4 bar chart. Vertical axis: Water Intake (ml), 0 to 1,500. Horizontal axis: Site of Lesion.]
Figure 35.4 Effect of ablation of different parts of the lamina terminalis, alone or in combination, on water drinking of sheep in response to intravenous infusion of hypertonic 4 mol/l saline at 1.3 ml/min that increased plasma osmolality by 12 mosmol/kg over 30 minutes.
Note: Open bars show the drinking responses prelesion and the filled bars the postlesion responses. DB = Diagonal band; dMnPO = Dorsal median preoptic nucleus; LT = Total lamina terminalis; MS = Medial septum; OVLT = Organum vasculosum of the lamina terminalis; T = Septal triangularis nucleus; vMnPO = Ventral median preoptic nucleus; SFO = Subfornical organ; VL = Lateral to MnPO. From Figure 35.6, “The Effect of Individual or Combined Ablation of the Nuclear Groups of the Lamina Terminalis on Water Drinking in Sheep,” by M. J. McKinley, M. L. Mathai, G. L. Pennington, M. Rundgren, and L. Vivas, 1999, American Journal of Physiology, 276, p. R678. Reprinted with permission.
In regard to the other anatomical component of the lamina terminalis, the median preoptic nucleus (see Figure 35.1), which is behind the blood-brain barrier, ablation of this nucleus by either electrolytic or neurotoxin (ibotenic acid) techniques severely disrupts acute osmoregulatory water drinking (Cunningham et al., 1991; Mangiapane, Thrasher, Keil, Simpson, & Ganong, 1983; McKinley et al., 1999). While the median preoptic nucleus receives neural input from both the subfornical organ and OVLT (Miselis, 1981; Saper & Levisohn, 1983), and this may be the cause of the reduced osmoregulatory drinking when it is ablated, it seems likely that the median preoptic nucleus is also a sensor region for hypertonicity. This is because ablation of both OVLT and subfornical organ in combination does not completely abolish osmoregulatory drinking, and the residual response is abolished if the median preoptic nucleus is ablated as well (McKinley et al., 1999). Neurons within the median preoptic nucleus are responsive to directly applied hypertonic saline in vitro, although they are inhibited by hypertonicity and excited by hypotonicity (Travis & Johnson, 1993). Studies of c-fos expression show that median preoptic neurons are activated in vivo by systemic infusion of hypertonic saline when the subfornical organ and OVLT have been ablated (Hochstenbach & Ciriello, 1996). There is also reason to believe that some osmoreceptors exist on the brain side of the blood-brain barrier. For example, osmoreceptors may reside in the median preoptic nucleus, and these sensors could explain why intracarotid infusions of urea cause a dipsogenic response of much slower onset and much lesser magnitude than the responses to hypertonic saline, sucrose, or fructose mentioned previously.
Intracarotid infusion of hypertonic glucose is invariably ineffective as a dipsogen (McKinley et al., 1978; Olsson, 1972). Infusions of hypertonic urea increase the sodium chloride concentration, and therefore the effective osmotic pressure, of the brain extracellular fluid (except in the circumventricular organs), whereas infusions of hypertonic glucose do not, because glucose is rapidly transported across the blood-brain barrier as well as into cells (McKinley et al., 1978). An osmoreceptor (or sodium sensor) for thirst in the median preoptic nucleus should respond to hypertonic urea but not to hypertonic glucose infusion, while osmoreceptors in the subfornical organ and OVLT would not respond to either. This arrangement would explain the moderate drinking response to urea but the lack of response to glucose.

Immunohistochemical detection of Fos, a protein that influences gene transcription in the nucleus of the cell and identifies neurons that have been activated in response to a particular stimulus (Sagar, Sharp, & Curran, 1988), has allowed histological identification of the neurons within
the lamina terminalis that respond to systemic hypertonicity. This method shows that intravenously infused solutions of hypertonic saline or sucrose activate neurons throughout the lamina terminalis of the rat (Oldfield, Bicknell, McAllen, Weisinger, & McKinley, 1991); so too does dehydration resulting from water deprivation for 24 to 48 hours (McKinley, Hards, & Oldfield, 1994). However, in the rat, osmotically stimulated neurons are concentrated particularly in the dorsal cap of the OVLT, the periphery of the subfornical organ (Figure 35.5), and throughout the median preoptic nucleus (McKinley et al., 1994). Intense Fos immunoreactivity induced in mouse OVLT in response to an intraperitoneal injection of hypertonic saline can be seen in Figure 35.6. These results are consistent with earlier electrophysiological studies showing that all parts of the lamina terminalis are responsive to systemically infused hypertonic stimuli (Gutman, Ciriello, & Mogenson, 1988; McAllen, Pennington, & McKinley, 1990; Vivas, Chiaraviglio, & Carrer, 1990). Functional brain imaging studies in human subjects infused systemically with hypertonic saline also show the lamina terminalis (Figure 35.7) to be activated by hypertonicity (Egan et al., 2003).

In contrast to the effects on dipsogenic responses to acute hypertonicity, drinking following water deprivation for 48 hours was not inhibited by ablation of considerable parts of the lamina terminalis; it was reduced but not abolished by complete destruction of the lamina terminalis (McKinley et al., 1999). These observations suggest that other brain regions may have a role in osmoregulatory thirst. They also indicate that the mechanism of the acute thirst response to hypertonicity, largely under the control of the lamina terminalis, may be different from the mechanism regulating thirst in response to long-term hypertonicity.
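The solute selectivity argued for in this section can be condensed into two yes/no properties per solute: whether it enters cells readily, and whether it crosses the blood-brain barrier readily. The Python sketch below is only an illustrative summary under that simplification, not a quantitative model. The names `Solute`, `cvo_sensor_responds`, `mnpo_sensor_responds`, and `predicted_drinking` are hypothetical, the boolean flags compress graded differences in permeation rate, and the sensor behind the barrier is the putative median preoptic sensor discussed above rather than an established one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Solute:
    name: str
    enters_cells: bool  # permeates cell membranes relatively rapidly
    crosses_bbb: bool   # crosses the blood-brain barrier relatively rapidly


# Flags follow the qualitative statements in the text (simplified to booleans).
SOLUTES = [
    Solute("sodium chloride", enters_cells=False, crosses_bbb=False),
    Solute("sucrose", enters_cells=False, crosses_bbb=False),
    Solute("urea", enters_cells=True, crosses_bbb=False),
    Solute("glucose", enters_cells=True, crosses_bbb=True),
]


def cvo_sensor_responds(solute: Solute) -> bool:
    """Sensor in the subfornical organ or OVLT (no blood-brain barrier).

    The solute reaches the cell directly, so the cell is dehydrated only
    if the solute stays outside it.
    """
    return not solute.enters_cells


def mnpo_sensor_responds(solute: Solute) -> bool:
    """Putative sensor behind the barrier (median preoptic nucleus).

    Assumed to respond when the barrier excludes the solute, so that water
    leaves the brain and the tonicity of brain extracellular fluid rises.
    """
    return not solute.crosses_bbb


def predicted_drinking(solute: Solute) -> str:
    """Qualitative prediction only; the labels are illustrative."""
    if cvo_sensor_responds(solute):
        return "rapid"
    if mnpo_sensor_responds(solute):
        return "slow and weak"
    return "none"


PREDICTIONS = {s.name: predicted_drinking(s) for s in SOLUTES}

if __name__ == "__main__":
    for name, response in PREDICTIONS.items():
        print(f"{name}: {response}")
```

Under these assumptions the sketch reproduces the pattern reported above: rapid drinking with hypertonic saline and sucrose, a slower and smaller response to urea, and no response to glucose.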
Lateral Preoptic Area

Another region that has been implicated as a site of thirst osmoreceptors is the lateral preoptic region (see panel 2 of Figure 35.1). Injections of hypertonic saline or sucrose, but not urea, into the lateral preoptic area stimulate drinking in rats and rabbits, while ablation of the lateral preoptic area disrupts osmoregulatory drinking in a 4-hour test following intraperitoneal injection of hypertonic saline. Drinking following 24-hour water deprivation was not affected by lateral preoptic lesions (Blass & Epstein, 1971; Peck & Novin, 1971). More detailed mapping studies using the microinjection technique showed that osmotically stimulated drinking sites in the preoptic region were more widespread than initially observed (Peck & Blass, 1975). Osmosensitive neurons have been detected in the lateral preoptic area (Malmo & Mundl, 1975). However, osmoregulatory thirst is delayed rather than blocked by ablation of the lateral preoptic area; as well, appropriate drinking occurs in lateral preoptic-lesioned rats in response to intravenous infusion of hypertonic saline, increased dietary sodium chloride intake, or water deprivation (Coburn & Stricker, 1978). It is proposed that neurons within the lateral preoptic area and the lamina terminalis may interact in the regulation of thirst (Camargo et al., 1984), but the relationship of putative osmoreceptors in the lateral preoptic area with those in the lamina terminalis remains to be elucidated.

Figure 35.5 Activated neurons in coronal sections of the subfornical organ of rats as shown by Fos-immunoreactivity (black dots) induced by intravenous infusion of hypertonic saline (A), intravenous infusion of angiotensin II (B), intravenous infusion of relaxin (C), and control infusion of isotonic 0.15 mol/l saline (D).
Note: Magnification bar = 100 µm. Infusion of hypertonic saline or relaxin activated neurons mainly in the periphery of the subfornical organ, while angiotensin II stimulated neurons throughout this region.

Figure 35.6 Activated neurons, shown by intense Fos-immunoreactivity (black dots), in the OVLT of a mouse brain 2 hours after an intraperitoneal injection of hypertonic saline (0.8 mol/l).
Note: The arrow in the inset indicates the site of the OVLT in the coronal section of the mouse brain. OC = Optic chiasma; OVLT = Organum vasculosum of the lamina terminalis.

Hepatic or Gastrointestinal Osmoreceptors

Osmoreceptors in the hepatic portal vein or liver have been shown to influence vasopressin secretion and urine output (Baertschi & Vallet, 1981; Haberich, 1971). In regard to thirst, infusion of water but not saline into the portal vein inhibits water intake of dehydrated rats (Kobashi & Adachi, 1992). As well, water drinking by rats administered intragastric loads of hypertonic saline occurs before any measurable change in systemic plasma osmolality, suggesting that gastrointestinal or hepato-portal osmoreceptors may regulate thirst (Kraly, Kim, Dunham, & Tribuzio, 1995). Further support for this idea comes from studies in which intragastric hypertonic saline loads caused a potentiation of water drinking in response to dehydration or intravenous hypertonic saline infusion (Stricker, Callahan, Huang, & Sved, 2002). These investigators suggested that signals from central osmoreceptors interact with those from peripheral osmoreceptors, depending on the animal’s hydration state, with the peripheral sensors providing early signaling of ingested fluid before any change in the tonicity of the general circulation occurs. Signals from portal or gastrointestinal osmoreceptors may be relayed via the area postrema (Curtis et al., 1999; Stricker et al., 2002). There is evidence that osmoreceptors within the lamina terminalis influence these signals (Freece, Van Bebber, Zierath, & Fitts, 2005).
Figure 35.7 (Figure C. 35 in color section) Functional magnetic resonance imaging (BOLD signal) sections of a conscious subject experiencing maximum thirst resulting from intravenous infusion of hypertonic saline.
Note: Activations (light gray regions): ACC = Anterior cingulate cortex; Cb = Cerebellum; Ins = Insula; LT = Lamina terminalis; MC = Mid cingulate region, posterior part; OrG = Orbital gyrus; STG = Superior temporal gyrus. From Figure 35.3, “Neural Correlates of the Emergence of Consciousness of Thirst,” by G. Egan et al., 2003, Proceedings of the National Academy of Sciences, USA, 100, p. 15245. Reprinted with permission.
TRPV Channels and Osmosensory Transduction

Identification of the molecular characteristics of thirst osmoreceptors has been advanced recently with the discovery that ion channels of the transient receptor potential vanilloid (TRPV) class may play a role in osmosensory transduction. Both TRPV1 and TRPV4 channels have been implicated in osmoregulatory transduction because they are located within osmosensory neurons of the lamina terminalis, and deletion of genes encoding these TRPV channels moderately reduces osmoregulatory drinking in mice (Ciura & Bourque, 2006; Liedtke, 2007).

HYPOVOLEMIC THIRST AND EXTRACELLULAR FLUID DEPLETION

Thirst and drinking of fluids can also result from depletion of the extracellular fluid without any intracellular
dehydration. Loss of fluid from the extracellular compartment (hypovolemia) may occur naturally under physiological or pathophysiological conditions that include hemorrhage, vomiting, diarrhea, burns, or sweating. If there is sufficient loss of extracellular fluid, thirst results that drives fluid intake (Fitzsimons, 1961). In the laboratory, several strategies have been employed in experimental animals to deplete the extracellular fluid without any apparent depletion of the intracellular compartment. These include hemorrhage, subcutaneous injection of colloid (e.g., polyethylene glycol) causing sequestration of extracellular fluid under the skin, diuretic treatment, peritoneal dialysis, hemofiltration, or loss of saliva from a parotid fistula (Abraham et al., 1976; Anderson & Houpt, 1990; Fitzsimons, 1961; Rabe, 1975; Stricker, 1966; Zimmerman, Blaine, & Stricker, 1981). These procedures are associated with increased water intake, although hemorrhage is an inconsistent dipsogenic stimulus (Fitzsimons, 1979; Wolf, 1958).

The extracellular compartment comprises fluid within the circulation (plasma) and the interstices of tissues (interstitial fluid). When the extracellular compartment is depleted, the volume of fluid within the circulation falls, particularly on the venous side of the heart, resulting in reduced central venous pressure and reduced venous return to the heart. Experimental procedures (e.g., constriction of the vena cava) that mimic the changes in pressures that occur within the great veins returning blood to the heart during hypovolemic conditions also stimulate thirst and water intake (Thrasher, Keenan, & Ramsay, 1999).

Sodium Appetite

If there is depletion of the extracellular fluid without a concomitant increase of extracellular concentration of sodium chloride, a deficit in whole body sodium chloride as well as water occurs.
Therefore, while ingestion of water may restore the volume of fluid lost, unless sodium chloride is also ingested, restoration of both volume and ionic concentrations of extracellular fluid will not result. Thus, it is not surprising that extracellular hypovolemia is also associated with the development of an appetite that is specific for sodium salts. Sodium appetite is much slower in onset than thirst, developing over several hours following the loss of extracellular fluid (Fitzsimons, 1979). Endocrine signaling mechanisms play a major role in the generation of sodium appetite during conditions of hypovolemia. Blood concentrations of both angiotensin II and aldosterone increase and may act synergistically within the brain to generate an appetite for salt (Figure 35.3). It is beyond the scope of this chapter to review the physiological mechanisms subserving sodium appetite, and more details can be found in a number of excellent reviews and monographs (Denton, 1982; Fitzsimons, 1979; Johnson & Thunhorst, 1997).
Additivity of Hyperosmotic and Hypovolemic Dipsogenic Signals

As stated earlier, in a condition of dehydration, water is withdrawn from both intracellular and extracellular compartments of the body. Signals from both osmoreceptors and volume sensors contribute to the resultant thirst. While osmotic signals account for the majority of the dipsogenic response to dehydration, volume signals make a significant contribution (Ramsay, Rolls, & Wood, 1977). Further, studies in which an osmotic load was delivered simultaneously with a hypovolemic stimulus found a water intake equal to the sum of the intakes attributable to each stimulus alone, indicating that hypovolemic and osmotic stimuli are probably additive in mediating dehydrational thirst (Blass & Fitzsimons, 1970; Fitzsimons, 1979).

Sensors and Afferent Signaling of Hypovolemic Thirst

With only a few exceptions, studies of the afferent signaling mechanisms mediating water drinking in response to hypovolemia have been confined to rats and dogs. The experimental model of hypovolemia that has been studied in the dog is constriction or obstruction of the inferior vena cava (caval ligation) to reduce venous return to the heart, lowering central venous pressure and arterial pressure. In the rat, sequestration of extracellular fluid under the skin has been utilized as a means of producing systemic hypovolemia.

Neural Signaling

Hypovolemic thirst results from both hormonal and neural signals being transmitted to the brain. These signals may act either singly or in combination to generate thirst. Stretch receptors in the heart and blood vessels provide the neural signals that relay information to the CNS on the degree of filling and pressures within the circulation.
These afferent signals, which are stimulated by increases in pressure and reduced when pressure decreases, are carried largely by the vagus and glossopharyngeal nerves and terminate in the nucleus of the solitary tract in the medulla oblongata (Dampney, 1994; see Figure 35.1, panel A). From there, polysynaptic neural pathways that involve both excitatory and inhibitory synapses relay signals to other sites that control a number of functions, including the baroreceptor reflex, vasopressin secretion, and thirst. It is possible that neural relays via the A1 cell group in the caudal ventrolateral medulla, and/or other hindbrain and midbrain sites, send signals to neurons in the lamina terminalis to influence drinking (Johnson, Cunningham, & Thunhorst, 1996). However, the participation of these pathways in hypovolemic thirst remains to be proven.
Angiotensin-Mediated Hormonal Signaling

When blood volume, arterial pressure and central venous pressure fall, circulating concentrations of angiotensin II increase (see Figure 35.3). The initial step in the generation of angiotensin II, the effector peptide of the renin-angiotensin system, is the release of the proteolytic enzyme renin from the kidney. The signals that drive renin secretion in hypovolemic states are increased renal sympathetic nerve activity, reduced renal perfusion pressure, and altered sodium load at the macula densa of the distal tubule of the kidney. Once released into the bloodstream, renin catalyzes the formation of the decapeptide molecule angiotensin I in the circulation by cleaving it from a large (40,000 Dalton) plasma protein, angiotensinogen, that is synthesized in the liver. Angiotensin converting enzyme (ACE), mainly in the lung but also in other tissues, then splits off two more amino acids from the carboxyl terminus of angiotensin I, forming the biologically active octapeptide angiotensin II (see Montani & Van Vliet, 2004, for a review).

In regard to the humoral signals for hypovolemic thirst, evidence in both rat and dog favors a significant role for the renin-angiotensin system in thirst associated with depletion of the extracellular fluid. There are compelling data favoring a role for circulating angiotensin II, at least in combination with neural signals, in the genesis of hypovolemic thirst. First, removal by bilateral nephrectomy of the source of renin, the enzyme needed for generation of angiotensin I, reduces drinking in response to caval ligation in rats (Fitzsimons, 1969). Second, peripheral administrations of inhibitors of angiotensin converting enzyme (ACE inhibitors) or angiotensin receptor antagonists reduce drinking responses to caval ligation in dog or rat (Fitzsimons & Elfont, 1982; Fitzsimons & Moore-Gillon, 1980; Thrasher, Keil, & Ramsay, 1982).
Third, the blood levels of angiotensin II that are reached following subcutaneous injection of polyethylene glycol (a large colloidal molecule) or caval ligation are above the threshold plasma concentrations of angiotensin II for drinking achieved by intravenous infusion of angiotensin II (Johnson & Thunhorst, 1997). Fourth, ablation of the subfornical organ, the site of angiotensin receptors mediating the dipsogenic action of this octapeptide, inhibits water intake in response to hypovolemia (Lind et al., 1984; Stratford & Wirtshafter, 2000).

Intrathoracic Receptors Signaling Hypovolemia

It is also clear that other signaling mechanisms besides the renin-angiotensin system play an important role in hypovolemic thirst. Data from three studies of the dipsogenic effect of caval ligation in the dog are particularly relevant to this point. First, significant drinking in response to caval ligation
is still evident in dogs in which angiotensin receptors have been totally blocked pharmacologically (Thrasher, Simpson, et al., 1982). Second, the drinking response to caval ligation was reduced by approximately half when the heart was denervated so that putative low pressure atrial stretch receptors no longer send signals to the brain. Denervation of the high pressure baroreceptors in the aortic arch and carotid sinus also substantially reduced such drinking, while combined cardiac and arterial baroreceptor denervation totally abolished the dipsogenic effect of caval ligation, despite high circulating angiotensin II levels being maintained (Quillen, Keil, & Reid, 1990). These data indicate that signals from baroreceptors in the heart, carotid sinus, and aortic arch have an important role in mediating hypovolemic thirst in the dog. More recently, Thrasher et al. (1999) performed a series of intrathoracic vascular ligations of the inferior vena cava (IVC), the pulmonary artery, or the ascending aorta in conscious dogs so that blood pressure at the arterial baroreceptors fell by 25 mm Hg in each case. As expected, constriction of the IVC reduced left atrial, right atrial, and mean arterial pressure and stimulated drinking. So too did constriction of the pulmonary artery, which reduced left atrial and arterial pressure but increased right atrial pressure. However, constriction of the ascending aorta, which also reduced arterial pressure and right atrial pressure but increased left atrial pressure, did not stimulate thirst in the dogs. These results suggest either that loading left atrial receptors can override signals from other baroreceptors, or that unloading of low-pressure stretch receptors in the left atrium of the heart in combination with unloading of high-pressure arterial baroreceptors is the cause of hypovolemic thirst in the dog (Thrasher et al., 1999).
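The reasoning behind the interpretation of the three ligation experiments of Thrasher et al. (1999) can be checked mechanically. The Python sketch below encodes only the directions of the pressure changes and the drinking outcomes reported above, then tests four candidate rules against them. The rule names and the dictionary layout are illustrative assumptions, and the sketch says nothing about magnitudes or about mechanisms that these three experiments did not probe.

```python
# -1 = pressure reduced (receptors unloaded), +1 = pressure increased (loaded).
# Directions and drinking outcomes are taken from the description in the text.
LIGATIONS = {
    "inferior vena cava": {
        "left_atrial": -1, "right_atrial": -1, "arterial": -1, "drank": True,
    },
    "pulmonary artery": {
        "left_atrial": -1, "right_atrial": +1, "arterial": -1, "drank": True,
    },
    "ascending aorta": {
        "left_atrial": +1, "right_atrial": -1, "arterial": -1, "drank": False,
    },
}

# Candidate rules (hypothetical labels): each predicts drinking from pressures.
HYPOTHESES = {
    "arterial unloading alone": lambda p: p["arterial"] < 0,
    "right atrial unloading alone": lambda p: p["right_atrial"] < 0,
    "left atrial plus arterial unloading": lambda p: (
        p["left_atrial"] < 0 and p["arterial"] < 0
    ),
    "left atrial loading overrides other unloading": lambda p: (
        (p["arterial"] < 0 or p["right_atrial"] < 0) and p["left_atrial"] <= 0
    ),
}


def consistent(rule) -> bool:
    """True if the rule predicts the observed outcome in all three ligations."""
    return all(rule(p) == p["drank"] for p in LIGATIONS.values())


RESULTS = {name: consistent(rule) for name, rule in HYPOTHESES.items()}

if __name__ == "__main__":
    for name, ok in RESULTS.items():
        print(f"{name}: {'consistent' if ok else 'contradicted'}")
```

The two single-receptor rules are contradicted by at least one ligation, whereas both interpretations offered by the authors fit all three outcomes, which is why these experiments alone cannot choose between them.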
In sheep, a crushing injury to the left atrial appendage depressed hypovolemia-induced water intake, evidence supporting a role for left atrial receptors in mediating hypovolemic thirst in this species (Zimmerman et al., 1981). There are stretch receptors in the ventricles of the heart and coronary arteries as well as the atria that could also have a role in mediating hypovolemic thirst; investigations have yet to be made in this regard. In rats, the right atrium appears to play an important sensor role for hypovolemic thirst. Since nephrectomized, polyethylene glycol-treated (i.e., hypovolemic) rats that cannot increase blood angiotensin II levels still increase water intake, nonangiotensin signaling must be involved (Fitzsimons, 1961). Inflation of a balloon at the junction of the superior vena cava and the right atrium abolished drinking responses to intraperitoneal polyethylene glycol and reduced dehydration-induced drinking and spontaneous overnight water intake, but had no effect on the drinking response to intravenous infusion of hypertonic saline (Kaufman, 1984). An interesting aspect of this study was that atrial stretch reduced the volume of water drunk in
response to 24 hours of water deprivation by 30%, which is the proportion of water intake of dehydrated rats that Ramsay et al. (1977) attributed to the reduced extracellular volume, the rest being osmotically stimulated. By contrast, drinking in response to subcutaneous injection of polyethylene glycol was totally blocked, consistent with this stimulus being a pure hypovolemia. Kaufman (1984) proposed that a direct nervous input from the right atrium to the central nervous system mediated hypovolemic drinking. However, it is also possible that release of atrial natriuretic peptide from the heart contributes to the inhibition of hypovolemic thirst by atrial balloon inflation.

HORMONAL INFLUENCES ON THIRST

Angiotensin II

Following the discovery of a renal dipsogen that appeared to be renin (Fitzsimons, 1969), rapid progress was made in identifying the thirst-stimulating properties of angiotensin II, and the evidence for its role as a dipsogenic hormone has been detailed previously. Systemic administration of components of the renin-angiotensin system (renin, angiotensin I, or angiotensin II) stimulates water drinking in many mammals and reptiles (Fitzsimons, 1979). In some species (e.g., sheep, humans), the blood levels of infused angiotensin II needed to stimulate drinking were found to be high relative to the levels observed physiologically during hypovolemia (Abraham, Baker, Blaine, Denton, & McKinley, 1975; Phillips, Rolls, Ledingham, Morton, & Forsling, 1985). However, it is likely that in these species, the dipsogenic effect of angiotensin II that would occur normally during intravenous infusion of this octapeptide is offset by the simultaneous rise in blood pressure that inhibits thirst by stimulation of arterial baroreceptors (Evered, 1992; Klingbeil, Brooks, Quillen, & Reid, 1991).
Because arterial pressure does not increase during hypovolemia, such an inhibitory influence on the dipsogenic action of angiotensin II is not a consideration in this condition, and lower circulating levels of the peptide should induce thirst.
Site in the Brain of the Dipsogenic Action of Angiotensin II
Hydrophilic peptide molecules like angiotensin II do not normally gain rapid passage into the brain interstitium from the bloodstream because of the blood-brain barrier. Following the discovery of the dipsogenic action of angiotensin II, the question soon arose as to how a polar molecule like angiotensin II could act on the brain to stimulate thirst if it did not cross the blood-brain barrier. Experiments in the rat showed that angiotensin II from
the bloodstream acted on neurons located in the subfornical organ to stimulate drinking behavior (Simpson & Routtenberg, 1973). This circumventricular organ lacks a normal blood-brain barrier (Wislocki & Leduc, 1952) and neurons within it express high concentrations of angiotensin AT1 receptors in all species studied (Allen et al., 2000), including humans (McKinley, Allen, Clevers, Paxinos, & Mendelsohn, 1987). Angiotensin II, directly applied or from blood, stimulates action potentials in subfornical neurons (Felix & Schlegel, 1978; Gutman et al., 1988), and circulating angiotensin II activates neurons throughout the subfornical organ (Figure 35.5B) as indicated by expression of the proto-oncogene c-fos (McKinley, Badoer, & Oldfield, 1992). For water drinking, the rat subfornical organ is exquisitely sensitive to minute quantities of directly injected angiotensin II, while ablation of the subfornical organ prevents drinking in response to intravenously infused angiotensin II and some hypovolemic stimuli (Simpson, Epstein, & Camardo, 1978).
Paradoxical Potentiation of Thirst by Angiotensin Converting Enzyme Inhibitors
An interesting aspect of angiotensin action on the subfornical organ is the very high concentration of angiotensin converting enzyme (ACE) present there (Brownfield, Reid, Ganten, & Ganong, 1982). These high concentrations of ACE allow angiotensin I originating from the systemic circulation to be converted to angiotensin II locally within the subfornical organ. This probably explains why lower doses of ACE inhibitors such as captopril, which block peripheral generation of angiotensin II, not only fail to block drinking responses but actually potentiate them (Lehr, Goldman, & Casner, 1973). This paradox arises because, although peripheral ACE blockade reduces circulating angiotensin II levels, the concentration of blood-borne angiotensin I increases dramatically.
This angiotensin I can then be converted to angiotensin II locally in the subfornical organ to stimulate thirst, because the doses of ACE inhibitors used may not be sufficient to block the high concentrations of ACE in the subfornical organ. In line with this interpretation, administration of a much higher concentration of ACE inhibitor directly into the subfornical organ blocks drinking responses (Thunhorst, Fitts, & Simpson, 1989).
Relaxin
Source of Relaxin and Effects on Fluid Balance
Relaxin is a peptide hormone that is synthesized in the corpora lutea of the ovary and secreted into the systemic circulation during most of pregnancy in many mammals. In addition to its actions on reproductive tissues (e.g., inhibition of uterine contractions, ripening of the cervix, and mammary duct development), relaxin can stimulate water drinking and the secretion of vasopressin. Intravenous infusion of relaxin (Sinnayah, Burns, Wade, Weisinger, & McKinley, 1999) or direct injection into the brain ventricles (Thornton & Fitzsimons, 1995) stimulates water drinking by rats of either sex. Administration of relaxin-neutralizing antibodies to pregnant rats reduced water intake during the second half of pregnancy in these animals, indicating a likely role for relaxin in their fluid intake (Zhao, Malmgren, Shanks, & Sherwood, 1985). Blood concentrations of angiotensin II as well as relaxin increase during pregnancy, and a synergy between circulating angiotensin II and relaxin to stimulate water drinking in rats has been demonstrated (Sinnayah et al., 1999). Circulating relaxin also stimulates vasopressin secretion (Parry, Poterski, & Summerlee, 1994) and, in combination with its dipsogenic action, relaxin would be expected to promote a positive fluid balance. Indeed, pregnancy is characterized by a reduction in plasma osmolality in many mammals, and it has been suggested that a resetting of the osmostat occurs (Durr, Stamotsos, & Lindheimer, 1981). It seems likely that this so-called resetting of the osmostat is due in part to the dipsogenic action of relaxin to maintain water intake in pregnancy despite the hyponatremia and hypotonicity of body fluids.
Site in the Brain of the Dipsogenic Action of Relaxin
The relaxin receptor (LGR-7) is present at relatively high concentrations in several regions of the brain that include the subfornical organ and OVLT, sites devoid of a blood-brain barrier and accessible to circulating relaxin (Osheroff & Phillips, 1991).
The dipsogenic action of relaxin is almost certainly initiated via a group of relaxin-sensitive neurons in the periphery of the subfornical organ because (a) relaxin acts directly on neurons of the isolated subfornical organ in vitro to increase the frequency of action potentials; (b) intravenous infusion of relaxin activates subfornical organ neurons as indicated by the increased expression of c-fos in a subgroup of neurons within its periphery (Figure 35.5C); and (c) ablation of the subfornical organ (but not the OVLT) abolishes drinking in response to systemically infused relaxin (Sunn et al., 2002). The efferent neural pathways from the subfornical organ mediating this relaxin-induced drinking are unknown.
Atrial Natriuretic Peptide
Atrial natriuretic peptide (ANP) is one of three closely related natriuretic peptides released from cardiac myocytes in conditions of increased extracellular fluid volume (see Figure 35.3). As befits its release during states of fluid loading, ANP inhibits water drinking. ANP exerts inhibitory actions on angiotensin-related drinking and dehydration-induced drinking in rats (Antunes-Rodrigues, McCann, Rogers, & Sampson, 1985) and inhibits osmoregulatory thirst in humans (Burrell, Lambert, & Bayliss, 1991). The actions of ANP on thirst are probably due to a direct inhibitory action on neurons of the subfornical organ because ANP, directly applied to subfornical neurons, inhibits their firing rate and excitatory response to angiotensin II (Hattori, Kasai, Uesugi, Kawata, & Yamashita, 1988), and direct injection of ANP into the subfornical organ of rats reduces water intake in response to water deprivation or angiotensin (Ehrlich & Fitts, 1990).
Estrogen
Ovarian steroid hormones such as estradiol probably have a physiological role in regulating thirst in females. Day-to-day water drinking changes during the course of the estrous cycle in animals, and these alterations can be abolished by oophorectomy (Findlay, Fitzsimons, & Kucharczyk, 1979; Michell, 1979). If estradiol benzoate is implanted into ovariectomized female rats, water intake falls, as it does in intact female rats at estrus when blood levels of endogenous estrogen increase (Findlay et al., 1979). Estrogen (but not progesterone) treatment in ovariectomized rats causes a reduction in water drinking elicited by angiotensin II administered peripherally or centrally, but does not affect osmoregulatory drinking (Findlay et al., 1979; Fregly, 1980; Kisley, Sakai, Ma, & Fluharty, 1999). A likely explanation of the estrogen-induced reduction in angiotensin-related drinking is a down-regulation of angiotensin AT1 receptors in the subfornical organ. AT1 receptors are co-located on many neurons that also express estrogen receptors in the periphery of the subfornical organ of ovariectomized rats. The expression of AT1 receptors on these neurons is greatly reduced following estrogen administration for 5 days (Rosas-Arellano, Solano-Flores, & Ciriello, 1999). It has also been shown that estrogen treatment reduces the angiotensin responsiveness of subfornical neurons from ovariectomized rats (Tanaka, Miyakubo, Okamura, Sakamaki, & Hayashi, 2001). Water intake in response to injection of angiotensin II directly into the subfornical organ was attenuated in estrogen-treated rats, whereas these rats drank normally in response to injections of angiotensin II into the median preoptic nucleus. The authors propose that estrogen depresses the activity of angiotensin-responsive neurons in the subfornical organ projecting to the median preoptic nucleus (Tanaka, Miyakubo, Fujisawa, & Nomura, 2003).
The estrogen receptor ER-α is expressed in osmoresponsive neurons in the periphery of the subfornical organ and dorsal cap of the OVLT, and this expression is greatly increased by hypertonicity resulting from water deprivation (Somponpun, Johnson, Beltz, & Sladek, 2004); however, estrogen does not seem to affect osmotically stimulated drinking (Findlay et al., 1979).
Hormones Associated with Feeding
Much of the normal day-to-day water drinking of mammals is closely associated with feeding, and it is possible that hormones from the gastrointestinal region may influence thirst and water intake (Kraly, 1991). Amylin is a hormone secreted from pancreatic islet beta cells following the intake of food. When administered peripherally, it stimulates water drinking in rats. Amylin causes excitation of neurons in the subfornical organ in vitro, and it has been suggested that it is a dipsogenic hormone that acts via the subfornical organ to stimulate prandial drinking (Riediger, Rauch, & Schmid, 1999). Amylin also stimulates renin secretion from the kidney (Wookey, Cao, & Cooper, 1998), and its dipsogenic action could also be mediated in part via increased angiotensin II levels in the circulation. Obestatin, a peptide from the gastrointestinal tract, is a posttranslational product of the ghrelin gene. It inhibits drinking following feeding or centrally administered angiotensin II in rats. Obestatin depresses the activity of subfornical organ neurons in vitro, suggesting that its antidipsogenic action may be mediated via an action on this circumventricular organ (Samson, White, Price, & Ferguson, 2007). Its physiological significance as an antidipsogenic hormone requires further evaluation.
Other Hormones
Vasopressin
Systemically infused vasopressin has been observed to increase the drinking responsiveness of dogs to osmotic stimulation (Szczepanska-Sadowska, Sobocinska, & Sodowski, 1982). Despite the obvious association of vasopressin secretion and thirst, there is little if any evidence in other species to suggest that vasopressin is a dipsogenic hormone.
INTEGRATIVE BRAIN REGIONS RELAYING THIRST SIGNALS
Neural signals from peripheral and central sensors are relayed and integrated within the central nervous system to generate the conscious emotion of thirst. Major neural pathways are summarized in Figure 35.8.
Nucleus of the Solitary Tract and Area Postrema
Vagal and glossopharyngeal afferent nerves transmitting sensory signals from visceral sensors that include baroreceptors, stretch receptors in the gastrointestinal tract, and taste receptors terminate within the nucleus of the solitary tract (NTS) and area postrema (Contreras, Beckstead, & Norgren, 1982) and may influence thirst. Combined ablation of the area postrema and adjacent NTS increased water intake in response to angiotensin-related dipsogenic stimuli in rats (Edwards & Ritter, 1982; Ohman & Johnson, 1989). This effect was less if the lesion was restricted more to the area postrema, and greater if a larger proportion of the NTS was ablated (T. Wang & Edwards, 1997). It is proposed that signals from the viscera that influence thirst are relayed from the NTS to more rostral brain regions via the lateral parabrachial nucleus and ventrolateral medulla (Johnson & Thunhorst, 1997). Ad libitum water drinking, or that resulting from hypovolemia, was not affected by lesions of the NTS designed to destroy neural input from intrathoracic baroreceptors, and it has been suggested that neural input from ascending spinal pathways transmitting signals from renal sensors could be influencing thirst in these animals (Schreihofer et al., 1999).
Figure 35.8 A diagram of major neural pathways (excitatory or inhibitory) linking sensors in the lamina terminalis for osmoreception and circulating hormones, regions of the medulla that receive afferent neural input from arterial baroreceptors, the heart, gastrointestinal tract, and liver, with integrative regions in the midbrain and hypothalamus.
Note: The influence of ANP (interrupted arrow) on the subfornical organ is inhibitory. ANP = Atrial natriuretic peptide; AP = Area postrema; CVLM = Caudal ventrolateral medulla; GI = Gastrointestinal; LH = Lateral hypothalamic area; LPBN = Lateral parabrachial nucleus; MnPO = Median preoptic nucleus; NTS = Nucleus of the solitary tract; OVLT = Organum vasculosum of the lamina terminalis; R = Midbrain raphe; SFO = Subfornical organ.
Lateral Parabrachial Nucleus
The lateral parabrachial nucleus (LPBN) in the dorsolateral midbrain is the major efferent target of neurons within the NTS relaying afferent nerve signals from the viscera (Herbert, Moga, & Saper, 1991). In turn, neurons within the LPBN project efferent nerve fibers to regions known to influence thirst such as the median preoptic nucleus and subfornical organ in the lamina terminalis and the lateral hypothalamic area (Herbert et al., 1991; Saper & Levisohn, 1983). Ablation of the LPBN, by either electrolysis or injection of neurotoxin, enhanced water intake in response to angiotensin-mediated dipsogenic stimuli, but did not change day-to-day water intake, or drinking responses to systemic hypertonicity or polyethylene glycol-induced hypovolemia (Edwards & Johnson, 1991; Ohman & Johnson, 1989). Therefore, neurons within the LPBN do not seem to influence ad libitum water drinking, but may relay signals that inhibit water drinking associated with angiotensin’s dipsogenic action, possibly preventing excessive water intake in response to angiotensin. It is possible that the LPBN restricts water drinking by engaging neural mechanisms mediating satiety. Microinjection of the GABA agonist muscimol into the LPBN to inhibit its neural activity in water-sated rats causes a small but significant increase in drinking, consistent with the idea that LPBN neurons play a role in thirst satiety. Another interesting aspect of LPBN function in relation to thirst mechanisms is the apparent switching of a thirst
to an appetite for salt when serotonergic antagonists are injected into the LPBN of rats administered various dipsogenic stimuli (Menani, Columbari, Beltz, Thunhorst, & Johnson, 1998).
Midbrain Raphé Nuclei
Located in the dorsal midline of the midbrain (see Figure 35.1, panel A), the dorsal and median raphé nuclei probably relay neural signals that exert inhibitory serotonergic influences on thirst mechanisms. Ablation of the dorsal raphé nucleus causes increased water intake in response to dehydration or an angiotensin II–mediated stimulus (isoproterenol treatment), and also changes the sodium/water preference of rats, resulting in large increases in both salt and water intake (Olivares, Costa-e-Sousa, & Cavalcante-Lima, 2003). In regard to the median raphé nucleus, ablation of serotonergic neurons therein resulted in a gradual increase in water intake of rats (Barofsky, Grier, & Pradhan, 1980), while acute inhibition of neurons within this brain region by microinjection of muscimol (a drug that inhibits neurons by acting at GABA receptors) caused a rapid drinking response in normally hydrated rats. Inhibitory neural pathways from the median raphé to the subfornical organ and/or lateral hypothalamic area may mediate its inhibitory influence on thirst, because ablation of either of these regions disrupts the dipsogenic effect of injections of muscimol into the median raphé nucleus (Stratford & Wirtshafter, 2000).
Zona Incerta
Located ventral to the thalamus, the zona incerta has been implicated in thirst mechanisms because its ablation disrupts drinking responses. However, results are inconsistent as to the type of drinking that is affected by ablation of the zona incerta.
Grossman (1984) reported that osmoregulatory but not hypovolemic drinking was inhibited by lesions in the rostral zona incerta, whereas Evered and Mogenson (1976) observed that water intake in response to hypertonicity or hypovolemia was normal in rats with zona incerta lesions, but secondary, nonhomeostatic drinking was impaired. The zona incerta is connected to many brain regions, including the subfornical organ, lateral hypothalamic area, and several thalamic sites (Miselis, Weiss, & Shapiro, 1987; Ricardo, 1981) and is well positioned to relay neural signals for thirst.
Lateral Hypothalamic Area
As mentioned in an earlier section, stimulation of the lateral hypothalamus (LH) stimulates drinking (Andersson &
McCann, 1955; Greer, 1955), while ablation of the LH causes severe hypodipsia and hypophagia (Teitelbaum & Epstein, 1962). The disruption of ascending catecholaminergic fibers of passage in the medial forebrain bundle passing through the LH was considered a crucial factor in the cause of adipsia and aphagia of the LH syndrome (Ungerstedt, 1971). However, later investigations in rats, in which the excitotoxin kainic acid was used to ablate LH neurons but leave fibers of passage intact, revealed that osmoregulatory, hypovolemic, and angiotensin-stimulated water-drinking responses were severely disrupted, but drinking following water deprivation was not (Stricker, Swerdloff, & Zigmond, 1978; Winn, Tarbuck, & Dunnett, 1984). The LH receives a strong afferent neural input from the lamina terminalis, LPBN, and midbrain raphé (Berk & Finkelstein, 1981; Herbert et al., 1991; Miselis et al., 1987) and has numerous efferent connections to thalamic and cortical regions. It is possible that it could relay thirst-related signals to these cortical regions from sensors in the lamina terminalis.
Median Preoptic Nucleus
The median preoptic nucleus, located in the lamina terminalis between the subfornical organ and OVLT, has direct neural links with many brain regions that have been implicated in the control of body fluid homeostasis. These include a rich reciprocal neural connectivity with the OVLT and subfornical organ, and neural input from the LPBN, ventrolateral medulla, midbrain raphé, and hypothalamic paraventricular nucleus (Saper & Levisohn, 1983; Zardetto-Smith & Johnson, 1995).
As well, its strong efferent links to the lateral preoptic and lateral hypothalamic areas, parastrial nucleus, supraoptic nucleus, magno- and parvocellular parts of the hypothalamic paraventricular nucleus, midbrain and medullary raphé, periaqueductal grey, bed nucleus of the stria terminalis, and amygdala (Gu & Simerly, 1997) emphasize the potential of this nucleus for an integrative role in the regulation of thirst. Evidence that is consistent with an integrative role of the median preoptic nucleus in thirst is the severe disruption of osmoregulatory, angiotensin-stimulated or hypovolemic drinking responses caused by ablation of this nucleus by either neurotoxin or electrolytic methods (Cunningham et al., 1991; Johnson et al., 1996; Mangiapane et al., 1983; McKinley et al., 1999). Neurons within the median preoptic nucleus are activated when animals are dehydrated, infused systemically with hypertonic saline or angiotensin II, or intracerebroventricularly with angiotensin II or relaxin, which are all dipsogenic stimuli (Herbert, Forsling, Howes, Stacey, & Shiers, 1992;
McAllen et al., 1990; McKinley et al., 1992, 1994, 1997). Severing the neural connections between the subfornical organ and median preoptic nucleus disrupts drinking in response to systemically administered angiotensin II, as does cutting the efferent neural output from the median preoptic nucleus (Eng & Miselis, 1981; Lind & Johnson, 1982). The high concentration of angiotensin AT1 receptors in the median preoptic nucleus (Allen et al., 2000), which would not be directly accessed by circulating angiotensin II because of the blood-brain barrier, indicates that it probably receives afferent angiotensinergic input. It has been proposed that angiotensin-sensitive neurons in the subfornical organ that are stimulated by blood-borne angiotensin relay neural signals out of the lamina terminalis via an angiotensinergic synapse in the median preoptic nucleus (Johnson et al., 1996).
Septal Nuclei
Harvey and Hunt (1965) first showed that ablation of the septum (see Figure 35.1, panel 2 for location) could cause large, prolonged (over months) increases in daily fluid intake in rats (termed septal hyperdipsia), suggesting that this region may relay inhibitory neural signals related to thirst. Polydipsia resulting from septal lesions persisted in rats with the ureter ligated to prevent urine loss, demonstrating that a primary polydipsia occurs with septal lesions (Blass & Hanson, 1970). In rats, day-to-day drinking increases, and angiotensin-stimulated but not osmoregulatory drinking is potentiated (Blass, Nussbaum, & Hanson, 1974). In sheep with septal lesions, neither angiotensin- nor osmotically stimulated drinking is potentiated, but daily water intake may more than double. Such water drinking continues during the day when plasma osmolality has decreased below the normal set point (Smardencas & McKinley, 1994).
It is possible that normal inhibitory influences of hypotonicity or hypovolemia on thirst may be relayed via the septum, being disrupted when the septal region is ablated. There is also evidence of a relay via the nucleus of the diagonal band from the hindbrain that is involved in hypovolemic thirst (Sullivan et al., 2003).
EFFECTOR REGIONS FOR THIRST IN THE CEREBRAL CORTEX
Unlike vasopressin secretion, where the neuroendocrine motor output from neurons in the supraoptic and paraventricular nuclei is well defined, the effector sites in the brain that generate the emotion of thirst remain clouded in uncertainty. Thirst demands a behavioral response: fluid ingestion. As a function of the conscious brain, the
emotion of thirst has been assumed to be generated by the cerebral cortex.
Cortical Stimulation
In a survey of the cerebral cortex of conscious monkeys, Robinson and Mishkin (1966) electrically stimulated many cortical loci and obtained drinking responses at several sites. These included the substantia innominata, putamen, substantia nigra, preoptic region, lateral hypothalamus, and ventral tegmentum, but the region that most reliably yielded drinking behavior when stimulated was the anterior cingulate cortex. In the classical studies of Penfield and colleagues, many different superficial sites in the cerebral cortex of conscious human surgical patients were electrically stimulated, and their subjective responses recorded. The subjects, with scalp and skull locally anesthetized, reported many somatic and visceral sensations but rarely mentioned the induction of thirst during these stimulations (Penfield & Faulk, 1955; Penfield & Rasmussen, 1950). However, in two epileptic subjects, thirst or a need for water was associated with stimulation of the superior temporal gyrus (Penfield & Jasper, 1954).
Brain Imaging Studies
Functional brain imaging studies in human subjects infused intravenously with hypertonic saline to induce thirst have also provided information in this regard. Both positron emission tomography (PET) and functional magnetic resonance imaging (fMRI), which reflect changes in regional blood flow in the brain, show that the anterior cingulate region (see Figure 35.1, panels A and 1 for location) is activated in subjects made thirsty by this procedure (Figure 35.7), and that satiation of thirst by drinking quickly extinguishes this activation (Denton et al., 1999; Egan et al., 2003). Activation of the anterior cingulate region was also correlated with thirst in another group of subjects (de Araujo, Kringelbach, Rolls, & McGlone, 2003).
Activations in the posterior cingulate, parahippocampal gyrus, insular cortex (Figure 35.7), precentral gyrus, orbital gyrus, superior temporal gyrus (Figure 35.7), anterior perforated substance, and regions within the cerebellum also correlated with thirst scores. While the activity of several brain regions correlated with the thirst score, such correlations do not allow the precise function of these regions to be specified from these imaging experiments. The superior temporal region was implicated in thirst by Penfield and Jasper (1954). Three of the regions mentioned (the anterior cingulate, insular, and orbito-frontal cortices) have been implicated in the generation of homeostatic emotions (Craig, 2002).
Anterior Cingulate Cortex
The brain imaging investigations in human subjects and the electrical stimulation studies in monkeys mentioned above consistently link the anterior cingulate cortex with thirst mechanisms. The anterior cingulate region has been shown to be activated by sensory (e.g., pain, temperature), emotional (e.g., depression), cognitive (e.g., mental arithmetic), autonomic, and reward-based stimuli (Bush, Luu, & Posner, 2000; Craig, 2002; Critchley, Corfield, Chandler, Mathias, & Dolan, 2000; Gehring & Taylor, 2004; Mayberg et al., 1999). It appears to be activated when an adverse condition occurs and a decision regarding a response strategy needs to be made (Gehring & Taylor, 2004). A characteristic of humans and rats that have undergone destruction of the anterior cingulate region is an apparent apathy, or lack of concern to rectify an adverse condition (Eslinger & Damasio, 1985; Johansen, Fields, & Manning, 2001). Although there appears to have been no investigation of thirst in patients who have undergone surgical cingulotomy, reports of dehydration or disordered fluid homeostasis in such patients are not readily found. It is possible that the role of the anterior cingulate is to provide the motivational or impelling aspect of the emotion of thirst that will result in the drinking of water.
Insular Cortex
The insular cortex (see Figure 35.1, panels 2 and 3) receives visceral afferent sensory input via synapses in the NTS, parabrachial nucleus, and thalamus (Saper, 2002). The parasympathetic afferents carry neural signals to the anterior insular cortex from many different sensors that include baroreceptors and receptors in the gastrointestinal tract that are known to influence thirst. In the schema proposed by Craig (2003), the role of the insula in the generation of specific homeostatic emotions, such as thirst, is to give specificity to the emotion in regard to the homeostatic perturbation, that is, the subjective feeling of a person’s homeostatic state.
Orbito-Frontal Cortex
Using fMRI, de Araujo et al. (2003) showed that water in the mouth activated part of the orbito-frontal cortex of thirsty subjects, but not water-replete subjects. The location of this region is shown in Figure 35.1 (panels A and 1). They interpreted this result as an indication that the orbito-frontal cortex provided a hedonic component to the behavioral response of drinking. Water in the mouth is pleasant if the subject is thirsty, but less so if the subject is not thirsty. Craig (2002) concluded that as the anterior cingulate and insular cortices are connected to the orbito-frontal cortex, sensory signals relating to the homeostatic state of the individual will reach this region and be interpreted there as to their pleasantness or reward value. In this schema, the
cortical generation of the emotion of thirst would involve activations of neurons within the anterior cingulate cortex for motivational intensity, the insula for homeostatic specificity, and the orbito-frontal region for reward.
NEUROCHEMISTRY OF THIRST
Neurotransmitters
Glutamate
Glutamatergic neural pathways are likely to have an important role in mediating thirst. Intracerebroventricular injection of the NMDA receptor antagonist MK801 blocks drinking responses stimulated by angiotensin II, intragastric hypertonic saline, or water deprivation for 22 hours. Increased c-fos expression in the median preoptic nucleus, but not the subfornical organ, in response to angiotensin II and water deprivation was reduced by MK801 (Xu & Herbert, 1998; Xu, Lane, Zhu, & Herbert, 1997), suggesting that a glutamatergic input to the median preoptic nucleus mediates both osmoregulatory and angiotensin-stimulated thirst. The vesicular glutamate transporter vGlut2, a marker of glutamatergic neurons, is expressed in the median preoptic nucleus, periphery of the subfornical organ, and dorsal cap of the OVLT, consistent with excitatory glutamatergic output from each of these regions of the lamina terminalis (Grob, Trottier, Drolet, & Mouginot, 2003). The non-NMDA receptor antagonist drug CNQX stimulates drinking in the rat (Xu & Johnson, 1998), suggesting that a glutamatergic pathway also drives an inhibitory input to thirst.
Acetylcholine
Acetylcholine has long been considered to have a role as a neurotransmitter in the neural circuitry subserving thirst. Microinjection of acetylcholine or the cholinergic agonist carbachol into several regions of the brain that include the LH, the preoptic and septal regions, hypothalamic paraventricular nucleus, and the subfornical organ stimulates water intake in rats (Fisher & Levitt, 1967; Grossman, 1960; Mangiapane & Simpson, 1983; Swanson & Sharpe, 1973).
Systemic administration of the cholinergic muscarinic receptor blocking drug atropine sulphate, which crosses the blood-brain barrier, inhibits but does not abolish water drinking in response to hypertonicity resulting from either intraperitoneal injection of hypertonic saline or water deprivation, or to hypovolemia resulting from polyethylene glycol injection, as well as day-to-day water drinking in normal and lactating rats (Blass & Chapman, 1971; Fitzsimons & Setler, 1975; Speth, Smith, & Grove, 2002). Whether acetylcholine has a role in thirst in species other than the rat has yet to be proven.
Regarding the dipsogenic action of carbachol on the subfornical organ of the rat (Mangiapane & Simpson, 1983), neurons within the outer annulus of the subfornical organ receive a strong cholinergic innervation (Xu, Pekarek, Ge, & Yao, 2001). This cholinergic input originates from the medial septal nucleus and diagonal band, exerting direct excitatory actions on subfornical neurons via the M3 muscarinic receptor subtype (Honda et al., 2003). Acetylcholine could also affect thirst by a presynaptic action to reduce GABAergic influences on the subfornical organ (Xu, Honda, Ono, & Inenaga, 2001).
Dopamine
Interest in possible dopaminergic involvement in thirst mechanisms has resulted from observations that the polydipsia often seen in psychotic states is modified by neuroleptic drugs (Canuso & Goldman, 1996). Intracerebroventricular injection of dopamine at relatively large doses (Setler, 1973), or peripheral administration of dopaminergic agonists (pergolide, bromocriptine) that enter the brain (Fregly & Rowland, 1988; Zabik, Sprague, & Odio, 1993), will either stimulate or augment water drinking in rats, although another dopaminergic agonist, quinpirole hydrochloride, inhibits a number of dipsogenic responses (Fregly & Rowland, 1986). Systemic administration of dopaminergic D2 receptor blocking drugs such as haloperidol, pimozide, or spiperone inhibits water intake in response to several different dipsogenic stimuli in rats (Fitzsimons & Setler, 1975; Fregly & Rowland, 1986, 1988; Zabik et al., 1993). However, while these data show that dopamine is probably influential in neural pathways of thirst, drawing any firm conclusions in regard to its exact role and locus of action as a transmitter in thirst circuitry is fraught with difficulty. This is because of the multiplicity of neural systems (e.g., sensory-motor, reward, neuroendocrine pathways) that these drugs influence and the lack of receptor specificity of most dopaminergic antagonists.
Noradrenaline
Noradrenergic pathways may influence thirst at more than one level of organization. Peripheral administration of clonidine, the presynaptically acting α2-adrenoceptor agonist, has a strong inhibitory action on drinking responses to osmoregulatory and hypovolemic dipsogenic stimuli in the rat. These actions are blocked by the α2-antagonist yohimbine, and this drug also augments angiotensin-induced thirst (Fregly, Kelleher, & Greenleaf, 1981; Fregly & Rowland, 1986). As stimulation of α2-adrenoceptors reduces the presynaptic release of noradrenaline at nerve terminals, these data suggest an important function of noradrenaline release within neural pathways of thirst, which may be regulated by adrenergic α2-autoreceptors. In
regard to central sites of noradrenaline action, depletion of catecholamines within the ventral lamina terminalis region by injections of the neurotoxin 6-hydroxydopamine therein disrupts angiotensin-stimulated drinking, and this effect is due to loss of noradrenergic rather than dopaminergic neurons (Bellin, Landas, & Johnson, 1988; Cunningham & Johnson, 1989, 1991). The origin of the noradrenergic input to the median preoptic nucleus is likely to be the A1 group in the caudal ventrolateral medulla (Kawano & Masuko, 1999), and it is proposed that this noradrenergic input combines with an angiotensinergic pathway to drive thirst responses (Johnson & Thunhorst, 1997). There is also evidence of an ascending noradrenergic input from the A2 group in the nucleus of the solitary tract to the subfornical organ that could influence thirst (Tanaka, Hayashi, Shimamune, & Nomura, 1997).
Serotonin
Pharmacological studies of serotonergic agonists and antagonists injected into the CNS show that drinking responses can be influenced by these agents. They are particularly effective when injected into the lateral parabrachial nucleus in the midbrain, where the nonselective serotonin-blocking drug methysergide enhances angiotensin-induced water drinking, while injection there of serotonin agonists depresses water drinking (Menani & Johnson, 1995).
Gamma Amino Butyric Acid
Inhibitory neural pathways influencing thirst presumably have an important role in satiety and baroreceptor inhibition of thirst, and gamma amino butyric acid (GABA) neurotransmission is likely to mediate these inhibitory influences. GABAergic mechanisms appear to play an important role within the lamina terminalis. The median preoptic nucleus receives GABAergic input from the subfornical organ as well as other brain regions. GABA may act presynaptically via GABAB receptors, and at postsynaptic sites through GABAA receptors in the median preoptic nucleus (Kolaj, Bai, & Renaud, 2004).
GABAergic neurons are present in the median preoptic nucleus (Grob et al., 2003), providing a significant output from it to vasopressin-containing neurons in the supraoptic nucleus (Nissen & Renaud, 1994), and possibly also to regions that influence thirst.
Nitric Oxide
Nitric oxide (NO) may have a role as an inhibitory neurotransmitter or locally released influence on thirst. The enzyme that facilitates its production, neuronal nitric oxide synthase, is found in high concentration within the thirst-mediating regions of the subfornical organ, OVLT, and median preoptic nucleus (Jurzak, Schmid, & Gerstberger,
1994). Water intake in response to angiotensin II or water deprivation is inhibited by central administration of L-arginine, from which NO is synthesized, and this effect is blocked by inhibitors of nitric oxide synthase (Calapai & Caputi, 1996). Direct application of nitroprusside, an NO donor, to angiotensin-sensitive subfornical organ neurons in vitro depresses their electrical activity (Rauch, Schmid, DeVente, & Simon, 1997), making it likely that NO exerts its action on thirst by inhibiting angiotensin-sensitive neurons in the lamina terminalis. On the other hand, centrally administered inhibitors of neuronal nitric oxide synthase have also been shown to inhibit angiotensin-induced drinking in the rat (Kadekaro & Summy-Long, 2000), which is not consistent with the theory that NO inhibits thirst. However, results from experiments employing centrally administered nitric oxide synthase inhibitors should be viewed with caution, because reductions in fluid intake caused by central administration of these drugs could be secondary to other effects of NOS inhibition, such as hyperthermia (Mathai, Arnold, Febbraio, & McKinley, 2004) and increased arterial pressure (Kadekaro & Summy-Long, 2000).
Neuropeptides
A number of neuropeptides have been reported to influence water drinking when injected into the brain (Table 35.1). However, the physiological role in thirst of most of these peptides is still unclear. Angiotensin II is by far the most widely studied of the dipsogenic neuropeptides.
TABLE 35.1
Angiotensin II
In addition to its function as the circulating effector hormone of the peripheral renin-angiotensin system, the octapeptide angiotensin II is generated within the brain independently of the peripheral renin-angiotensin system. All components of a brain renin-angiotensin system (peptides, enzymes and receptors) exist within the brain. While astrocytes are the main site for synthesis in the brain of the precursor peptide angiotensinogen (Lynch, Hawelu-Johnson, & Guyenet, 1987), angiotensin peptides are probably generated within neurons. Angiotensin receptors, both AT1 and AT2 subtypes, are located on neurons in many brain regions associated with body fluid and cardiovascular homeostasis (Allen et al., 2000; Lenkei, Palkovits, Corvol, & Llorens-Cortes, 1997). Except for those receptors in the circumventricular organs that have been described in an earlier section of this chapter, these receptor sites are within the blood-brain barrier. Therefore, they are not influenced directly by blood-borne angiotensin II, but are likely to have brain-generated angiotensin as their endogenous ligand. One of the most powerful of all experimental dipsogenic procedures, and one of the most investigated, is the injection of angiotensin II into the ventricular system or specific regions of the brain, whether the subject be rat, dog, goat, sheep, cow, or monkey (Fitzsimons, 1998). This effect is mediated largely by AT1 receptors in the region of the anteroventral wall of the third ventricle (AV3V region), and not the subfornical organ, the site that transduces the drinking response to circulating angiotensin II (Buggy & Johnson, 1978). Neurons within the median
Neuropeptides that influence water intake when injected into the cerebral ventricles or specific brain regions.
Neuropeptide | Effect on Water Intake | Reference
Adrenomedullin | Inhibition | Murphy & Samson (1995)
Angiotensin II | Stimulation | Fitzsimons (1998)
Angiotensin III | Stimulation | Wilson et al. (2005)
Apelin | Stimulation | Taheri et al. (2002)
Atrial natriuretic peptide | Inhibition | Antunes-Rodrigues et al. (1985)
Brain natriuretic peptide | Inhibition | Itoh et al. (1988)
Corticotropin-releasing hormone | Inhibition | Van Gaalen et al. (2002)
Dermorphin | Inhibition | De Caro (1986)
β-endorphin | Inhibition | Summy-Long et al. (1981)
Leu-enkephalin | Inhibition | Summy-Long et al. (1981)
Met-enkephalin | Inhibition | Summy-Long et al. (1981)
Endothelin | Inhibition | Samson et al. (1991)
Galanin | Inhibition | Brewer et al. (2005)
Melanin-concentrating hormone | Stimulation | Clegg et al. (2002)
Orexin A | Stimulation | Kunii et al. (1999)
Substance P | Inhibition (mammals) | De Caro (1986)
preoptic nucleus, which is situated within the AV3V region, are activated by intracerebroventricular angiotensin II (Herbert et al., 1992), and direct injection of angiotensin II into the median preoptic nucleus of the rat causes water drinking (O’Neill & Brody, 1987). Most likely, intracerebroventricularly injected angiotensin II acts on the median preoptic nucleus to stimulate drinking and mimics the effect of synaptically released angiotensin II at this site. Angiotensin III, the heptapeptide formed by degradation of angiotensin II, is also dipsogenic when administered centrally (Wright, Morseth, Abhold, & Harding, 1985) and may also act at central angiotensinergic receptors for thirst. In regard to central angiotensinergic relays mediating thirst, an angiotensinergic synapse within the median preoptic nucleus relaying signals to it from angiotensin II stimulated neurons in the subfornical organ has been proposed (Johnson et al., 1996). Pharmacological blockade of AT1 receptors in the brain by losartan inhibits water drinking in the rat in response to intracerebroventricular infusion of hypertonic saline in several species, suggesting that a central angiotensinergic relay mediates osmoregulatory drinking (Blair-West et al., 1994). However, recent observations that osmoregulatory drinking is entirely intact in genetically modified mice totally lacking angiotensin peptides due to deletion of the gene encoding angiotensinogen are not in agreement with this suggestion (McKinley, Alexiou, et al., 2006).
SATIATION OF THIRST Satiation of thirst, at least initially, is more than just an absence of thirst. The act of drinking by the thirsty person is pleasurable, and this response appears to be more than just the mere removal of the tormenting aspects of thirst. It is likely that extinguishing thirst involves the activation of dopaminergic reward pathways in the brain (Ettenberg & Camp, 1986). The Swedish explorer Sven Hedin (1865–1952) gives an account (cited by Wolf, 1958) of the exhilaration and joy that came from finding and ingesting water when he was extremely thirsty during a harsh journey through the arid Taklamakan desert of western China. He relates “I stood on the brink of a little pool of water—beautiful water! … I took the tin box out of my pocket and filled it, and drank. How sweet that water tasted! Nobody can conceive it who has not been within an ace of dying of thirst. I lifted the tin to my lips, calmly, slowly, deliberately, and drank, drank, drank, time after time. How delicious! What exquisite pleasure! The noblest wine pressed out of the grape, the divinest nectar ever made, was never half so sweet.” Thirst satiety provides the signal that prevents excessive hydration, and this satiety usually occurs before systemic
absorption of ingested water. The consequences of overhydration (hyponatremia and cerebral oedema) can be as lethal as severe dehydration. Thus, this satiating mechanism is a crucial homeostatic emotion that contributes to accurate repletion of body fluids without overhydration following a period of dehydration.
Voluntary Dehydration
The speed at which water is drunk and thirst quenched following a period of water deprivation varies considerably across mammals. Some species (e.g., dog, camel, sheep, goat, deer), when dehydrated, replace fluid deficits immediately on gaining access to drinking water. Others (e.g., rat, human, horse) replace their fluid deficit more slowly and may take several hours to restore fluid balance (Adolph, 1950). In this latter group, after an initial drinking bout that is not sufficient to replace all the fluid lost, but sufficient to cause temporary satiety or loss of thirst, intermittent drinking bouts occur over a few hours until rehydration is complete. This phenomenon has been termed “voluntary dehydration.” The gradual replenishment of a fluid deficit may be protective against rapidly occurring hyponatremia and cerebral oedema.
Scientific investigation of the signals that bring about thirst satiety goes back to Claude Bernard, who studied the effects of “sham drinking” in dehydrated dogs with an esophageal fistula. This phenomenon has been investigated in several species (dog, rat, sheep, monkey, man) and it is clear that animals drink considerably more than their fluid deficit if the water imbibed immediately leaves the body through an esophageal or gastric fistula (Bott, Denton, & Weller, 1965; Towbin, 1949; Wood, Maddison, Rolls, Rolls, & Gibbs, 1980). Indeed, animals with an open fistula will continue to drink until they appear fatigued from the effort. Therefore, the act of drinking and swallowing water per se is an insufficient stimulus to satiate thirst.
Yet, if a quantity of water equivalent to a dehydrational deficit is placed by tube directly in the stomach without it touching the mouth, pharynx, and esophagus, thirst will not be relieved for some time (Figaro & Mack, 1997; Thrasher, Nistal-Herrera, Keil, & Ramsay, 1981). However, an essential aspect of the initial thirst-satiating mechanism is that fluid remains in the stomach after it has been ingested (Blass & Hall, 1976; Gibbs, Rolls, & Rolls, 1986). Therefore, a combination of oropharyngeal, esophageal, and gastric signals, in appropriate temporal sequence, appears to be necessary for thirst satiety to be achieved in the short term. The amount of water ingested during the initial rehydrating bout does not depend on its temperature or composition—water, isotonic saline, or isotonic glucose are
similarly ingested (Appelgren, Thrasher, Keil, & Ramsay, 1991; Hoffman, DenBleyker, Smith, & Stricker, 2005). These data suggest that mechanical distention signals from the throat, esophagus, and stomach mediate the initial stop signal for drinking. Capsaicin treatment, which damages vagal afferent nerves from the gut in rats, leads to an initial lack of thirst satiation (Curtis & Stricker, 1997), suggesting that these neural signals are carried via this nerve to the brain. These preabsorptive signals from the gastrointestinal tract also influence other homeostatic responses; they rapidly suppress vasopressin release and stimulate sweating in humans and panting in animals when they rehydrate (Appelgren et al., 1991; Baker & Turlejska, 1989; Figaro & Mack, 1997; Takamata, Mack, Gillen, Jozsi, & Nadel, 1995; Thrasher et al., 1981). Thereafter, absorption of ingested water in the duodenum also provides a satiating signal for thirst (Gibbs et al., 1986) that could be vagally mediated because selective hepatic vagotomy causes dehydrated rats to overdrink (Smith & Jerome, 1983). In the longer term, absorption of fluid into the systemic circulation and reduction of plasma osmolality make a minor contribution to satiety (Wood et al., 1980).
PHYSIOLOGICAL AND PATHOPHYSIOLOGICAL CONDITIONS INFLUENCING THIRST
Pregnancy
Pregnancy is a condition that alters body fluid balance. Plasma osmolality falls by approximately 10 mosmol/kg within 5 to 10 weeks following conception; the reduced plasma osmolality is maintained throughout pregnancy (Davison, Gilmore, Durr, Robertson, & Lindheimer, 1984). Plasma osmolality also falls during pregnancy in rats, yet vasopressin continues to be secreted, as it does in pregnant women, in the face of the plasma hypotonicity. It is proposed that the osmostat for vasopressin secretion is reset to a lower level (Davison et al., 1984; Durr et al., 1981). Despite plasma osmolality falling, animals increase daily water intake during pregnancy (Denton, McKinley, Nelson, & Weisinger, 1977; Richter & Barelare, 1938), which suggests that the osmoreceptor for thirst is also reset during pregnancy. The observation that pregnant homozygous Brattleboro rats, which are devoid of vasopressin, double daily water intake and lower plasma osmolality from 310 to 292 mosmol/kg, is consistent with a resetting of the thirst osmostat in these animals. The threshold osmolality for thirst in pregnant women is 10 mosmol/kg lower than it is preconception or postpartum (Davison, Shiells, Phillips, & Lindheimer, 1988). The same authors have shown also that treatment with human chorionic gonadotrophin reduces the thirst threshold in mothers, and have suggested that this hormone has a signaling role in resetting the osmoreceptor. Relaxin may have a similar role in rats (Weisinger, Burns, Eddie, & Wintour, 1993).
Lactation
Lactation involves loss of fluid by mothers in the form of milk. This fluid loss needs to be replaced if dehydration is to be prevented and adequate milk supply maintained. There are many anecdotal reports of thirst being experienced by nursing mothers, and the daily water intake of animals with multiple offspring such as rats and rabbits increases considerably during lactation (Denton et al., 1977; Richter & Barelare, 1938). Inhibition of the increased daily water intake in lactating rats by a centrally administered angiotensin antagonist suggests a role for brain angiotensinergic pathways in lactation-induced thirst (Speth et al., 2002). James, Irons, Holmes, Drewett, and Bayliss (1995) observed that thirst and water intake increased during suckling periods in 10 nursing mothers, and the increased thirst corresponded with oxytocin secretion and milk letdown. No change in plasma osmolality or vasopressin levels occurred. These investigators suggested that suckling may send afferent nerve signals to the hypothalamus that generate thirst as well as oxytocin secretion.
Age
As animals and humans age, thirst may become impaired. As a consequence, they drink less fluid in response to being dehydrated, and this impaired thirst renders them liable to the deleterious effects of dehydration such as heat stroke in hot weather. The reasons for the waning of thirst with age are unclear. Some but not all investigators report that thirst ratings and amount of water drunk in response to a purely osmotic stimulus (e.g., intravenous infusion of hypertonic saline) are depressed in elderly subjects (Kenney & Chiu, 2001; Phillips et al., 1984). More consistent are observations that thirst and fluid intake following a period of water deprivation are less intense in elderly subjects in comparison to young adults (Kenney & Chiu, 2001). Animal models of the influence of aging on thirst have been described recently. Brown-Norway and Munich Wistar (but not Sprague Dawley or Fischer 344) strains of rat exhibit progressively impaired drinking responses to water deprivation and hypertonicity as they age, while hypovolemic thirst was not reduced until advanced age. The dipsogenic response to angiotensin II, however, was not diminished with age (McKinley, Denton, et al., 2006;
Thunhorst & Johnson, 2003). Several influences on thirst are changed with age (Ferrari, Radaelli, & Centola, 2003; Kenney & Chiu, 2001), and could contribute to age-impaired thirst. First, cardiovascular reflexes arising from arterial baroreceptors and cardiopulmonary receptors are known to be reduced in elderly humans and this may depress thirst in response to hypovolemia; second, elevated plasma concentrations of atrial natriuretic peptide have been observed in aged humans and rats; third, the renin-angiotensin system is depressed in the aged, and this may influence thirst mechanisms in older subjects. In a recent study (Farrell et al., in press), young adults and elderly human subjects were infused intravenously with hypertonic saline, then allowed to satiate their thirst while undergoing PET imaging of the brain. Both groups reported similar thirst ratings as a result of the hypertonic stimulus, but the elderly drank only half the volume of water that the younger group drank to quench their thirst and reduce activity in the cingulate cortex. The results suggest that mechanisms of thirst satiety change with age; this may contribute to the susceptibility of the elderly to become dehydrated.
An increase in core body temperature induces evaporative cooling responses of sweating and panting (depending on the species) that result in loss of fluid from the body. If thirst is a response to increased core temperature, resulting fluid intake could be an anticipatory mechanism to prevent dehydration. While most water intake that occurs with hyperthermia is secondary to dehydration (Barney & Folkerts, 1995; Hainsworth, Stricker, & Epstein, 1968), there is evidence that direct thermal stimulation of the preoptic-hypothalamic region of the brain can stimulate water drinking (Andersson & Larsson, 1961).
An increase of 0.8°C in core temperature potentiated the thirst and water ingested by human subjects in response to intravenous hypertonic saline (Takamata, Mack, Stachenfeld, & Nadel, 1995), suggesting that there is a physiological effect of increased body temperature to enhance thirst.
Thirst in Some Disease States
Diabetes Insipidus
The water intake of patients with diabetes insipidus is enormous, often exceeding 12 litres per day (Blotner, 1951). It is driven by thirst that arises from continual dehydration caused by excessive loss of dilute urine due to lack of vasopressin (cranial diabetes insipidus) or insensitivity of the kidney to its action (nephrogenic diabetes insipidus). Administration of pitressin or the vasopressin analogue dDAVP ameliorates the diuresis of central diabetes insipidus and fluid intake correspondingly decreases. The sensitivity of the thirst mechanism to hypertonicity is normal
in patients with diabetes insipidus (Thompson & Bayliss, 1987). Indeed, normal thirst and subsequent fluid intake are essential for preventing lethal dehydration in untreated diabetes insipidus.
Diabetes Mellitus
Untreated diabetes mellitus (type 1 or 2) is characterized by strong thirst, often the first indication of its onset. Excessive urine output in the form of a glycosuria also occurs due to failure of the kidney to reabsorb the high filtered glucose load that results from lack of insulin, or insensitivity of tissues to insulin. Dehydration of both intracellular and extracellular compartments may result, but because high blood glucose levels do not usually stimulate thirst, diabetic thirst has been considered more likely to be the result of extracellular fluid loss (Fitzsimons, 1998). A recent study of water intake in rats with diabetes mellitus (induced by treatment with the drug streptozotocin) showed that systemic administration of an angiotensin antagonist had only a small inhibitory effect on their water drinking. Increased expression of c-fos was observed in osmoregulatory regions of the lamina terminalis: the dorsal cap of the OVLT, periphery of subfornical organ, and median preoptic nucleus. Moreover, while intravenous infusion of hypertonic glucose is not normally dipsogenic, it does stimulate drinking in diabetic rats, suggesting that diabetic thirst involves osmoreceptor stimulation (McKinley, Burns, Oldfield, Sunigawa, & Weisinger, 2004). Intracellular dehydration of the osmoreceptor could occur in diabetic rats because the osmoreceptor's glucose transporter is saturated by high plasma glucose concentrations, and/or it is normally insulin sensitive. If so, when insulin is absent or ineffective, hypertonic glucose would be excluded from the osmoreceptor, thereby creating an osmotic gradient and cellular dehydration.
Heart Failure
When the left ventricle of the heart is damaged and fails to adequately perfuse the tissues of the body with blood, compensatory autonomic and neuroendocrine mechanisms are engaged. These include increased activity of sympathetic nerves and the renin-angiotensin-aldosterone system, and vasopressin release. Congestive heart failure is characterized by excessive fluid retention and hyponatremia, which, together with increased sympathetic- and angiotensin-mediated vasoconstriction, deleteriously increase the load on the failing heart (Kalra, Anker, & Coats, 2001; Packer, 1992). Fluid intake is maintained in the face of low plasma osmolality, high blood levels of ANP, and expanded extracellular fluid volume. Studies of thirst in such patients have not been undertaken, but it has been shown in rats with heart failure caused experimentally by coronary
artery ligation, that water intake in response to dehydration is excessive (De Smet, Menadue, Oliver, & Phillips, 2003). The authors suggested that increased sensitivity of the osmoreceptor, increased blood angiotensin II levels, or changed baroreceptor stimulation could be driving the increased thirst in these animals.
Fever
There are reports that administration of the pyrogenic agent lipopolysaccharide (a molecule derived from bacterial cell walls) stimulates drinking behavior, particularly in the early phase of fever (Szczepanska-Sadowska, Sobocinska, & Kozlowski, 1979; Wang & Evered, 1993). However, once lipopolysaccharide-induced fever has peaked and stabilized, thirst is depressed (Nava, Calapai, De Sarro, & Caputi, 1996; Szczepanska-Sadowska et al., 1979). The inhibition of thirst that occurs following pyrogen administration can be dissociated from the febrile response in that repeated treatment with pyrogen can result in tolerance to fever, but not to the antidipsogenic action of lipopolysaccharide (Nava & Carta, 2000). While a centrally administered interleukin-1 antagonist blocked pyrogen-induced fever, it did not block its antidipsogenic effect (Nava et al., 1996). Central administration of the nitric oxide synthase inhibitor L-NAME, while normally inhibitory to thirst, has been shown to reverse the inhibition of dehydration-induced thirst caused by fever in rats (Raghavendra, Agrewala, & Kulkarni, 1999), suggesting that an inhibitory influence of NO on the neural circuitry of thirst results from the administration of a pyrogen.
Syndrome of Inappropriate Antidiuretic Hormone Secretion
The syndrome of inappropriate antidiuretic hormone secretion (SIADH) is characterized by hyponatremia, normovolemia, and abnormally high plasma vasopressin concentrations relative to the low plasma osmolality. This condition is associated with various diseases such as lung cancer, pancreatic cancer, and bronchiectasis and is a side effect of some therapeutic drugs (Robertson, 1989; Smith, Moore, Tormey, Bayliss, & Thompson, 2004). While vasopressin secretion is excessive relative to plasma osmolality (approximately 270 mosmol/kg) in SIADH, the maintenance of normal fluid intake and thirst by SIADH patients is also inappropriate. Smith et al. (2004) studied thirst in eight patients with SIADH of varying etiology, and consistently observed that the osmotic threshold for thirst was reduced from 288 in normal control subjects to 270 mosmol/kg in SIADH. The intensity of thirst increased from these different thresholds similarly in both groups, and the drinking of water immediately and normally suppressed thirst in both groups. Therefore, in SIADH, the thirst mechanism operates normally around a lower set point of plasma osmolality. The cause of the alteration in thirst threshold plasma osmolality in SIADH is unknown.
Hemodialysis
Many patients with end stage renal failure undergoing hemodialysis report strong thirst, water drinking, weight gain, and extracellular fluid volume expansion between periods of dialysis. There is evidence that high plasma angiotensin II levels contribute to this thirst because it has been eliminated by bilateral nephrectomy (Rogers & Kurtzman, 1973) and reduced by treatment with an angiotensin converting enzyme inhibitor (Oldenburg, MacDonald, & Shelley, 1988). Increase of plasma sodium and urea concentrations during the interdialytic period, and the concentration of sodium in the dialysis fluid may also be dipsogenic factors in these patients (Giovannetti et al., 1994).
Schizophrenia
Compulsive water drinking is observed in 10% to 20% of schizophrenic subjects. Excessive water drinking and fluid retention can lead to lethal hyponatremia in these patients (Illowski & Kirch, 1988). Whether the compulsion to drink water is the result of altered thirst is debatable (Goldman, Robertson, Luchin, & Hedeker, 1996). However, it has been reported that thirst ratings in compulsive water drinkers in response to infusion of hypertonic saline are higher than normal, and remained elevated following drinking episodes, unlike control subjects in whom drinking quickly reduced the thirst rating (Thompson, Edwards, & Bayliss, 1991).
SUMMARY
This chapter focused on the physiological mechanisms that regulate the emotion of thirst. However, in regard to the physiological and psychological mechanisms regulating the amount of water ingested by humans and animals, thirst is but one of a number of interacting factors that determine fluid intake. In some instances, thirst may have no influence at all on the amount of water ingested. A striking and tragic example of this in recent times is the slavish adherence of some athletes to advice to drink excessive amounts of water prior to endurance sporting events. Fluid intake in the face of overhydration, and therefore lack of any thirst, has caused severe hyponatremia (low plasma sodium concentration), cerebral edema, and rapid death in some cases (Noakes et al., 2005; Verbalis, Goldsmith, Greenberg, Schrier, & Sterns, 2007). The popular belief that it is necessary to ingest eight glasses of water per day, regardless of physical activity or ambient temperature, is another example, albeit less dangerous, of water intake
based not on thirst, but on advice that appears to lack supporting empirical evidence (Valtin, 2002). In many human societies, water is ingested most often in the form of beverages. While thirst may drive the intake of beverages in certain circumstances, most often such intake is determined by reward aspects of the ingested fluid. Taste, odor, and temperature of ingested beverages can be reinforcers of beverage ingestion, as can the pharmacological properties of alcohol- and caffeine-containing drinks and the social consequences of drinking behavior (Booth, 1991). Learning also probably plays an important role in determining regulatory (homeostatic) and nonregulatory drinking behavior. Complex neural pathways involving the orbito-frontal cortex and amygdala (McDannald, Saddoris, Gallagher, & Holland, 2005; Rolls, 2000) as well as mesolimbic dopaminergic and opiate reward systems (Kelley & Berridge, 2002; Wise, 2002) may drive a large part of the fluid intake of sedentary humans. Nevertheless, a major proportion of the world’s population still resides in tropical and subtropical climates and participates in relatively intense physical activity in the course of earning a living or recreational pursuits. It is in conditions where nonregulatory intake of fluid as a beverage fails to deliver sufficient water to maintain adequate fluid balance that the emotion of thirst provides the fail-safe signal to ingest fluid and restore a water deficit. The remarkable constancy of plasma osmolality that occurs throughout life, and across many mammalian species, attests to the effectiveness of the neural, endocrine and behavioral mechanisms (Figure 35.3) that regulate body fluid homeostasis. Thirst has a pivotal role in these mechanisms.
REFERENCES
Abraham, S. F., Baker, R. M., Blaine, E. H., Denton, D. A., & McKinley, M. J. (1975). Water drinking induced in sheep by angiotensin: A physiological or pharmacological effect? Journal of Comparative and Physiological Psychology, 88, 503–518.
Abraham, S. F., Coghlan, J. P., Denton, D. A., McDougall, J. G., Mouw, D. R., & Scoggins, B. A. (1976). Increased water drinking induced by sodium depletion in sheep. Quarterly Journal of Experimental Physiology, 61, 185–192.
Adolph, E. F. (1950). Thirst and its inhibition in the stomach. American Journal of Physiology, 161, 374–386.
Allen, A. M., Oldfield, B. J., Giles, M. E., Paxinos, G., McKinley, M. J., & Mendelsohn, F. A. O. (2000). Localization of angiotensin receptors in the nervous system. In R. Quirion, A. Bjorklund, & T. Hokfelt (Eds.), Handbook of chemical neuroanatomy: Peptide Receptors, Pt. I (Vol. 16, pp. 79–124). Amsterdam: Elsevier Science.
Anderson, C. R., & Houpt, T. R. (1990). Hypertonic and hypovolemic stimulation of thirst in pigs. American Journal of Physiology, 258, R149–R154.
Andersson, B. (1953). The effect of injections of hypertonic NaCl solutions into different parts of the hypothalamus of goats. Acta Physiologica Scandinavica, 81, 188–201.
Andersson, B. (1978). Regulation of water intake. Physiological Reviews, 58, 582–683.
Andersson, B., & Larsson, S. (1961). Influence of local temperature changes in the preoptic area and rostral hypothalamus on the regulation of food and water intake. Acta Physiologica Scandinavica, 52, 75–89.
Andersson, B., & McCann, S. M. (1955). A further study of polydipsia evoked by hypothalamic stimulation. Acta Physiologica Scandinavica, 33, 333–346.
Andersson, B., Leksell, L. G., & Lishajko, F. (1975). Perturbations in fluid balance induced by medially placed forebrain lesions. Brain Research, 99, 261–275.
Antunes-Rodrigues, J., McCann, S. M., Rogers, L. C., & Sampson, W. K. (1985). Atrial natriuretic factor inhibits dehydration and angiotensin II-induced water intake in the conscious unrestrained rat. Proceedings of the National Academy of Sciences, USA, 82, 8720–8723.
Appelgren, B. H., Thrasher, T. N., Keil, L. C., & Ramsay, L. C. (1991). Mechanism of drinking-induced inhibition of vasopressin secretion in dehydrated dogs. American Journal of Physiology, 261, R1226–R1233.
Baertschi, A. J., & Vallet, P. G. (1981). Osmosensitivity of the hepatic portal vein area and vasopressin release in rats. Journal of Physiology, 215, 217–230.
Baker, M. A., & Turlejska, E. (1989). Thermal panting in dehydrated dogs: Effects of plasma volume expansion and drinking. Pflugers Archive, 413, 511–515.
Barney, C. C., & Folkerts, M. M. (1995). Thermal dehydration-induced thirst in rats: Role of body temperature. American Journal of Physiology, 269, R557–R564.
Barofsky, A. L., Grier, H. C., & Pradhan, T. K. (1980). Evidence for regulation of water intake by median raphe serotoninergic neurons. Physiology and Behavior, 24, 951–955.
Bellin, S. I., Landas, S. K., & Johnson, A. K. (1988). Selective catecholamine depletion of structures along the ventral lamina terminalis: Effects on experimentally induced drinking and pressor responses. Brain Research, 456, 9–16.
Berk, M. L., & Finkelstein, J. A. (1981). Afferent projections to the preoptic and hypothalamic regions of the rat brain. Neuroscience, 6, 1601–1624.
Black, S. L. (1976). Preoptic hypernatremic syndrome and the regulation of water balance in the rat. Physiology and Behavior, 17, 473–482. Blair-West, J. R., Burns, P., Denton, D. A., Ferraro, T., McBurnie, M. I., Tarjan, E., et al. (1994). Thirst induced by increasing brain sodium concentration is mediated by brain angiotensin. Brain Research, 637, 335–338. Blass, E. M., & Chapman, H. W. (1971). An evaluation of cholinergic mechanisms to thirst. Physiology and Behavior, 7, 679–686. Blass, E. M., & Epstein, A. N. (1971). A lateral preoptic osmosensitive zone for thirst in the rat. Journal of Comparative and Physiological Psychology, 76, 378–394. Blass, E. M., & Fitzsimons, J. T. (1970). Additivity of effect and interaction of a cellular and an extracellular stimulus of drinking. Journal of Comparative and Physiological Psychology, 70, 200–205. Blass, E. M., & Hall, W. G. (1976). Drinking termination: Interactions among hydrational, orogastric, and behavioural controls in rats. Psychological Review, 83, 356–374. Blass, E. M., & Hanson, D. G. (1970). Primary hyperdipsia in the rat following septal lesions. Journal of Comparative and Physiological Psychology, 70, 87–93.
Blass, E. M., Nussbaum, A. I., & Hanson, D. G. (1974). Septal hyperdipsia: Specific enhancement of drinking to angiotensin in rats. Journal of Comparative and Physiological Psychology, 87, 422–439. Blotner, H. (1951). Diabetes insipidus. New York: Oxford University Press. Booth, D. (1991). Influences on human fluid consumption. In D. J. Ramsay & D. A. Booth (Eds.), Thirst: Physiological and psychological aspects (pp. 53–73). London: Springer Verlag. Bott, E., Denton, D. A., & Weller, S. (1965). Water drinking in sheep with oesophageal fistulae. Journal of Physiology, 176, 323–336. Brewer, A., Langel, U., & Robinson, J. K. (2005). Intracerebroventricular administration of galanin decreases free water intake and operant water reinforcer efficacy in water-restricted rats. Neuropeptides, 39, 117–124. Brownfield, M. S., Reid, I. A., Ganten, D., & Ganong, W. F. (1982). Differential distribution of immunoreactive angiotensin and angiotensin-converting enzyme in rat brain. Neuroscience, 7, 1759–1769.
Cunningham, J. T., & Johnson, A. K. (1991). The effects of central norepinephrine infusions on drinking behaviour induced by angiotensin after 6-hydroxydopamine injections into the anteroventral region of the third ventricle (AV3V). Brain Research, 558, 112–116. Cunningham, J. T., Sullivan, M. J., Edwards, G. L., Farinpour, R., Beltz, T., & Johnson, A. K. (1991). Dissociation of experimentally induced drinking behavior by ibotenate injection into the median preoptic nucleus. Brain Research, 554, 153–158. Curtis, K. S., Huang, W., Sved, A. F., Verbalis, J. G., & Stricker, E. M. (1999). Impaired osmoregulatory responses in rats with area postrema lesions. American Journal of Physiology, 277, R209–R219. Curtis, K., & Stricker, E. M. (1997). Enhanced fluid intake by rats after capsaicin treatment. American Journal of Physiology, 272, R704–R709.
Buggy, J. W., & Johnson, A. K. (1978). Angiotensin induced thirst: Effects of third ventricular obstruction and periventricular ablation. Brain Research, 149, 117–128.
Dampney, R. A. (1994). Functional organization of central pathways regulating the cardiovascular system. Physiological Reviews, 74, 323–364.
Burrell, L. M., Lambert, H. J., & Bayliss, P. H. (1991). Effect of atrial natriuretic peptide on thirst and arginine vasopressin release in humans. American Journal of Physiology, 260, R475–R479.
Davison, J. M., Gilmore, E. A., Durr, J., Robertson, G. L., & Lindheimer, M. D. (1984). Altered osmotic thresholds for vasopressin secretion and thirst in human pregnancy. American Journal of Physiology, 247, F105–F109.
Bush, G., Luu, P., & Posner, M. I. (2000). Cognitive and emotional influences in anterior cingulate cortex. Trends in Cognitive Science, 4, 215–222. Calapai, G., & Caputi, A. P. (1996). Nitric oxide and drinking behaviour. Regulatory Peptides, 66, 117–121.
Davison, J. M., Shiells, E. A., Phillips, P. R., & Lindheimer, M. D. (1988). Serial release of vasopressin and thirst in human pregnancy: Role of human gonadotrophin in the osmoregulatory changes of gestation. Journal of Clinical Investigation, 81, 798–806.
Camargo, L. A. A., Menani, J. V., Saad, W. A., & Saad, W. A. (1984). Interaction between areas of the central nervous system in the control of water intake and arterial pressure in rats. Journal of Physiology, 350, 1–8.
de Araujo, I. E. T., Kringelbach, M. L., Rolls, E. T., & McGlone, F. (2003). Human cortical responses to water in the mouth and the effects of thirst. Journal of Neurophysiology, 90, 1865–1876.
Cannon, W. B. (1919). The physiological basis of thirst. Proceedings of the Royal Society, B, 90, 283–301.
De Caro, G. (1986). Effects of peptides of the gut-brain-skin-triangle on drinking behaviour of rats and birds. In G. de Caro, A. N. Epstein, & M. Massi (Eds.), The physiology of thirst and sodium appetite (pp. 213–226). New York: Plenum Press.
Canuso, C. M., & Goldman, M. B. (1996). Does minimizing neuroleptic dosage influence hyponatremia? Psychiatry Research, 63, 227–229. Carlson, S. H., Collister, J. P., & Osborn, J. W. (1998). The area postrema modulates hypothalamic fos responses to intragastric hypertonic saline in conscious rats. American Journal of Physiology, 275, R1921–R1927. Ciura, S., & Bourque, C. W. (2006). Transient receptor vanilloid I is required for intrinsic osmoreception in organum vasculosum lamina terminalis neurons for normal thirst responses to systemic hyperosmolality. Journal of Neuroscience, 26, 9069–9075. Clegg, D. J., Air, E. L., Benoit, S. C., Sakai, R. S., Seeley, R. J., & Woods, S. C. (2003). Intraventricular melanin-concentrating hormone stimulates water intake independently of food intake. American Journal of Physiology, 284, R494–R499. Coburn, P. C., & Stricker, E. M. (1978). Osmoregulatory thirst after lateral preoptic lesions. Journal of Comparative and Physiological Psychology, 92, 350–361. Contreras, R. J., Beckstead, R. M., & Norgren, R. (1982). The central projections of the trigeminal, facial, glossopharyngeal and vagus nerves: An autoradiographic study in the rat. Journal of the Autonomic Nervous System, 6, 303–322. Craig, A. D. (2002). How do you feel? Interoception: The sense of the physiological condition of the body. Nature Reviews Neuroscience, 3, 655–666.
Cunningham, J. T., & Johnson, A. K. (1989). Decreased norepinephrine in the ventral lamina terminalis region is associated with angiotensin II drinking response deficits following local 6-hydroxydopamine injections. Brain Research, 480, 65–71.
Denton, D. A. (1982). The hunger for salt. Berlin: Springer. Denton, D. A., McKinley, M. J., Nelson, J. F., & Weisinger, R. S. (1977). Pregnancy, lactation and hormone induced mineral appetite. In Y. Katsuki, M. Sato, S. Takagi, & Y. Oomura (Eds.), Food intake and chemical senses (pp. 247–262). Tokyo: Japan Scientific Societies Press. Denton, D. A., Shade, R., Zamarippa, F., Egan, G., Blair-West, J., McKinley, M., et al. (1999). The correlation of regional blood flow (rCBF) and plasma sodium concentration during genesis and satiation of thirst. Proceedings of the National Academy of Sciences, USA, 96, 2532–2537. De Smet, H. R., Menadue, M. F., Oliver, J. R., & Phillips, P. A. (2003). Increased thirst and vasopressin secretion after myocardial infarction in rats. American Journal of Physiology, 285, R1203–R1211. Durr, J. A., Stamotsos, B., & Lindheimer, M. D. (1981). Osmoregulation during pregnancy in the rat: Evidence of the resetting of the threshold for vasopressin secretion during gestation. Journal of Clinical Investigation, 68, 337–346. Edwards, G. L., & Johnson, A. K. (1991). Enhanced drinking after excitotoxic lesions of the parabrachial nucleus in the rat. American Journal of Physiology, 261, R1039–R1044.
Craig, A. D. (2003). A new view of pain as a homeostatic emotion. Trends in Neurosciences, 26, 303–307.
Edwards, G. L., & Ritter, R. C. (1982). Area postrema lesions increase drinking to angiotensin and extracellular dehydration. Physiology and Behavior, 29, 943–947.
Critchley, H. D., Corfield, D. R., Chandler, M. P., Mathias, C. J., & Dolan, R. J. (2000). Cerebral correlates of autonomic cardiovascular arousal: A functional neuroimaging investigation in humans. Journal of Physiology, 523, 259–270.
Egan, G., Silk, T., Zammaripa, F., Williams, J., Federico, P., Cunnington, R., et al. (2003). Neural correlates of the emergence of consciousness of thirst. Proceedings of the National Academy of Sciences, USA, 100, 15241–15246.
Ehrlich, K. J., & Fitts, D. A. (1990). Atrial natriuretic peptide in the subfornical organ reduces drinking induced by angiotensin or in response to water deprivation. Behavioral Neuroscience, 104, 365–372.
Fregly, M. J. (1980). Effect of chronic treatment with estrogen on the dipsogenic response of rats to angiotensin. Pharmacology Biochemistry and Behavior, 12, 131–136.
Eng, R., & Miselis, R. R. (1981). Polydipsia and abolition of angiotensin-induced drinking after transections of subfornical organ efferent projections in the rat. Brain Research, 225, 200–206.
Fregly, M. J., Kelleher, D. L., & Greenleaf, J. E. (1981). Antidipsogenic effect of clonidine on angiotensin II-, hypertonic saline-, pilocarpine-, and dehydration-induced water intakes. Brain Research Bulletin, 7, 661–664.
Eslinger, P. J., & Damasio, A. R. (1985). Severe disturbance of higher cognition after bilateral frontal lobe ablation: Patient EVR. Neurology, 35, 1731–1741. Ettenberg, A., & Camp, C. H. (1986). A partial reinforcement extinction effect in water-reinforced rats intermittently treated with haloperidol. Pharmacology, Biochemistry and Behavior, 25, 1231–1235. Evered, M. D. (1992). Investigating the role of angiotensin II in thirst: Interactions between arterial pressure and the control of drinking. Canadian Journal of Physiology and Pharmacology, 70, 791–797. Evered, M. D., & Mogenson, G. J. (1976). Regulatory and secondary water intake in rats with lesions of the zona incerta. American Journal of Physiology, 230, 1049–1057. Farrell, M. J., Zamarripa, F., Shade, R., Phillips, P. A., McKinley, M., Fox, P. T., et al. (in press). Effect of ageing on regional cerebral blood flow responses associated with osmotic thirst and its satiation by water drinking: A PET study. Proceedings of the National Academy of Sciences, USA.
Fregly, M. J., & Rowland, N. E. (1986). Role for α2 adrenoceptors in experimentally-induced drinking in rats. In G. de Caro, A. N. Epstein, & M. Massi (Eds.), The physiology of thirst and sodium appetite (pp. 509–519). New York: Plenum Press. Fregly, M. J., & Rowland, N. E. (1988). Augmentation of isoproterenol-induced drinking by acute treatment with certain dopaminergic agonists. Physiology and Behavior, 44, 473–481. Gehring, W. J., & Taylor, S. F. (2004). When the going gets tough, the cingulate gets going. Nature Neuroscience, 7, 1285–1287. Gibbs, J., Rolls, B. J., & Rolls, E. T. (1986). Preabsorptive and postabsorptive factors in termination of drinking in the rhesus monkey. In G. de Caro, A. N. Epstein, & M. Massi (Eds.), The physiology of thirst and sodium appetite (pp. 287–294). New York: Plenum Press. Gilman, A. (1937). The relation between blood osmotic pressure, fluid distribution and voluntary water intake. American Journal of Physiology, 120, 323–328.
Felix, D., & Schlegel, W. (1978). Angiotensin receptive neurons in the subfornical organ: Structure activity relations. Brain Research, 149, 107–116.
Giovannetti, S., Barsotti, G., Cupista, A., Morelli, E., Agostini, B., Posella, L., et al. (1994). Dipsogenic factors operating in chronic uremics on maintenance hemodialysis. Nephron, 66, 413–420.
Ferrari, A. U., Radaelli, A., & Centola, M. (2003). Aging and the cardiovascular system. Journal of Applied Physiology, 95, 2591–2597.
Goldman, M. B., Robertson, G. L., Luchin, D. J., & Hedeker, D. (1996). The influence of polydipsia on water excretion in hyponatremic, polydipsic, schizophrenic patients. Journal of Clinical Endocrinology and Metabolism, 81, 1465–1470.
Figaro, M. K., & Mack, G. W. (1997). Regulation of fluid intake in dehydrated humans: Role of oropharyngeal stimulation. American Journal of Physiology, 272, R1740–R1746. Findlay, A. L. R., Fitzsimons, J. T., & Kucharczyk, J. (1979). Dependence of spontaneous and angiotensin-induced drinking in the rat upon the oestrous cycle and ovarian hormones. Journal of Endocrinology, 82, 215–225. Fisher, A. N., & Levitt, R. A. (1967). Drinking induced by carbachol: Thirst circuit or ventricular modification? Science, 157, 839–841. Fitzsimons, J. T. (1961). Drinking by rats depleted of body fluid without increase in osmotic pressure. Journal of Physiology, 159, 297–309. Fitzsimons, J. T. (1963). The effects of slow infusions of hypertonic solutions on drinking and drinking thresholds in rats. Journal of Physiology, 167, 344–354. Fitzsimons, J. T. (1969). The role of a renal thirst factor in drinking induced by extracellular stimuli. Journal of Physiology, 201, 349–368. Fitzsimons, J. T. (1979). The physiology of thirst and sodium appetite. Cambridge: Cambridge University Press. Fitzsimons, J. T. (1998). Angiotensin, thirst and sodium appetite. Physiological Reviews, 78, 583–686. Fitzsimons, J. T., & Elfont, R. M. (1982). Angiotensin does contribute to drinking induced by caval ligation in rat. American Journal of Physiology, 243, R558–R562. Fitzsimons, J. T., & Moore-Gillon, M. J. (1980). Drinking and antidiuresis in response to reductions in venous return in the dog: Neural and endocrine mechanisms. Journal of Physiology, 308, 403–416. Fitzsimons, J. T., & Setler, P. (1975). The relative importance of central nervous catecholaminergic and cholinergic mechanisms in drinking in response to angiotensin and other thirst stimuli. Journal of Physiology, 250, 613–631. Freece, J. A., Van Bebber, J. E., Zierath, D. K., & Fitts, D. A. (2005). Subfornical organ disconnection alters Fos expression in the lamina terminalis, supraoptic nucleus, and area postrema after intragastric hypertonic NaCl. American Journal of Physiology, 288, R947–R955.
Greer, M. A. (1955). Suggestive evidence of a primary “drinking center.” Proceedings of the Society of Experimental Biology, 89, 59–62. Grob, M., Trottier, J.-F., Drolet, G., & Mouginot, D. (2003). Characterization of the neurochemical content of neuronal populations of the lamina terminalis activated by acute hydromineral challenge. Neuroscience, 122, 247–257. Grossman, S. P. (1960, July 29). Eating or drinking elicited by direct adrenergic or cholinergic stimulation of the lateral hypothalamus. Science, 132, 301–302. Grossman, S. P. (1984). A reassessment of the brain mechanisms that control thirst. Neuroscience Biobehavioral Review, 8, 95–104. Gu, G. B., & Simerly, R. B. (1997). Projections of the sexually dimorphic anteroventral periventricular nucleus in the female rat. Journal of Comparative Neurology, 384, 142–164. Gutman, M. B., Ciriello, J., & Mogenson, G. J. (1988). Effects of plasma angiotensin II and hypernatremia on subfornical organ neurons. American Journal of Physiology, 254, R746–R754. Haberich, F. J. (1971). Osmoreceptors in the portal circulation and their significance for the regulation of water balance. Triangle, 10, 123–130. Hainsworth, F. R., Stricker, E. M., & Epstein, A. N. (1968). Water metabolism of rats in the heat. American Journal of Physiology, 214, 983–989. Harvey, J. A., & Hunt, H. F. (1965). Effect of septal lesions on thirst in the rat as indicated by water consumption and operant responding for water reward. Journal of Comparative and Physiological Psychology, 59, 49–56. Hattori, Y., Kasai, M., Uesugi, S., Kawata, M., & Yamashita, H. (1988). Atrial natriuretic polypeptide depresses angiotensin II-induced excitation of neurons in the rat subfornical organ in vitro. Brain Research, 443, 355–359. Herbert, H., Moga, M. M., & Saper, C. B. (1991). Connections of the parabrachial nucleus with the nucleus of the solitary tract and medullary
reticular formation in the rat. Journal of Comparative Neurology, 293, 540–580.
Kaufman, S. (1984). Role of right atrial receptors in the control of drinking in the rat. Journal of Physiology, 349, 389–396.
Herbert, J., Forsling, M. L., Howes, S. R., Stacey, P. M., & Shiers, H. M. (1992). Regional expression of c-fos antigen in the basal forebrain following intraventricular infusions of angiotensin and its modulation by drinking either water or saline. Neuroscience, 51, 857–882.
Kawano, H., & Masuko, S. (1999). Synaptic contacts between nerve terminals originating from the ventrolateral medullary catecholaminergic area and median preoptic neurons projecting to the paraventricular hypothalamic nucleus. Brain Research, 817, 110–116.
Hochstenbach, S. L., & Ciriello, J. (1996). Effect of lesions of forebrain circumventricular organs on c-fos expression in the central nervous system to plasma hypernatremia. Brain Research, 713, 17–28.
Kelley, A. E., & Berridge, K. C. (2002). The neuroscience of natural rewards: Relevance to addictive drugs. Journal of Neuroscience, 22, 3306–3311.
Hoffman, M. L., DenBleyker, M., Smith, J. C., & Stricker, E. M. (2005). Inhibition of thirst when dehydrated rats drink water or saline. American Journal of Physiology, 290, R1199–R1207. Holmes, J. H., & Gregerson, M. I. (1950). Observations on drinking induced by hypertonic solutions. American Journal of Physiology, 162, 326–337. Honda, E., Ono, K., Toyono, T., Kawano, H., Masuko, S., & Inenaga, K. (2003). Activation of muscarinic receptors in rat subfornical organ neurones. Journal of Neuroendocrinology, 15, 770–777. Hosutt, J. A., Rowland, N., & Stricker, E. M. (1978). Impaired drinking responses of rats with lesions of the subfornical organ. Journal of Comparative and Physiological Psychology, 95, 104–113. Illowski, B. P., & Kirch, D. G. (1988). Polydipsia and hyponatremia in psychiatric patients. American Journal of Psychiatry, 145, 675–683. Itoh, H., Nakao, K., Yamada, T., Shirakami, G., Kangawa, K., Minamino, N., et al. (1988). Antidipsogenic activity of a novel peptide, “brain natriuretic peptide” in rats. European Journal of Pharmacology, 150, 193–196. James, R. J. A., Irons, D. W., Holmes, C., Drewett, R. F., & Bayliss, P. H. (1995). Thirst induced by a suckling episode during breast feeding and its relation with plasma vasopressin, oxytocin and osmoregulation. Clinical Endocrinology, 43, 277–282. Jewell, P. A., & Verney, E. B. (1957). An experimental attempt to determine the site of the neurohypophysial osmoreceptors in the dog. Philosophical Transactions of the Royal Society. Series B, Biological Sciences, 240, 197–324. Johansen, J. P., Fields, H. L., & Manning, B. H. (2001). The affective component of pain in rodents: Direct evidence for a contribution of the anterior cingulate cortex. Proceedings of the National Academy of Sciences, USA, 98, 8077–8082. Johnson, A. K., & Buggy, J. (1978). Periventricular preoptic-hypothalamus is vital for thirst and normal water economy. American Journal of Physiology, 23, R122–R129.
Kisley, L. R., Sakai, R. R., Ma, L. Y., & Fluharty, S. J. (1999). Ovarian steroid regulation of angiotensin II-induced water intake in the rat. American Journal of Physiology, 276, R90–R96. Klingbeil, C. K., Brooks, V. L., Quillen, E. W., & Reid, I. A. (1991). Effect of baroreceptor denervation on stimulation of drinking by angiotensin II in conscious dogs. American Journal of Physiology, 260, E333–E337. Kobashi, M., & Adachi, A. (1992). Effect of hepatic portal infusion of water on water intake by water-deprived rats. Physiology and Behavior, 52, 885–888. Kolaj, M., Bai, D., & Renaud, L. P. (2004). GABAB receptor modification of rapid inhibitory and excitatory neurotransmission from the subfornical organ and other afferents to median preoptic nucleus neurons. Journal of Neurophysiology, 92, 111–122. Kraly, F. S. (1991). Effects of eating on drinking. In D. J. Ramsay & D. A. Booth (Eds.), Thirst: Physiological and psychological aspects (pp. 296–312). London: Springer Verlag. Kraly, F. S., Kim, Y. M., Dunham, L. M., & Tribuzio, R. A. (1995). Drinking after intragastric NaCl without increase in systemic plasma osmolality in rats. American Journal of Physiology, 276, R1085–R1092. Kunii, K., Yamanaka, A., Nambu, T., Matsuzaki, I., Goto, K., & Sakurai, T. (1999). Orexins/hypocretins regulate drinking behaviour. Brain Research, 842, 256–261. Lehr, D., Goldman, H. W., & Casner, P. (1973, December 7). Renin angiotensin role in thirst: Paradoxical enhancement of drinking by angiotensin converting enzyme inhibitor. Science, 182, 1031–1034. Lenkei, Z., Palkovits, M., Corvol, P., & Llorens-Cortes, C. (1997). Expression of angiotensin type-1 (AT1) and type-2 (AT2) receptor mRNAs in the adult rat brain: A functional neuroanatomical review. Frontiers in Neuroendocrinology, 18, 383–439. Liedtke, W. (2007). Role of TRPV ion channels in sensory transduction of osmotic stimuli in mammals. Experimental Physiology, 92, 507–512. Lind, R. W., & Johnson, A. K. (1982). Subfornical organ-median preoptic connections and drinking and pressor responses to angiotensin II. Journal of Neuroscience, 2, 1043–1051.
Johnson, A. K., Cunningham, J. T., & Thunhorst, R. L. (1996). Integrative role of the lamina terminalis in the regulation of cardiovascular and body fluid homeostasis. Clinical and Experimental Pharmacology and Physiology, 23, 183–191.
Lind, R. W., Thunhorst, R. L., & Johnson, A. K. (1984). The subfornical organ and the integration of multiple factors in thirst. Physiology and Behavior, 32, 69–74.
Johnson, A. K., & Thunhorst, R. L. (1997). The neuroendocrinology of thirst and salt appetite: Visceral sensory signals and mechanisms of central integration. Frontiers in Neuroendocrinology, 18, 292–353.
Lynch, K. R., Hawelu-Johnson, C. L., & Guyenet, P. G. (1987). Localization of brain angiotensinogen mRNA by hybridization histochemistry. Brain Research, 388, 149–158.
Jurzak, M., Schmid, H., & Gerstberger, R. (1994). NADPH-diaphorase staining and NO-synthase immunoreactivity in circumventricular organs of the rat brain. In K. Pleschka & R. Gerstberger (Eds.), Integrative and cellular aspects of autonomic functions: Temperature and osmoregulation (pp. 451–459). Paris: John Libbey Eurotext.
Malmo, R. B., & Mundl, W. J. (1975). Osmosensitive neurons in the rat’s preoptic area: Medial-lateral comparison. Journal of Comparative and Physiological Psychology, 88, 161–175.
Kadekaro, M., & Summy-Long, J. (2000). Centrally produced nitric oxide and the regulation of body fluid and blood pressure homeostasis. Clinical and Experimental Pharmacology and Physiology, 27, 450–459. Kalra, P. R., Anker, S. D., & Coats, A. J. S. (2001). Water and sodium regulation in chronic heart failure: The role of natriuretic peptides and vasopressin. Cardiovascular Research, 51, 495–509.
Kenney, W. L., & Chiu, P. (2001). Influence of age on thirst and fluid intake. Medical Science Sports Exercise, 33, 1524–1532.
Mangiapane, M. L., & Simpson, J. B. (1983). Drinking and pressor responses after acetylcholine injection into the subfornical organ. American Journal of Physiology, 24, R508–R513. Mangiapane, M. L., Thrasher, T. N., Keil, L. C., Simpson, J. B., & Ganong, F. (1983). Deficits in drinking and vasopressin secretion after lesions of the nucleus medianus. Neuroendocrinology, 37, 73–77. Mathai, M. L., Arnold, I., Febbraio, M. A., & McKinley, M. J. (2004). Central blockade of nitric oxide synthesis induces hyperthermia that is prevented by indomethacin in rats. Journal of Thermal Biology, 29, 401–405.
Mayberg, H. S., Liotti, M., Brannan, S. K., McGinnis, S., Mahurin, R. K., Jerabik, P. A., et al. (1999). Reciprocal limbic-cortical function and negative mood: Converging PET signals in depression and normal sadness. American Journal of Psychiatry, 156, 675–682. Mayer, A. (1900). Variations de la tension osmotique du sang chez les animaux prives de liquids. Comptes Rendus de la Société Biologie (Paris), 52, 153–155. McAllen, R. M., Pennington, G. L., & McKinley, M. J. (1990). Osmoresponsive units in sheep median preoptic nucleus. American Journal of Physiology, 259, R593–R600. McDannald, M. A., Saddoris, M. P., Gallagher, M., & Holland, P. C. (2005). Lesions of orbitofrontal cortex impair rats’ differential outcome expectancy learning but not conditioned stimulus-potentiated feeding. Journal of Neuroscience, 25, 4628–4632. McKinley, M. J., Alexiou, T., Boon, W. M., Campbell, D. J., Denton, D. A., Dinicolantonio, R., et al. (2006). Osmoregulatory thirst in mice lacking angiotensin. Proceedings of the Australian Neuroscience Society, 17, 45. McKinley, M. J., Allen, A. M., Clevers, J., Paxinos, G., & Mendelsohn, F. A. O. (1987). Angiotensin receptor binding in human hypothalamus: Autoradiographic localization. Brain Research, 420, 375–379. McKinley, M. J., Badoer, E., & Oldfield, B. J. (1992). Intravenous angiotensin II induces Fos-immunoreactivity in circumventricular organs of the lamina terminalis. Brain Research, 594, 295–300.
Michell, A. R. (1979). Water and electrolyte excretion during the oestrous cycle in sheep. Quarterly Journal of Experimental Physiology, 64, 79–88. Miselis, R. R. (1981). The efferent projections of the subfornical organ of the rat: A circumventricular organ within a neural network subserving water balance. Brain Research, 230, 1–23. Miselis, R. R., Weiss, M., & Shapiro, R. F. (1987). Modulation of the visceral neuraxis. In P. M. Gross (Ed.), Circumventricular organs and body fluids (Vol. III, pp. 143–162). Boca Raton: CRC Press. Montani, J.-P., & Van Vliet, B. N. (2004). General physiology and pathophysiology of the renin-angiotensin system. In T. Unger & B. A. Scholkens (Eds.), Handbook of experimental pharmacology (Vol. 163, Angiotensin Vol. I, pp. 3–30). Berlin: Springer. Murphy, T., & Samson, W. K. (1995). The novel vasoactive hormone, adrenomedullin, inhibits water drinking in the rat. Endocrinology, 136, 2459–2463. Nava, F., Calapai, G., De Sarro, A., & Caputi, A. P. (1996). Interleukin-1 receptor antagonist does not reverse lipopolysaccharide-induced inhibition of water intake in rat. European Journal of Pharmacology, 309, 223–227. Nava, F., & Carta, G. (2000). Repeated lipopolysaccharide administration produces tolerance to anorexia and fever but not to inhibition of thirst in rat. International Journal of Immunopharmacology, 22, 943–953.
McKinley, M. J., Blaine, E. H., & Denton, D. A. (1974). Brain osmoreceptors, cerebrospinal fluid electrolyte composition and thirst. Brain Research, 70, 532–537.
Nissen, R., & Renaud, L. P. (1994). GABA receptor mediation of median preoptic nucleus-evoked inhibition of supraoptic neurosecretory neurones in rat. Journal of Physiology, 479, 207–216.
McKinley, M. J., Burns, P., Colvill, L. M., Oldfield, B. J., Wade, J. D., Weisinger, R. S., et al. (1997). Distribution of Fos immunoreactivity in the lamina terminalis and hypothalamus induced by centrally administered relaxin in conscious rats. Journal of Neuroendocrinology, 9, 431–438.
Noakes, T. D. (2003). Overconsumption of fluids by athletes. British Medical Journal, 327, 113–114.
McKinley, M. J., Burns, P., Oldfield, B. J., Sunigawa, K., & Weisinger, R. S. (2004). Diabetic thirst: Osmoreceptor stimulation by hyperglycemia in streptozotocin-induced diabetic rats. Appetite, 42, 384. McKinley, M. J., Denton, D. A., Coghlan, J. P., Harvey, R. B., McDougall, J. G., Rundgren, M., et al. (1987). Cerebral osmoregulation of renal sodium excretion: A response analogous to thirst and vasopressin release. Canadian Journal of Physiology and Pharmacology, 65, 1724–1729. McKinley, M. J., Denton, D. A., Leksell, L. G., Mouw, D. R., Scoggins, B. A., Smith, M. H., et al. (1982). Osmoregulatory thirst in sheep is disrupted by ablation of the anterior wall of the optic recess. Brain Research, 236, 210–215. McKinley, M. J., Denton, D. A., Thomas, C., Woods, R., & Mathai, M. L. (2006). Differential effects of aging on fluid intake in response to hypovolemia, hypertonicity and hormonal stimuli in Munich Wistar rats. Proceedings of the National Academy of Sciences, USA, 103, 3450–3455.
Nothnagel, H. (1881). Durst und polydipsia. Archiv fur pathologische Anatomie und Physiologie, 86, 435–437. Ohman, L. E., & Johnson, A. K. (1989). Brain stem mechanisms and the inhibition of angiotensin-induced drinking. American Journal of Physiology, 256, R264–R269. Oldenburg, B., MacDonald, G. J., & Shelley, S. (1988). Controlled trial of enalapril in patients with chronic fluid overload undergoing dialysis. British Medical Journal, 296, 1089–1091. Oldendorf, W. H. (1971). Brain uptake of radiolabeled amino acids, amines, and hexoses after arterial injection. American Journal of Physiology, 221, 1629–1639. Oldfield, B. J., Bicknell, R. J., McAllen, R. M., Weisinger, R. S., & McKinley, M. J. (1991). Intravenous hypertonic saline induces Fos immunoreactivity in neurons throughout the lamina terminalis. Brain Research, 561, 151–156. Olivares, E. L., Costa-e-Sousa, R. H., & Cavalcante-Lima, H. R. (2003). Effect of electrolytic lesions of the dorsal raphe nucleus on water intake and sodium appetite. Brazilian Journal of Medical Biological Research, 36, 1709–1716.
McKinley, M. J., Denton, D. A., & Weisinger, R. S. (1978). Sensors for thirst and antidiuresis: Osmoreceptors or CSF sodium detectors. Brain Research, 141, 89–103.
Olsson, K. (1972). Dipsogenic effects of intracarotid infusions of various hyperosmolar solutions. Acta Physiologica Scandinavica, 85, 517–522.
McKinley, M. J., Hards, D. K., & Oldfield, B. J. (1994). Identification of neural pathways activated in dehydrated rats by means of Fos-immunohistochemistry and neural tracing. Brain Research, 653, 305–314.
O’Neill, T. P., & Brody, M. J. (1987). Role for the median preoptic nucleus in centrally evoked pressor responses. American Journal of Physiology, 252, R1165–R1172.
McKinley, M. J., Mathai, M. L., Pennington, G. L., Rundgren, M., & Vivas, L. (1999). The effect of individual or combined ablation of the nuclear groups of the lamina terminalis on water drinking in sheep. American Journal of Physiology, 276, R673–R683.
Osheroff, P. L., & Phillips, H. S. (1991). Autoradiographic localization of relaxin binding sites in rat brain. Proceedings of the National Academy of Sciences, USA, 88, 6413–6417.
Menani, J. V., Columbari, D. S. A., Beltz, T. G., Thunhorst, R. L., & Johnson, A. K. (1998). Salt appetite: Interaction of forebrain angiotensinergic and hindbrain serotonergic mechanisms. Brain Research, 801, 29–35. Menani, J. V., & Johnson, A. K. (1995). Lateral parabrachial serotonergic mechanisms: Angiotensin-induced pressor and drinking responses. American Journal of Physiology, 269, R1044–R1049.
c35.indd Sec7:706
Packer, M. (1992). The neurohumoral hypothesis: A theory to explain the mechanism of disease progression in heart failure. Journal of the American College of Cardiology, 20, 248–252. Parry, L. J., Poterski, R. S., & Summerlee, A. J. (1994). Effects of relaxin on blood pressure and the release of vasopressin and oxytocin in anesthetised rats during pregnancy and lactation. Biology of Reproduction, 50, 622–628.
8/17/09 3:06:17 PM
References 707 Peck, J. W., & Blass, E. M. (1975). Localization of thirst and antidiuretic osmoreceptors by intracranial injections in rats. American Journal of Physiology, 228, 1501–1509.
Samson, W. K., Skala, K., Huang, F. L. S., Gluntz, S., Alexander, B., & Gomez-Sanchez, C. E. (1991). Central nervous system of endothelin-3 to inhibit water drinking in the rat. Brain Research, 539, 347–351.
Peck, J. W., & Novin, D. (1971). Evidence that osmoreceptors mediating drinking are in the lateral preoptic area. Journal of Comparative and Physiological Psychology, 74, 134–147.
Samson, W. K., White, M. M., Price, C., & Ferguson, A. V. (2007). Obestatin acts in the brain to inhibit thirst. American Journal of Physiology, 292, R637–R643.
Penfield, W., & Faulk, M. E. (1955). The insula: Further observations on its function. Brain, 78, 445–470.
Saper, C. B. (2002). The central autonomic nervous system: Conscious visceral perception and autonomic pattern generation. Annual Review of Neuroscience, 25, 433–469.
Penfield, W., & Jasper, H. (1954). Epilepsy and the functional anatomy of the human brain. Boston: Little, Brown. Penfield, W., & Rasmussen, T. (1950). The cerebral cortex of man. New York: Macmillan. Phillips, P. A., Rolls, B. J., Ledingham, J. G. G., Forsling, M. L., Morton, J. J., Crowe, M. J., et al. (1984). Reduced thirst after water deprivation in healthy elderly men. New England Journal of Medicine, 311, 753–759. Phillips, P. A., Rolls, B. J., Ledingham, J. G. G., Morton, J. J., & Forsling, M. L. (1985). Angiotensin-induced thirst and vasopressin release in man. Clinical Science, 68, 669–674. Quillen, E.W., Keil, L. C., & Reid, I. A. (1990). Effects of baroreceptor denervation on endocrine and drinking responses to caval constriction in dogs. American Journal of Physiology, 259, R618–R626. Rabe, E. F. (1975). Relationship between absolute body deficits and fluid intake in the rat. Journal of Comparative and Physiological Psychology, 89, 468–477. Raghavendra, V., Agrewala, J. N., & Kulkarni, S. K. (1999). Role of centrally administered melatonin and inhibitors of COX, and NOS in LPS-induced hyperthermia and adipsia. Prostaglandins Leukotrienes Essential Fatty Acids, 60, 249–253. Ramsay, D. J., Rolls, B. J., & Wood, R. J. (1977). Body fluid changes which influence drinking in the water deprived rat. Journal of Physiology, 266, 453–469. Rauch, M., Schmid, H., DeVente, J., & Simon, E. (1997). Electrophysiological and immunocytochemical evidence for a cGMP-mediated inhibition of subfornical organ neurons by nitric oxide. Journal of Neuroscience, 17, 363–371. Ricardo, J. A. (1981). Efferent connections of the subthalamic region in the rat: Pt. II. The zona incerta. Brain Research, 214, 43–60. Richter, C. P., & Barelare, B. (1938). Nutritional requirements of pregnant and lactating rats studied by self-selection method. Endocrinology, 23, 15–24. Riediger, T., Rauch, M., & Schmid, H. A. (1999). 
Actions of amylin on subfornical organ neurons and on drinking behavior in rats. American Journal of Physiology, 276, R514–R521. Robertson, G. L. (1989). Syndrome of inappropriate antidiuresis. New England Journal of Medicine, 321, 538–539. Robinson, B. W., & Mishkin, M. (1966). Alimentary responses to forebrain stimulation in monkeys. Experimental Brain Research, 4, 330–366. Rogers, P. W., & Kurtzman, N. A. (1973). Renal failure, uncontrollable thirst, and hyperreninaemia: Cessation of thirst with bilateral nephrectomy. Journal of the American Medical Association, 225, 1236–1238. Rolls, E. T. (2000). The orbitofrontal cortex and rewaed. Cerebral Cortex, 10, 284–294.
c35.indd Sec7:707
Saper, C. B., & Levisohn, D. (1983). Afferent connections of the median preoptic nucleus in the rat: Anatomical evidence for a cardiovascular integrative mechanism in the anteroventral third ventricular (AV3V) region. Brain Research, 288, 21–31. Schreihofer, A. M., Anderson, B. K., Schiltz, J. C., Xu, L., Sved, A. F., & Stricker, E. M. (1999). Thirst and salt appetite elicited by hypovolemia in rats with chronic lesions of the nucleus of the solitary tract. American Journal of Physiology, 276, R251–R258. Setler, P. (1973). The role of catecholamines in thirst. In A. N. Epstein, H. R. Kissileff, & E. Stellar (Eds.), The neuropsychology of thirst: New findings and advances in concepts (pp. 279–291). Washington, DC: Winston. Simpson, J. B., Epstein, A. N., & Camardo, J. S. (1978). Localization of receptors for the dipsogenic action of angiotensin II in the subfornical organ of the rat. Journal of Comparative and Physiological Psychology, 92, 581–608. Simpson, J. B., & Routtenberg, A. (1973). Subfornical organ: Site of dipsogenic action of angiotensin II. Science, 181, 1172–1175. Sinnayah, P., Burns, P., Wade, J., Weisinger, R. S., & McKinley, M. J. (1999). Water drinking in rats resulting from intravenous relaxin and its modification by other dipsogenic factors. Endocrinology, 140, 5082–5086. Slavin, A., & McKinley, M. J. (1989). [The physiology and anatomy of the area postrema of the sheep]Unpublished observations. Smardencas, A, & McKinley, M. J. (1994). unpublished observations Smith, D., Moore, K., Tormey, W., Bayliss, P. H., & Thompson, C. J. (2004). Downward resetting of the osmotic threshold for thirst in patients with SIADH. American Journal of Physiology, 287, E1019–E1023. Smith, G. P., & Jerome, C. (1983). Effects on total and selective vagotomies on water intake in rats. Journal of the Autonomic Nervous System, 9, 259–271. Somponpun, S. J., Johnson, A. K., Beltz, T., & Sladek, C. D. (2004). 
Estrogen receptor-alpha expression in osmosensitive elements of the lamina terminalis: Regulation by hypertonicity. American Journal of Physiology, 287, R661–R669. Speth, R. C., Smith, M. S., & Grove, K. L. (2002). Brain angiotensinergic mediation of enhanced water consumption in lactating rats. American Journal of Physiology, 282, R695–R701. Stratford, T. R., & Wirtshafter, D. (2000). Forebrain lesions differentially affect drinking eleicited by dipsogenic challenges and injections of muscimol into the median raphe nucleus. Behavioral Neuroscience, 114, 760–771. Stricker, E. M. (1966). Extracellular fluid volume and thirst. American Journal of Physiology, 211, 232–238.
Rosas-Arellano, M. P., Solano-Flores, L. P., & Ciriello, J. (1999). Co-localization of estrogen and angiotensin receptors within subfornical organ neurons. Brain Research, 837, 254–262.
Stricker, E. M., Callahan, J. B., Huang, W., & Sved, A. F. (2002). Early osmoregulatory stimulation of neurohypohysial hormone secretion and thirst after gastric NaCl loads. American Journal of Physiology, 282, R1710–R1717.
Rundgren, M., & Fyhrquist, F. (1978). A study of permanent adipsia induced by medial forebrain lesions. Acta Physiologica Scandinavica, 103, 463–467.
Stricker, E. M., Swerdloff, A. F., & Zigmond, M. J. (1978). Intrahypothalamic injections of kainic acid produce feeding and drinking deficits in rat. Brain Research, 158, 470–473.
Sagar, S. M., Sharp, F. R., & Curran, T. (1988, June 3). Expression of c-fos protein in brain: Metabolic mapping at the cellular level. Science, 240, 1328–1331.
Sullivan, M. J., Cunningham, J. T., Mazzella, D., Allen, A. M., Nissen, R., & Renaud, L. P. (2003). Lesions of the diagonal band of Broca enhance drinking in the rat. Journal of Neuroendocrinology, 15, 907–915.
8/17/09 3:06:17 PM
708 Thirst Summy-Long, J. Y., Keil, L. C., Deen, K., Rosella, L., & Severs, W. B. (1981). Endogenous opioid peptide inhibition of the central action of angiotensin. Journal of Pharmacology and Experimental Therapeutics, 217, 619–629. Sunn, N., Egli, M., Burazin, T., Burns, P., Colvill, L., Davern, P., et al. (2002). Circulating relaxin acts on the subfornical organ to stimulate water drinking in the rat. Proceedings of the National Academy of Sciences, USA, 99, 1701–1706.
induced drinking and vasopressin secretion in the dog. Endocrinology, 110, 1837–1839. Thrasher, T. N., Nistal-Herrera, J. F., Keil, L. C., & Ramsay, D. J. (1981). Satiety and inhibition of vasopressin secretion after drinking in dehydrated dogs. American Journal of Physiology, 240, E394–E401. Thrasher, T. N., Simpson, J. B., & Ramsay, D. J. (1982). Lesions of the subfornical organ block angiotensin-induced drinking in the dog. Neuroendocrinology, 35, 68–72.
Swanson, L. W., & Sharpe, L. G. (1973). Centrally induced drinking: Comparison of angiotensin II- and carbachol-sensitive sites in rats. American Journal of Physiology, 225, 566–573.
Thunhorst, R. L., Fitts, D. A., & Simpson, J. B. (1989). Angiotensin-converting enzyme in subfornical organ mediates captopril- induced drinking. Behavioral Neuroscience, 103, 1302–1310.
Szczepanska-Sadowska, E., Sobocinska, J., & Kozlowski, S. (1979). Thirst and renal excretion of water and electrolytes during pyrogen fever in dogs. Archives of the International Physiologie Biochimie, 87, 673–686.
Thunhorst, R. L., & Johnson, A. K. (2003). Thirst and salt appetite responses in young and old brown Norway rats. American Journal of Physiology, 284, R417–R327.
Szczepanska-Sadowska, E., Sobocinska, J., & Sadowski, S. (1982). Central dipsogenic effect of vasopressin. American Journal of Physiology, 242, R372–R379. Taheri, S., Murphy, K., Cohen, M., Sujkovic, E., Kennedt, A., Dhillo, W., et al. (2002). The effects of centrally administered apelin-13 on food intake, water intake and pituitary hormone release in rats. Biochemical Biophysical Research Communication, 291, 1208–1212. Takamata, A., Mack, G. W., Gillen, C. M., Jozsi, A. C., & Nadel, E. R. (1995). Osmoregulatory modulation of thermal sweating in humans: Reflex effects of drinking. American Journal of Physiology, 268, R414–R442. Takamata, A., Mack, G. W., Stachenfeld, N. S., & Nadel, E. R. (1995). Body temperature modification of osmotically induced vasopressin secretion and thirst in humans. American Journal of Physiology, 269, R874–R880. Tanaka, J., Hayashi, Y., Shimamune, S., Hori, K., & Nomura, H. (1997). Subfornical efferents enhance extracellular noradrenaline concentration in the median preoptic nucleus area of rats. Neuroscience Letters, 230, 171–174. Tanaka, J., Miyakubo, H., Fujisawa, S., & Nomura, M. (2003). Reduced dipsogenic response to angiotensin II activation of subfornical organ projections to the median preoptic nucleus in estrogen-treated rats. Experimental Neurology, 179, 83–89. Tanaka, J., Miyakubo, H., Okamura, T., Sakamaki, K., & Hayashi, Y. (2001). Estrogen decreases the responsiveness of subfornical organ neurons projecting to the hypothalamic paraventricular nucleus to angiotensin II in female rats. Neuroscience Letters, 307, 155–158.
Towbin, E. J. (1949). Gastrc distention as a factor in the satiation of thirstin esophagostomized dogs. American Journal of Physiology, 159, 533–541. Travis, K. A., & Johnson, A. K. (1993). In vitro sensitivity of median preoptic neurons to angiotensin II, osmotic pressure, and temperature. American Journal of Physiology, 264, R1200–R1205. Ungerstedt, U. (1971). Adipsia and aphagia after 6-hydroxydopamine induced degeneration of the nigro-striatal dopamine system. Acta Physiologica Scandinavica, 367, 95–122. Valtin, H. (2002). “Drink at least eight glasses of water a day.” Really? Is there scientific evidence for “8 ⫻ 8”? American Journal of Physiology, 283, R993–R1004. van Gaalen, M. M., Stenzel-Poore, M. P., Holsboer, F., & Steckler, T. (2002). Effects of transgenic overproduction of CRH on anxiety-like behaviour. European Journal of Neuroscience, 15, 2007–2015. Verbalis, J., Goldsmith, S. R., Greenberg, A., Schrier, R. W., & Sterns, R. H. (2007). Hyponatremia treatment guidelines: Expert panel recommendations. American Journal of Medicine, 120, S1–S21. Verney, E. B. (1947). The antidiuretic hormone and factors which determine its release. Proceedings of the Royal Society, B, 135, 25–106. Vivas, L., Chiaraviglio, E., & Carrer, H. F. (1990). Rat organum vasculosum laminae terminalis in vitro: Responses to changes in sodium concentration. Brain Research, 519, 294–300. Wang, K., & Evered, M. D. (1993). Endotoxin stimulates drinking in rat without changing dehydrational signals controlling thirst. American Journal of Physiology, 265, R1043–R1051. Wang, T., & Edwards, G. L. (1997). Differential effects of dorsomedial medulla lesion size on ingestive behavior in rats. American Journal of Physiology, 273, R1299–R1308.
Teitelbaum, P., & Epstein, A. N. (1962). The lateral hypothalamic syndrome: Recovery of feeding and drinking after lateral hypothalamic lesions. Psychological Review, 69, 74–90.
Weisinger, R. S., Burns, P., Eddie, L. W., & Wintour, E. M. (1993). Relaxin alters the plasma osmolality-arginine vasopressin relationship in the rat. Journal of Endocrinology, 137, 505–510.
Thompson, C. J., & Bayliss, P. H. (1987). Thirst in diabetes insipidus: Clinical relevance of quantitative assessment. Quarterly Journal of Medicine, 65, 853–862.
Wilson, W. L., Roques, B. P., Llorens-Cortes, C., Speth, R. C., Harding, J. W., & Wright, J. W. (2005). Roles of brain angiotensins II and III in thirst and sodium appetite. Brain Research, 1060, 108–117.
Thompson, C. J., Edwards, C. R., & Bayliss, P. H. (1991). Osmotic and non-osmotic regulation of thirst and vasopressin secretion in patients with compulsive water drinking. Clinical Endocrinology, 35, 221–228.
Winn, P., Tarbuck, A., & Dunnett, S. B. (1984). Ibotenic acid lesions of the lateral hypothalamus: Comparison with the electrolytic syndrome. Neuroscience, 12, 225–240.
Thornton, S. N., & Fitzsimons, J. T. (1995). The effects of centrally administered porcine relaxin on drinking behaviour in male and female rats. Journal of Neuroendocrinology, 7, 165–170.
Wise, R. A. (2002). Brain reward circuitry: Insights from unsensed incentives. Neuron, 36, 229–240.
Thrasher, T. N., Keenan, C. R., & Ramsay, D. (1999). Cardiovascular afferent signals and drinking in response to hypotension in dogs. American Journal of Physiology, 277, R795–R801. Thrasher, T. N., Keil, L. C., & Ramsay, D. J. (1982a). Hemodynamic, hormonal, and drinking responses to reduced venous return in the dog. American Journal of Physiology, 243, R354–R362. Thrasher, T. N., Keil, L. C., & Ramsay, D. (1982b). Lesions of the organum vasculosum of the lamina terminalis (OVLT) attenuate osmotically-
c35.indd Sec7:708
Wislocki, G. B., & Leduc, E. H. (1952). Vital staining of the hematoencephalic barrier by silver nitrate and trypan blue, and cytological comparisons of the neurohypophysis, pineal body, area postrema, intercolumnar tubercle and supraoptic crest. Journal of Comparative Neurology, 96, 371–414. Wolf, A. V. (1950). Osmometric analysis of thirst in man and dog. American Journal of Physiology, 161, 75–86. Wolf, A. V. (1958). Thirst: Physiology of the urge to drink and problems of water lack. Springfield, IL: Charles C Thomas.
8/17/09 3:06:17 PM
References 709 Wood, R. J., Maddison, S., Rolls, E. T., Rolls, B. J., & Gibbs, J. (1980). Drinking in rhesus monkeys: Roles of presystemic and systemic factors in control of drinking. Journal of Comparative and Physiological Psychology, 94, 1135–1148.
Xu, Z., Lane, J. M., Zhu, B., & Herbert, J. (1997). Dizocilpine maleate an N-methyl-D-aspartate antagonist inhibits dipsogenic reponses and C-Fos expression induced by intracerebroventricular infusion of angiotensin Pt. II. Neuroscience, 78, 203–214.
Wookey, P. J., Cao, Z. & Cooper, M.E. (1998). Interaction of the renal amylin and renin-angiotensin system in animal models of diabetes and hypertension. Mineral and Electrolyte Metabolism, 24, 389–399.
Zabik, J. E., Sprague, J. E., & Odio, M. (1993). Interactive dopaminergic and noradrenergic systems in the regulation of thirst in the rat. Physiology and Behavior, 54, 29–33.
Wright, J. W., Morseth, S. L., Abhold, R. H., & Harding, J. W. (1985). Pressor action and dipsogenicity induced by angiotensin: Pts. II & III in rats. American Journal of Physiology, 249, R514–R521.
Zardetto-Smith, A. M., & Johnson, A. K. (1995). Chemical topography of efferent projections from the median preoptic nucleus to pontine monoaminergic groups in the rat. Neuroscience Letters, 199, 215–219.
Xu, D. S. H., Honda, E., Ono, K., & Inenaga, K. (2001). Muscarinic modulation of GABAergic transmission to neurons in the rat subfornical organ. American Journal of Physiology, 280, R1657–R1664. Xu, J., Pekarek, E., Ge, J., & Yao, J. (2001). Functional relationship between subfornical organ cholinergic stimulation and cellular activation in the hypothalamus and AV3V region. Brain Research, 922, 191–200. Xu, Z., & Herbert, J. (1998). Effects of intracerebroventricular dizocilpine (MK801) on dehydration-induced dipsogenic responses, plasma vasopressin and c-Fos expression in the rat forebrain. Brain Research, 784, 91–99. Xu, Z., & Johnson, A. K. (1998). Non-NMDA receptor antagonist-induced drinking in rat. Brain Research, 808, 124–127.
c35.indd Sec7:709
Zerbe, R. L., & Robertson, G. L. (1983). Osmoregulation of thirst and vasopressin secretion in human subjects: Effect of various solutes. American Journal of Physiology, 244, E607–E614. Zhao, S., Malmgren, C. H., Shanks, R. D., & Sherwood, O. D. (1985). Monclonal antibodies specific for rat relaxin: Pt. VII. Passive immunization with monoclonal antibodies throughout the second half of pregnancy reduces water consumption in rats. Endocrinology, 136, 1892–1897. Zimmerman, M. B., Blaine, E. H., & Stricker, E. M. (1981, January 30). Water intake in hypovolemic sheep: Effects of crushing the left atrial appendage. Science, 211, 489–491.
8/17/09 3:06:18 PM
Chapter 36
Central Theories of Motivation and Emotion

NEIL McNAUGHTON AND PHILIP J. CORR
The concept of emotion has aroused extreme theoretical positions: from Skinner's (1953) denouncement of it as a muddle-minded causal fiction to the view that it is fundamental to the whole of psychology (Panksepp, 1998). Although it is more than 120 years since William James (1884) asked, “What is an emotion?” the question proved so difficult to answer that for a long period the word emotion virtually disappeared from psychology textbooks and even from more specialized books on learning or cognition. For those with a strongly behaviorist perspective, there might seem to be no reason to regret this; nor, indeed, to concern yourself with theories, central or otherwise, of emotion and motivation. For those focusing on cognitive processes also, motivation and emotion may seem peripheral. However, we believe that behavioral observations can best be integrated, and cognitive processes best understood, if we see behavior as the result of activation of one or more of a set of distinct hierarchically organized systems in the brain, where each system has evolved under pressure from a different specific class of adaptive requirements. Critically, we believe we can identify the resultant emotion, and associated motivation, with the general adaptive function that defines a class of behaviors even when the specific behaviors produced differ across occasions or species. By this route, we can achieve theoretical integration along the phylogenetic scale. The emotion systems controlling such behaviors, and their interaction with cognitive processes, such as working memory, have now become the subject of intense and detailed study (LeDoux, 1993).

These adaptation-specific (emotional) systems are also connected with two general systems that control approach and avoidance motivations, respectively, as well as a third system that resolves conflicts between these motivations. In this context, motivation is an ambiguous term. A motivation (e.g., thirst) is specific and distinct from other motivations (e.g., hunger). But the specificity is most obvious in terms of elicited behavior and, when we talk about motivation rather than emotion, we are most often thinking of it in terms of general approach and avoidance tendencies, or positive and negative reinforcement, ignoring the specific nature of the reinforcer. Further, it is variation in the sensitivities of the systems that control positive and negative affect generally that appears to make the greatest contribution to human personality and to the risk of psychopathology, areas of human psychology where we clearly see the importance, or at least the prominence, of emotion and motivation.

In this chapter, we present emotion as a cluster of reactions, including motivation, that are linked to specific classes of affordances (the aspects of an object or situation that make certain actions available) of stimuli in the world, where both the nature of the external stimulus and the animal’s internal state combine to determine the precise affordance at any particular point in time. In the process, it will be necessary to consider neural plasticity resulting from:

• Simple association: Where no specialized reinforcer is required to generate plasticity and where behavior undergoes relatively little modification but engages in stimulus substitution;
• Stimulus-reinforcer pairing: Where the result will often be observationally classical conditioning, but where response to the conditional stimulus may not be the same as those to the unconditional stimulus, and where the result can also be observationally instrumental conditioning; and
• Stimulus-response-reinforcer pairings: Where the result will be observationally instrumental conditioning.

Particularly in this latter case, learning itself is initially associated with strong emotional reactions but well-learned responding need not be. Thus, there is a strong link between emotion and motivation (with the latter apparently embedded in the former). But, emotional reactions have many semi- or actually independent parts and so, at the limit, all that may apparently be left is a motivation. The relation between motivation and emotion, as linguistic terms, may be murky but, as we shall see, the phenomenology, and the use of the terms, can be anchored through central (neurally based) theories.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
VALUE OF CENTRAL (NEURALLY BASED) THEORIES

Recently, rather than being the topic that cannot be named, emotion (often without any definition of the term) has become a focus of study of a wide variety of phenomena in behavioral neuroscience. But there is still no consensus as to what an emotion is (and, as we shall see, the term may not refer to any single coherent internal entity). Motivation is also not clearly defined. The root of both words implies that the construct being referred to is something that produces movement, and yet most psychologists contrast emotion with motivation. Despite this, it is difficult to think of motivationally significant stimuli that are not characterized by the capacity to elicit emotion. In this chapter, we hope to show that a neuroscientific approach can clarify the nature of emotion, motivation, drive, and related constructs in ways that, if not impossible for a purely behaviorist approach, are at least very difficult if all that is measured is behavior.

The focus of this chapter on central theories of motivation and emotion is to a large extent predicated on taking a neuroscientific approach. Behaviorally based theories of, for example, a central motivational state have been proposed in the past (Bindra, 1969). However, the dissection of the parts of which emotional and motivational reactions are composed and the linking of those parts into coherent, predictive theory is very difficult with purely behavioral methodologies. By contrast, a neuroscientist can, often literally, dissect classes of behavior and their control systems. They can also do so without first defining, or even proving the existence of, the higher order entity that they are dissecting. If a particular drug or brain lesion changes one set of behaviors, but not another, then clearly these sets represent different functional classes.
That said, proper behavioral analysis will also then be required to determine the functional nature of the classes that have been so separated. Neurally grounded theories of emotion and motivation have the key advantage, then, that they are anchored in specific anatomically identifiable systems. Their accounts do not depend on the superficial characters of behaviors and, indeed, can treat superficially quite different behaviors in different species as homologous. Neural homology and evolutionary (functional) homology, therefore, go hand in hand. When one is discerned, the other can usually be discovered, and vice versa. Evolutionary (and thus psychological) function becomes, then, something that must be extracted from the nature of known neural systems. With this approach, the definition of a psychological construct should map to a specific aspect of a coherent neural and functional system. In some cases, achieving this mapping
requires elimination of an older psychological term and creation of a new one.

However, neural analysis cannot proceed by itself. While it can anchor and dissect constructs derived from the experimental analysis of behavior and from ethological analysis, the brain is so complex that, without preliminary behavioral analysis, functional systems cannot easily be identified. Neural analysis of circuits that show lateral inhibition, for example, allows explanation of a wide range of sensory illusions, including those where the presence of lateral inhibition in the relevant circuits is inferred rather than directly measured. However, one could not have easily predicted any of these illusions (or any aspect of our experience of “normal” perception) from the simple observation of lateral inhibition at the neural level.

So, central theories of emotion and motivation are the result of continuous interaction between behavioral and neuroscientific approaches. The neuroscientist provides anchors and mechanisms for genuine central (nervous system) theories of motivation and emotion; but, when these theories are properly developed, they are also central theories from a more psychological perspective. The patterns of activity in their higher order neural elements are central cognitive and emotional states. The behavioral neuroscientist, then, can integrate behavioral observations in terms of higher order internal states (something that all but the most radical behaviorist would see as desirable), but does so in terms of direct measures of those central states and so avoids the problems (which drove the development of the radical behaviorist philosophy) of inferring specific complex central states solely from patterns of behavior or, worse, introspection.

Perhaps the most important feature of central theories of motivation and emotion for higher-level psychological analysis is one that is usually implicit rather than explicit at the level of neuroscientific analysis.
Central motivational-emotional states need to be viewed at the neural level as complex compounds. This is true in two senses. On the one hand, they are complexes of emotional reactions and motivation: Initial elicitation of emotional reactions also generates motivation; but, particularly with well-learned responses, motivation can drive behavior in the absence of major emotional reactions. On the other hand, an emotion can be the result of parallel independent processes rather than of output from a single central control system. It will be seen, later, that current central theories share a tendency to see the critical elements of neural/cognitive processing as “goals.” Neither purely cognitive nor purely emotional/motivational attributes are given primacy; and simple stimulus-response reactions are rejected. The key drivers of behavior are seen as cognitive-emotional compounds (Hinde, 1998).
ROAD MAP TO CENTRAL THEORIES OF EMOTION AND MOTIVATION

We start with the esoteric and microscopic. We look at the bits and pieces from which evolution has formed emotions and motivations. We then move to the general, the basic reinforcement systems through which the stimuli that elicit highly specific “fixed action patterns” can, through learning, shape general, flexible, emotion-independent behavior. We then compare and contrast some current central theories of emotion and motivation that amalgamate these specific and general aspects of behavioral control. Finally, we indulge ourselves (and hopefully show that our previous dry, didactic analysis has significant mundane applications) by looking at some possibly unexpected implications of current central theories.
EMOTION, MOTIVATION, AND EVOLUTION

The behavioral neuroscientist thinks in terms of specific neural networks that deliver, often complex, patterns of behavior in response to appropriate environmental circumstance. Such networks cannot appear in evolution or development fully formed. They must result from progressive, incremental changes. In evolution, these changes occur as the result of random mutations interacting with selection pressures. As mentioned earlier, we would equate a specific emotion with the nature of the consistent selection pressure (functional requirement) that has driven the evolution of a set of reactions. But this means that the underlying control of behaviors (and other, e.g., autonomic, reactions) need not map simply to their superficial organization.

Evolution and “Rules of Thumb” as a Problem for Behavioral Analysis

The selection pressure driving evolution can be understood in terms of models such as those of optimal foraging theory. These are theoretical analyses that determine the behavioral rules required to maximize such things as the amount of food that an animal can obtain given specific starting assumptions about the environmental constraints (McNamara & Houston, 1980). It should be noted that these analyses are not predictions as to the rules that an animal will use, but define the boundary conditions toward which an animal should evolve if there is sufficient mutation and if selection of advantageous mutations is not blocked in some way. The important concept here is that the animal can use rules of thumb (ROT) of a relatively simple sort to achieve behavior, under normal ecological conditions, that
approaches optimality—but where, in phylogenetically unusual conditions, responding may be suboptimal. For example, the parasite Nemeritis canescens “allocates its searching time in relation to host density approximately as predicted by an optimal foraging model [but] the decision rule used by Nemeritis . . . is a simple mechanism based on habituation to host scent—a far cry from the Lagrange multipliers and Newton’s iterative approximations used by the theorist to solve the problem” (Krebs, Stephens, & Sutherland, 1983, p. 188). ROT originate because, in the absence of any adaptive behavior, any mutation that results in any increase in adaptive value, however limited, will be selected. A later mutation can then provide a further increase in adaptive value—and so on. The result is that emotional control mechanisms may involve both serial and parallel ROT. In some cases, specific ROT may produce conflicting responses to the same stimulus (freezing and escape when faced with a threat, for example). These present no problem for behavior analysis as the distinct behaviors can be analyzed separately. In other cases, specific ROT may not conflict but may nonetheless fulfill quite different functions (increased blood-clotting factor is only required if escape is not successful). Again, because the responses are different, they can be identified as such and analyzed separately. The critical problem for behavior analysis is that in some cases multiple ROT can deliver essentially the same superficial behavior. They then provide the appearance, but not actuality, of a single generalized pattern of adaptive responding resulting from the application of a single, higher order, functional rule. This is exemplified by the partial reinforcement extinction effect. 
The Partial Reinforcement Extinction Effect and Serial ROT

The partial reinforcement extinction effect (PREE) is a greater persistence of responding in extinction after prior training on partial (intermittent) reinforcement than after prior training with continuous (consistent) reinforcement. It is one of the more reliable phenomena in behavioral analysis. McNamara and Houston (1980) analyzed the general problem of how long to persist when responses no longer yield rewards. They looked at the specific case (which occurs with extinction of any positively reinforced response) in which a number of initial responses, each rewarded with some probability p, are followed by a number of later responses that deliver no reward. The response is assumed to have some cost (e.g., loss of energy in making the response). Absolute optimality (which cannot be achieved in the real world without precognition) is to cease responding as soon as reward is no longer available. The theoretical
optimality problem is, then, to determine the rule that defines the point when an animal should decide that reward has actually become unavailable rather than the alternative possibility that it is faced with an unusually long run of nonrewarded responses in a sequence with average probability p. The precise answer to this question depends on the cost of responding and the value of p. Under realistic conditions, the value of p is not known and so it must be estimated from the pattern of rewards. Further p—and reward value and even cost value—are likely to vary from response to response. This presents a highly complex set of adaptive requirements. However, it turns out that “regardless of the exact [values of these parameters], the optimal policy for this sort of problem involves persisting for far more trials in the face of failure if [the original] p [of reward] is low. This provides an explanation of the PREE in terms of optimality theory” (McNamara & Houston, 1980, p. 687). The explanation of the PREE by optimality theory is not a mechanistic explanation. It is, rather, a description of the general functional requirements that provide a background against which any mechanism that results in persistent responding will be selected. It is not a prediction as to how an animal will actually solve the problem. Further, it does not give us any insight into what ROT the animal uses; whether more than one ROT is required; or even whether extinction and resistance to extinction are derived from the same ROT. This is where attempts to determine the central mechanisms underlying the PREE provide some surprising answers. Behavioral analysis of the PREE suggested that it could depend on simple associative effects (Sutherland, 1966), including those based on conditioning to the after-effects of reward and nonreward (Capaldi, 1967) or, alternatively, could involve more emotionally mediated effects resulting from the generation, by nonreward, of frustration (Amsel, 1992). 
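The functional argument quoted above, that persistence should be greater when the original p is low, can be illustrated with a small Bayesian sketch. This is not McNamara and Houston's (1980) model. It is a simplified illustration that assumes the animal knows p, ignores response cost, holds a fixed prior belief that reward is still available (`prior_available`, a hypothetical value), and stops responding once the posterior falls below a threshold (`give_up_below`, also hypothetical).

```python
def posterior_available(p, n_failures, prior_available=0.9):
    """Posterior probability that reward is still available after
    n_failures consecutive nonrewarded responses (illustrative only)."""
    # Likelihood of the run of failures if reward is still delivered with probability p.
    like_available = (1.0 - p) ** n_failures
    # If reward has been withdrawn, every response fails with certainty.
    like_withdrawn = 1.0
    numerator = prior_available * like_available
    return numerator / (numerator + (1.0 - prior_available) * like_withdrawn)


def trials_to_quit(p, prior_available=0.9, give_up_below=0.05, max_trials=10_000):
    """Number of consecutive failures after which the posterior first
    drops below the (assumed) give-up threshold."""
    for n in range(max_trials + 1):
        if posterior_available(p, n, prior_available) < give_up_below:
            return n
    return max_trials


if __name__ == "__main__":
    # Continuous reinforcement (p = 1.0) versus increasingly sparse partial schedules.
    for p in (1.0, 0.5, 0.25, 0.1):
        print(f"p = {p:<4}  quit after {trials_to_quit(p)} nonrewarded responses")
```

Under these assumptions a single failure is decisive after continuous reinforcement, whereas lower values of p require progressively longer runs of nonreward before giving up. The sketch describes a functional requirement only; it says nothing about which ROT an animal actually uses.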
Consistent with the idea that independent ROT can control apparently similar behavior under different conditions, the PREE is differentially sensitive to drugs. With short inter-trial intervals (when associative explanations appear to explain the behavioral phenomena best) the PREE is not sensitive to anxiolytic drugs, whereas at long inter-trial intervals (when frustration appears to explain the behavioral phenomena best) the PREE can be essentially eliminated by anxiolytic drugs (Feldon, Guillamon, Gray, De Wit, & McNaughton, 1979; Ziff & Capaldi, 1971). However, if we ask about the psychological nature of the neural systems specifically affected by these drugs, we discover some interesting properties of the processes involved. Emotional explanations of the PREE have often focused on counterconditioning, that is, the reduction in negative affective value when negative stimuli are paired with positive
ones. Anxiolytic drugs do not reduce counterconditioning (McNaughton & Gray, 1983). The drugs appear, instead, to reduce a nonassociative “toughening up” process (McNaughton, 1989b, chap. 7). Further, although the drugs affect both extinction (which could be viewed as dependent on conditioned frustration) and the PREE (which could be viewed as dependent on toughening up to the experience of conditioned frustration) in ways that could seem to depend simply on changes in sensitivity to the emotional experience of conditioned frustration, it turns out that extinction and the PREE depend on quite distinct neural systems and are, in a sense, unrelated to each other (Gray & McNaughton, 2000, appendix 9, table 1). Extinction in continuously reinforced rats is retarded by fiber-sparing lesions of the hippocampus proper, which do not reduce the PREE. Conversely, extinction in continuously reinforced rats is unaffected by lesion of the pathway connecting the subiculum of the hippocampus to the nucleus accumbens, but these same lesions abolish the PREE. Thus, extinction and the PREE each appear to depend on a number of mechanisms (each one based on a particular ROT) and, in at least some cases, the mechanisms delivering extinction are quite distinct from those delivering the PREE. We thus have evidence for a variety of parallel ROT delivering adaptive extinction responding under a variety of situational circumstances (in particular, varying schedules of reward and reward omission).

Separation Anxiety and Parallel ROT

In one sense, the idea of parallel ROT (that is, parallel systems concurrently activated) seems trivial. Autonomic and skeletal reactions, for example, must have evolved separately and are certainly represented in separate parts of the brain once we get “below” command centers such as the periaqueductal gray (Bandler, Keay, Floyd, & Price, 2000; Bandler, Price, & Keay, 2000). 
However, this issue is only trivial if a single command center controls both aspects of output. At least in the case of separation anxiety, this is not the case. Separation anxiety is clearly identifiable, both by the means of producing it (removal of the primary caregiver, usually the mother) and by its characteristic pattern of autonomic and behavioral changes. It can be seen, in much the same form, in human children and the young of other mammals, such as rats, dogs, and primates. When the “reaction is beyond that expected for the child’s developmental level,” it becomes Separation Anxiety Disorder (American Psychiatric Association, 1987). The behavioral and autonomic components of this emotion give the appearance of joint outputs from a single command center—and, if either output were missing, the
result would not be what is generally recognized as separation anxiety. However, it has been shown that, in rats, the behavioral reactions (locomotion, grooming, defecation, and urination elicited by a novel environment) can be eliminated by the presence of a nonlactating foster mother, whereas the autonomic reaction (a reduction in heart rate) can be eliminated by regular feeding with milk, but not, in either case, vice versa (Hofer, 1972). Thus, the two effector aspects of the one emotion can be doubly dissociated in the laboratory. It appears that rather than available stimuli each activating a single cognitive center (detecting, say, threat in general), it is possible that each recognizable aspect of an emotion could result from a different aspect of the available stimulus input (Figure 36.1). Each emotion could consist of multiple parallel ROT. As with serial ROT, this does not create a problem for our naming of the phenomena. Separation anxiety remains a nameable set of entities that are coherent under normal ecological circumstances and our analysis does not require any change in the everyday use of the term. But, for scientific purposes, we must view the term
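[Figure 36.1 diagram. Top panel: stimuli S1–S9 feed a single THREAT representation, which drives the responses respiration (+), muscle energy (+), blood clotting (+), freeze, gesture, heart rate, and flight. Bottom panel: the same stimuli and responses without a central THREAT node, with flight split into Flight-A and Flight-B.]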
Figure 36.1 The extremes of the possible neural relations that could have evolved to control responses to threat. Note: The top half of the figure shows the functional relations linking stimuli (S1–S9) to responses where the stimuli are either regular predictors of threat (S1–S7) or where different stimuli are predictive of threat at different times (S8, S9). It can also be viewed as a representation of the simplest view of emotional states, namely that all stimuli activate a single neural representation of threat and this in turn activates the separate response systems. The bottom half of the figure shows, in its most extreme form, the opposite type of neural organization suggested by Hofer’s experiments (see text). Here, each response system is under its own private stimulus control. Some stimuli (S2) may not have acquired control over any response system and some stimuli (S8, S9) may have acquired control over a particular response (flight) but only under some circumstances (-A, -B). Redrawn from “Anxiety: One label for many processes,” New Zealand Journal of Psychology, 18, Figure 1, p. 53, by McNaughton, 1989a.
as grounded in a particular class of evolutionarily recurring situations (loss of parents) that give rise to a consistent set of adaptive requirements and so a usually consistent effector pattern (behavioral and autonomic) that constitutes a fairly consistent distributed central state, but without the need for a single command center or any other internal link between the components.

Evolution, ROT and Functional Definitions of Emotional Systems

If parts of a functional system can be independent, whether as a result of serial or parallel ROT, how can we understand or define the system, or even refer to it as a system at all? Rather than being a major problem, inverting this question gives us not only a convenient way to refer to, and to distinguish among, central emotional and motivational systems but also a means of dealing with the fact that these systems involve multiple hierarchically organized layers:

[This] approach to [emotion] stems from analysis of its possible functional significance. This approach is based on the premise that important and pervasive human action tendencies, particularly those which occur across a wide range of cultures and specific learning situations, are very likely to have their origin in the functionally significant behavior patterns of nonhuman animals. . . . This approach, working through the characteristic behavior patterns seen in response to important ecological demands (e.g., feeding, reproduction, defense) when animals are given the rather wide range of behavioral choices typical of most natural habitats, is called ethoexperimental analysis. It involves a view that the functional significance of behavior attributed to anxiety (or other emotions) needs to be taken into account; and that this functional significance reflects the dynamics of that behavior in interaction with the ecological systems in which the species has evolved, implying that these dynamics . . . 
can be determined far more efficiently when the behavior is studied under conditions typical of life for the particular species. (R. J. Blanchard & Blanchard, 1990b, p. 125)
Detailed ethological analysis of defensive responses obtained under experimentally controlled conditions by the Blanchards has demonstrated a categorical separation of a set of reactions that can be grouped together under the rubric of “fear” from a quite distinct “anxiety” set (R. J. Blanchard & Blanchard, 1988; R. J. Blanchard & Blanchard, 1989, 1990a, 1990b; R. J. Blanchard, Griebel, Henrie, & Blanchard, 1997). The Blanchards elicited their set of “fear” behaviors with a predator. These behaviors, originally all linked solely through ethology, turn out to be sensitive to drugs that are panicolytic but not to those that are only anxiolytic
(R. J. Blanchard et al., 1997). The Blanchards elicited their set of “anxiety” behaviors (especially risk assessment; see Chapter 49) with stimuli that only suggested the potential presence of a predator. These behaviors, again originally all linked solely through ethology, turn out to be sensitive to anxiolytic drugs. The Blanchards’ detailed analysis, and its pharmacological validation, provides a basis for coherent conceptualization of a vast animal literature. For example, their analysis of fear predicts the well-demonstrated insensitivity to anxiolytic drugs of active avoidance in a wide variety of species and of phobia in humans (Sartory, MacDonald, & Gray, 1990). Because of the detailed effects of anxiolytic drugs on operant and other behavior (Gray, 1977), we have argued (Gray & McNaughton, 2000; McNaughton & Corr, 2004) that the key factor distinguishing fear and anxiety is one of “defensive direction.” Fear is that set of reactions that have evolved to allow the animal to leave a dangerous situation (predator escape; operant active avoidance); anxiety is that set of reactions that have evolved to allow the animal to enter a dangerous situation (e.g., cautious “risk assessment” approach behavior) or to withhold entrance (passive avoidance).

Evolution, ROT and Hierarchical Organization

With the PREE, we simply accepted the fact that, where there is a single high-level general rule for optimal behavior, there may be multiple ROT that deliver the appropriate behavior under different circumstances. However, when the functional requirement is something as general as “escape,” different circumstances may not only require different ROT to produce essentially the same behavior pattern under those different circumstances but also require noticeably different behavior patterns to achieve the result. Here we can link the evolution of serial ROT to the hierarchical organization of emotional systems. 
At the perceptual level, there are both “quick and dirty” as well as “slow and sophisticated” sensory mechanisms for detecting predators (LeDoux, 1994). There are also simpler and more complex behaviors that can be generated depending on the time available for execution (and other constraints). We can see all these mechanisms as parallel ROT that have evolved to improve survival in the face of threat, each new one filling a gap left by existing mechanisms. But these ROT have not evolved entirely independently of each other. First, simpler mechanisms will have evolved before more complex ones, providing a substrate for the development of the more complex and also providing a partial solution to the global problem that leaves a gap in adaptive advantage that later ROT must fill. Second, it makes no sense to have available a slow and sophisticated
strategy for, say, escape if an evolutionarily older panic reaction takes command of the motor apparatus. When it is activated, a higher and slower mechanism must be capable of inhibiting inconvenient aspects of the lower and faster mechanisms. The result, with defensive behavior, has been the evolution of a hierarchically ordered series of defensive reactions (each appropriate to a particular “defensive distance,” see the discussion that follows) that, in turn, map to lower and higher levels of the nervous system, respectively. While behaviorally and neurally complex, all these reactions fulfill the same basic function and so can all be seen as part of a single “fear system.”

The Blanchards developed the concept of defensive distance as part and parcel of their analysis of the differences between fear and anxiety, mentioned earlier. Operationally, with the most basic defensive reactions, it can be viewed as the literal distance between the subject and a predator. It is a dimension controlling the type of defensive behavior observed; that is, specific behaviors appear consistently at particular distances. In the case of defensive avoidance, the smallest defensive distances result in explosive attack, intermediate defensive distances result in freezing and flight, and very great defensive distances (i.e., absence of the predator) result in normal nondefensive behavior. However, defensive distance is not related directly to distance per se. It operationalizes an internal cognitive construct of intensity of perceived threat. For a particular individual in a particular situation, defensive distance equates with real distance. But, in a more dangerous situation, a greater real distance will be required to achieve the same defensive distance. Likewise, in the same situation, but with a braver individual, a smaller real distance will be required to achieve the same defensive distance. 
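The relation between real distance and defensive distance can be sketched in a few lines of code. The multiplicative form `real_distance * bravery / danger` and the two numeric thresholds are hypothetical choices made purely for illustration; the text specifies only the ordering of behaviors and the direction in which danger and bravery shift them, not a formula.

```python
def defensive_distance(real_distance, danger=1.0, bravery=1.0):
    """Perceived (defensive) distance in arbitrary units.

    Assumed form: grows with real distance and with the bravery of the
    individual, shrinks as the situation becomes more dangerous.
    """
    if danger <= 0 or bravery <= 0:
        raise ValueError("danger and bravery must be positive")
    return real_distance * bravery / danger


def defensive_behavior(distance, attack_below=1.0, normal_above=10.0):
    """Map defensive distance to the behavior class described for
    defensive avoidance. Threshold values are illustrative assumptions."""
    if distance < attack_below:
        return "explosive attack"
    if distance < normal_above:
        return "freezing or flight"
    return "normal nondefensive behavior"


if __name__ == "__main__":
    real = 5.0  # the same literal distance to the predator in every case
    cases = {
        "baseline": defensive_distance(real),
        "more dangerous situation": defensive_distance(real, danger=10.0),
        "braver individual": defensive_distance(real, bravery=3.0),
    }
    for label, dd in cases.items():
        print(f"{label}: defensive distance {dd:.1f} -> {defensive_behavior(dd)}")
```

In this toy version the same literal distance yields different behavior classes depending on perceived danger and on the individual, which is the sense in which defensive distance is a cognitive construct and not a physical measurement.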
This concept can resolve otherwise unexpected findings in, for example, behavioral pharmacology. It is tempting for those who focus on behavior as the thing to be studied in itself, as opposed to being a sign of states within the organism, to expect particular pharmacological interventions to affect specific behaviors in a consistent way. That this is not the case is shown by the effects of anti-anxiety drugs on risk assessment behavior. If perceived intensity of threat is high (small defensive distance), an undrugged rat is likely to remain still. Under these conditions, an anxiolytic drug will increase risk assessment (this will increase approach to the source of threat). But, if perceived threat is medium, an undrugged rat is likely to engage in risk assessment behavior. Under these conditions, an anxiolytic drug will decrease risk assessment (which again increases approach to the source of threat as it releases normal appetitive behavior). Thus, the drug does not alter specific observable risk assessment behaviors consistently but instead produces
changes in behavior that depend on the animal’s initial state and are consistent with a pharmacological increase in defensive distance (R. J. Blanchard & Blanchard, 1990; R. J. Blanchard, Blanchard, Tom, & Rodgers, 1990). This leaves us with a picture of ROT (in this case various levels of defense reaction) that have accumulated hierarchically. Their evolution has been accompanied not only by mechanisms controlling which ROT control behavior at any particular moment in time but also by mechanisms that can adjust which level of the system is selected by any particular external stimulus configuration (or rather the cognitions engendered by the stimuli). In the case of the defense system, the hierarchical levels of responding can be mapped to levels of the nervous system
and, at least, some of the overall control mechanisms identified. This is shown in Figure 36.2. The precise details contained in the figure are not important for our current argument and are dealt with in detail elsewhere (Gray & McNaughton, 2000; McNaughton & Corr, 2004; see Chapter 36) and are also briefly summarized in the section on specific central theories that follows. The important point is that a central theory of emotion, such as this, can treat different classes of behavior as, in one sense, discrete (each controlled by a particular different part of the brain) but at the same time can show that these different classes contribute to a more generalized functional system with control of the different parts that is at least sometimes integrated.

[Figure 36.2 diagram. Vertical axis: Defensive Distance. Columns: Defensive Avoidance and Defensive Approach. Structures shown: prefrontal ventral stream, prefrontal dorsal stream, anterior cingulate, posterior cingulate, septo-hippocampal system, amygdala, medial hypothalamus, periaqueductal gray. Behavior and syndrome labels: OCD1 (surface obsession), OCD2 (deep obsession), social anxiety (complex cognition), agoraphobia (cognition/rumination), generalized anxiety (cognition/aversion; arousal/startle), focused anxiety (risk assessment), phobia (avoid; arousal; escape), panic (explode/freeze), “anticipatory panic” (defensive quiescence). Modulatory inputs: 5HT, NA, BDZ.]
Figure 36.2 The two-dimensional defense system. Note: The two columns of structures represent subsystems controlling defensive avoidance and defensive approach, respectively. Each subsystem is divided, from top to bottom, into a number of hierarchical levels, both with respect to neural level (and cytoarchitectonic complexity) and to functional level (i.e., defensive distance—small at the bottom, large at the top). Each level is associated with specific classes of normal behavior and so, also, symptom and syndrome of abnormal behavior. Each level is interconnected with adjacent levels (vertical arrows shown) and also with higher and lower levels (connections not shown) and these connections allow integrated control of the whole subsystem. The subsystems are also connected with each other (horizontal arrows shown) allowing for control of behavior to pass between one and the other. Superimposed on the levels
of each system is input from monoamine systems. The monoamines modulate activity, essentially altering defensive distance generally, and so which level of a subsystem will be in control of behavior at any particular point in time. Endogenous hormones binding to the benzodiazepine receptor (BDZ) can similarly alter defensive distance but only in relation to structures in the defensive approach subsystem and to a lesser extent at the highest and lowest levels of the system than at the middle levels (as indicated by the width of the stippled oval as it intersects a structure). NA = noradrenaline; 5HT = serotonin. For details see “A Two-Dimensional Neuropsychology of Defense: Fear/Anxiety and Defensive Distance,” by McNaughton and Corr, 2004, Neuroscience and Biobehavioral Reviews, 28, pp. 285–305. Adapted with permission from Figure 3, p. 293.
EMOTION, MOTIVATION, AND LEARNING

Emotional systems have multiple parts that are several and distinct. Each involves a particular proximal form of appetitive or aversive behavior. But emotional stimuli are also reinforcing and, here, there is a surprising functional unity. Before proceeding to a consideration of the link between motivation and emotion, it will be helpful to clarify what modern neuroscience can tell us about the central mechanisms of reinforcement. Much analysis of emotion and motivation in the experimental literature has used learned responses because of their analytical simplicity. This can create problems when we attempt to link emotional concepts developed via ethological analysis with theories of learning and motivation developed via the experimental analysis of behavior.

Association versus Classical Conditioning versus Instrumental Conditioning at the Neural Level

The dominant paradigm for the study of synaptic processes underlying learning and memory is long-term potentiation (LTP), a phenomenon discovered by Bliss and Lomo (Bliss, Gardner-Medwin, & Lomo, 1973; Bliss & Lomo, 1973). Although LTP is usually studied electrophysiologically by high-frequency stimulation of a single neural pathway, its molecular mechanisms can clearly support strengthening of a single synapse that is driven by the coincidence of a previously weak input at that synapse with the firing of the cell produced by a strong input. The key aspect of this strengthening (which at most junctions depends on a specific receptor, the NMDA receptor) is that it is associative. Only currently active synapses (essentially acting as CS+) are strengthened and other inputs to the same target cell that are not active (CS−) are not strengthened. This strengthening appears, ultimately, to involve structural changes in the synapses and does not merely depend on modification of biochemical pathways (see Chapter 27). 
LTP has attracted particular attention because it conforms very tightly to the requirements for memory formation postulated of cortical neural processes by Hebb (1949). Hebb’s rule (as it has come to be known) can be summarized as “cells that fire together wire together” and was postulated simply on the basis of psychological findings with no evidence for a matching real neural process until the discovery of LTP. An important point to note is that Hebb’s original example discussed the linking of two stimuli within the visual cortex. His postulated mechanism was, therefore, purely associative, requiring no additional reinforcer to strengthen the connection. A light paired with a light
would become associated via connections within the visual cortex, as could a light with a tone, given the existence of “silent” connections between visual and auditory areas (Figure 36.3). Thus Hebbian learning is best exemplified by what is normally known as “sensory preconditioning.” (“Sensory preconditioning” is essentially a misnomer based on the false assumptions that learning requires a reinforcer and that without a change in behavior conditioning has not occurred.) The typical sensory preconditioning experiment can be confusing because it requires a reinforcer in order to demonstrate learning that did not itself depend on one. (With humans, we can omit the reinforcer by asking people to report their knowledge verbally.) The typical phases of a sensory preconditioning experiment are:

Phase 1: Stimulus A (a light) is paired with stimulus B (a tone) in a series of classical (Pavlovian) conditioning-like trials. Neither A nor B produces any observable response, before or after the conditioning-like trials.

Phase 2: Stimulus B (the tone) is next paired with food in a series of conditioning trials. Initially, the subject salivates when the food is presented; after a number of trials, they salivate when B is presented.

Phase 3: Stimulus A is now presented to the subject without any previous pairing of A with food.

In experiments of this type it is usually found that the subject will salivate when A is presented. Yet, A has never been paired with food. The conclusion from these results has to be that, during Phase 1, an association was formed between A and B. In Hebb’s version of events, there initially exists a weak connection between a cell assembly activated by A and a cell assembly activated by B. When A is presented close in time to B, its weak synapses on the cell assembly encoding B will be activated at the same time that the cell assembly fires and so the connection will be strengthened. 
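The three phases can be mimicked with a toy Hebbian simulation. It is a sketch under strong assumptions and not a model of real cortical circuitry: each cell assembly is reduced to a single linear unit, the learning rate, the initial weak weights, and the weight ceiling are arbitrary, and the names `hebb` and `sensory_preconditioning` are hypothetical.

```python
def hebb(weight, pre, post, rate=0.2, ceiling=1.0):
    """Hebbian update: the connection grows only when presynaptic and
    postsynaptic units are active together, up to an assumed ceiling."""
    return min(ceiling, weight + rate * pre * post)


def sensory_preconditioning(phase1_trials=10, phase2_trials=10):
    """Return the response evoked by stimulus A alone in Phase 3."""
    w_ab = 0.05  # weak ("silent") connection from assembly A to assembly B
    w_br = 0.05  # weak connection from assembly B to the response (salivation)

    # Phase 1: A (light) and B (tone) presented together; no reinforcer.
    for _ in range(phase1_trials):
        a_active, b_active = 1.0, 1.0  # B assembly fires to its own stimulus
        w_ab = hebb(w_ab, a_active, b_active)

    # Phase 2: B paired with food; the food drives the response unit.
    for _ in range(phase2_trials):
        b_active, response_active = 1.0, 1.0
        w_br = hebb(w_br, b_active, response_active)

    # Phase 3: A alone. A has no direct connection to the response; it can
    # act only by partially activating the B assembly.
    b_from_a = w_ab * 1.0
    return w_br * b_from_a


if __name__ == "__main__":
    print("with Phase 1 pairing:   ", sensory_preconditioning(phase1_trials=10))
    print("without Phase 1 pairing:", sensory_preconditioning(phase1_trials=0))
```

In this sketch, A evokes a strong response in Phase 3 only when the Phase 1 pairing has strengthened the A-to-B connection, even though no reinforcer was present while that connection was learned.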
On later presentation of A, this connection activates (at least partially) the B cell assembly—and so produces, although perhaps weakly, the neural effects of the presentation of B (Figure 36.3A). In Phase 2, stimulus B acquires observable consequences. These consequences are therefore likely to follow from the subsequent activation of the A assembly even in the absence of direct input by the B stimulus to the B assembly. This effect of A is demonstrated in Phase 3. The purely associative process of long-term potentiation can also explain “classical conditioning” involving
[Figure 36.3 diagram. Panels: A: LTP, Sensory Preconditioning; B: LTP, Stimulus Substitution; C: ADF, Stimulus Reinforcement; D: DA→LTP, Response Reinforcement. The panels show conditioning and test stages linking the stimuli (A, B, !) to their neural representations (“A”, “B”, “!”, “R”) and to responses (UR, CR).]

Figure 36.3 Different ways in which neural plasticity can result in associative learning. Note: A: Long-term potentiation (LTP) resulting in sensory preconditioning. Pairing of a neutral stimulus A with a second neutral stimulus B strengthens the connection between the representation of A and B such that presentation of A activates the representation of B when B is not physically present. B: As in A but with the second neutral stimulus (B) substituted by a reinforcer. The unconditioned response (UR) to ! undergoes Pavlovian stimulus substitution with the result that it, or some component of it, appears as the conditioned response (CR) when A is later presented alone. C: Activity dependent facilitation (ADF) as a basis for reinforced classical conditioning. Pairing of a neutral stimulus with a reinforcer results in strengthening of the connection of A with the neural representation of a response (R), independent of whether the response is currently activated. The result is classical conditioning that can produce a response that was not elicited by the unconditioned stimulus and so need not involve stimulus substitution. D: Dopamine-dependent LTP (DA→LTP) as a basis for reinforcement of instrumental responding. A low baseline emission of an operant response R is supported by the presence of an eliciting stimulus B. A conditional stimulus A is paired with the delivery of reinforcement (!) when the response is emitted as a UR. This strengthens the connection between the neural representation of A and the neural center controlling the emission of the response. This results in the response being emitted as a conditioned response when A is presented in future.

Pavlovian stimulus substitution without the need to invoke a specific reinforcement process (Figure 36.3B). If B is a motivationally significant stimulus prior to pairing with A, then activation of its stimulus representation by A will result in the same responses to A as previously occurred to B. This is like sensory preconditioning but with the link between B and an observable response having been established previously by evolution rather than later by an experimenter. In the case of tone-shock conditioning, the specific synaptic junction generating the conditioned fear reaction has been identified as a monosynaptic connection
between the thalamus (containing the tone representation) and the amygdala (which is activated by the shock and generates the unconditioned response). Injection of an NMDA antagonist into the amygdala blocks LTP and so acquisition of the conditioned response but has no effect if injected once conditioning is complete (LeDoux, 1994; for a more detailed analysis of fear conditioning see Chapter 39; for a comparison of the neural circuits involved in fear conditioning and eyeblink conditioning see Chapter 26). Simple LTP-based association can also explain what appears to be instrumental conditioning but is in fact
disguised classical conditioning with stimulus substitution. Pigeons are typically conditioned to peck keys that are lit prior to delivery of the reward. Under these conditions, autoshaping occurs. The pigeon comes to peck the key essentially because its lit state predicts reward and not because the pecking is instrumentally reinforced. This is shown by two pieces of evidence. First, autoshaping with a superimposed instrumental omission contingency (which pits classical autoshaping against instrumental omission of reward if the pigeon pecks the key) results in behavior cycling between pecking and not pecking. The attractiveness of the lit key overrides any instrumental learning that pecking cancels reward; and the cyclical loss of responding can be attributed to extinction of the classical contingency rather than any effect of the instrumental one. Second, the nature of the key peck is determined by the reinforcer. The pigeon effectively “drinks” a key paired with water and “eats” a key paired with food (Jenkins & Moore, 1973).

With so much possible with simple LTP-dependent association and its resultant stimulus substitution, we might be inclined to abandon the idea of reinforcement altogether. However, neuroscience provides at least two cases where true reinforcement mechanisms can be invoked. The first reinforcement mechanism has been demonstrated in classical conditioning in the sea slug Aplysia californica. This animal is so simple that specific neurons can be identified and named reliably from animal to animal and be shown to control the same responses in each individual. This has allowed detailed analysis of the entire neural circuit involved in conditioning (Chapter 27; Kandel & Hawkins, 1992). Shock to the tail activates a single neuron that can release transmitter presynaptically onto the terminals connecting sensory neurons with a motor neuron that controls gill withdrawal. 
Pairing of a light touch to the mantle of Aplysia with a shock to the tail can then strengthen the connection between a sensory neuron that detects the touch and the motor neuron, a process referred to as activity-dependent facilitation (ADF). The activity dependence of ADF results in a conditioned withdrawal of the gill to subsequent touching of the mantle (the CS+), but not of other sensory inputs, for example, a touch to the siphon. As with LTP, ADF is truly associative in that previous CS− can be conditioned if they are later paired with the shock. An important feature of ADF is that, in contrast to LTP, it allows true reinforcement in the sense of production of a new response that is not elicited by the unconditional stimulus (e.g., freezing to a CS for a shock, in contrast to the movement and vocalization normally produced by the shock). The second reinforcement mechanism combines features of both standard LTP and ADF (Figure 36.3D). Like LTP, it requires the coincidence of the release of transmitter from
the presynaptic neuron with the firing of the postsynaptic cell. However, in addition, LTP only occurs if dopamine is released presynaptically as a result of activation of the brain’s “reward system” (Reynolds, Hyland, & Wickens, 2001). Notably the postsynaptic cell controls responding rather than encoding a stimulus. Its initial activation (on which responding and so reward-delivery are dependent) results from the presence in the environment of appropriate eliciting stimuli (unless the response can be spontaneously generated). As discussed next, this allows responses to continue to be produced on some occasions even when reinforcement conditions are changed or when the reinforcer is devalued. That is, a response can be habitual and its cessation will depend on active extinction as a result of negative reinforcement rather than simply fading away in the absence of significant events. (The phasic release of dopamine, which relates to reinforcement, and the tonic release, which can be identified with hedonic changes, appear to activate different networks; and dopamine may not underlie all rewarding effects, see Chapter 40.)

From Emotion to Motivation

Our argument, so far, is that specific ROT (controlled by specific neural mechanisms) have evolved in a not entirely piecemeal fashion so that, in at least some cases, they become organized into functional systems. In the case of defensive avoidance, we have a hierarchically organized system, each part of which can generate appropriate defensive behavior (e.g., freezing, aggression, escape, avoidance) within a specific range of environmental circumstances. A large part of the theoretical structure of Figure 36.2 is devoted to an account of a fairly large number of particular situation-typical behaviors, which we group together not because of their specific form but because they share the same general function: removing the animal from danger. 
Aversive stimuli—both natural stimuli, such as a cat, and artificial stimuli, such as presentation of a shock, as well as the omission of expected rewards (frustrative nonreward)—all tend to have similar eliciting properties. Presentation of a cat elicits autonomic arousal, freezing, or attack, if defensive distance is short, and escape where this is possible. Much the same pattern is produced by both presentation of shock and frustrative nonreward: autonomic arousal, attack if there is a conspecific close by to attack, and escape if this is available (Gray, 1987, chap. 10). More general avoidance behavior is appropriate not only for a wide range of dangers, in the sense of things that can cause pain, but also for other stimuli that are merely disgusting, or even simply of no current interest. Fear conditioning, learned escape, and learned avoidance of the simplest sort can all be viewed, in this context, as
the result of simple Pavlovian stimulus substitution. Pure associative conditioning results in a previously neutral stimulus becoming a signal for an upcoming noxious event and resulting in the class of defensive response appropriate to the level of threat signaled. Whether the unconditioned stimulus is the presentation of a natural or artificial punishment or the removal of a natural or artificial reward, we can view avoidance behavior in general as resulting from activity in what has been known as the fight-flight system (Gray, 1987) but is probably better called the fight-flight-freeze system (FFFS; Gray & McNaughton, 2000). It is at this point that we must distinguish between two quite distinct ways in which the words fear and conditioning can be combined. In the first conjunction of fear and conditioning, fear conditioning, a neutral stimulus is paired with a shock and responses such as freezing are conditioned. Critically, the shock is inescapable and so the conditioned form of the previously unconditioned fear responses remains even after many trials. This conditioning is purely associative, as with the learning of a light-tone pairing of the type evidenced in experiments on sensory preconditioning. It is dependent simply on the coincidence of the two critical stimuli that then become associated via the process of long term potentiation (Fanselow & LeDoux, 1999). The stimulus we often refer to as the reinforcer is necessary if a response of some type is to be observed but the learned association can be formed even with neutral stimuli and so does not depend on reinforcement in the strict Pavlovian meaning of the term. In the second conjunction of fear and conditioning, conditioning of avoidance by fear (with, for example, a lever press as the avoidance response), something quite different happens. In the initial phases of training, there is both a high level of autonomic arousal and the release of the stress hormone corticosterone (Brady, 1975a, 1975b). 
However, once avoidance is well established, all these signs of emotional reaction disappear and the only obvious difference in behavior, as compared with behavior observed before training, is that the avoidance response is reliably produced. This leaves us in the apparently odd situation of maintaining that although an avoidance response is being made (as a result of the motivation of fear) the animal is not afraid (in the sense of showing emotional reactions). The commonsense view is that there is no reason for the animal to be afraid because it knows the avoidance response will prevent it from receiving a shock. There are two levels at which we need to take this idea seriously. The more trivial level at which a learned avoidance response is not driven by fear is that well-learned responses are, in a very real sense, habits. Even with positive reinforcers that are physically present on every trial (such as food for
a hungry animal), sufficiently long training results in the animal continuing to respond even when the reinforcer is devalued. The “rewarded response” is made but the reward itself is not consumed (Dickinson, 1980). With successful avoidance responses the reinforcer is never present and so responding can be even more resistant to extinction. The same is true of the conditioned suppression of behavior by anxiety. After extended training, the suppression becomes insensitive to anxiolytic drugs (McNaughton, 1985). The deeper level at which a learned avoidance response is not driven by fear rests in the fact that, unlike fear conditioning, it is not the presentation of shock that “reinforces” learning: rather it is the omission of shock. Continued responding is driven by relief, not fear. This is not mere semantic quibbling. In the same way that omission of reward has the same reinforcing properties (and many of the same eliciting properties) as the presentation of punishment (the “fear = frustration hypothesis”; Gray, 1987), omission of punishment has the same reinforcing properties as the presentation of reward (the “hope = relief hypothesis”; Gray, 1987). As we shall see below, we can attribute the learning of new responses to the release of dopamine and, consistent with this, dopamine is involved in avoidance conditioning (Sokolowski, McCullough, & Salamone, 1994; Stark, Bischof, & Scheich, 1999). Omission of punishment is, thus, truly rewarding. Here we should notice an asymmetry in the types of released behaviors associated with approach reactions compared to those associated with avoidance. Avoidance involves, in general, a hierarchically organized set of released action patterns that do not vary much with the specific eliciting stimulus and that vary with “defensive distance”; approach involves, in general, released action patterns only in contact with the eliciting stimulus and then produces stimulus-specific responses. 
(Avoidance also involves stimulus-specific responses with contacting stimuli: for example, attack of a predator is replaced with defensive burying of a shock probe; but these are not as many or various as the stimulus-specific responses produced by contact with appetitive stimuli. Likewise, there is little difference in principle between an appetitive-conditioned jaw movement response and an aversive-conditioned eyeblink response.) The specific behaviors observed in the context of active avoidance (when the animal moves away from a localized aversive stimulus) are surprisingly general and depend much more on defensive distance than on the specific nature of the aversive stimulus. Thus, both punishment and frustration will generally increase aggressive responses within and between many species, including humans (Renfrew & Hutchinson, 1983) and, in humans, will even increase aggressive responses directed at completely innocent inanimate objects (Kelly & Hake, 1970). Defensive behaviors,
then, give the appearance of output from a single, fairly homogeneous, system, with specific released, as opposed to learned, behaviors varying mainly with the defensive distance. By contrast, the proximal behaviors required to consummate the approach to an appetitive stimulus are entirely stimulus specific. We eat food and mount a sexual partner, but not vice versa. There has not been reported, however, a hierarchical series of standard behaviors required for approach that varies with “appetitive distance.” It may simply be that there has been a lack of appropriate ethological analysis of such approach behavior. However, the behaviors required to approach an appetitive stimulus (other than simple locomotion) are unique to each situation and driven by the specifics of the situation rather than the nature of the appetitive stimulus. Indeed, the most obvious fundamental requirement is the learning of whatever new and, in evolutionary terms, completely arbitrary responses are required to achieve the goal. There are, then, no emotion-general innate reactions that characterize a specific appetitive distance. This is not to say that there is no dimension of appetitive distance. Appetitive goals produce a systematic, distance-related, effect. But the evidence is that variation in distance between an organism and an appetitive goal drives the quantity or intensity of behavior, but not its quality. The intensity with which approach behavior is executed increases the closer an animal is to the goal, as if there is a “goal gradient” (Hull, 1952), but this is as true of lever pressing on a fixed interval schedule in an operant chamber as of running in a runway on a continuous reinforcement schedule. We have, therefore, two fundamental systems: one that controls the avoidance of specific stimuli (including reward omission) and one that controls approach to specific stimuli (including safety). 
Each of these is linked to systems that determine the specific aversive (e.g., defensive burying) or appetitive (e.g., eating) behavior that will be released by contact with a motivationally significant stimulus. But each is also more fundamentally a generic system devoted to avoidance or approach, respectively. Because of the asymmetry in functional requirements noted previously, the avoidance system has been named in terms of some common discrete elicited behaviors (fight, flight, freeze); while the approach system has been named generically the Behavioral Approach System (Gray, 1982) or Behavioral Activation System (Smits & Boeck, 2006)— with the abbreviation, BAS, designating the same appetitive neural system in both cases. Here we come to the nub of the relationship between the central control of emotion and that of motivation. To a first approximation, when we talk about emotion, we are talking about the elicitation of particular patterns of internal
(autonomic) and external (skeletal) behavior; when we talk about motivation, we are talking about the production of generalized approach or avoidance tendencies. Motivation, in this sense, cannot exist without emotion—at least in the initial phases of learning. But, in stable environments, with habitual responses reliably delivering appropriate appetitive stimuli or apparently successfully avoiding aversive stimuli, emotional reactions are minimized. We need to clear up a common misconception: There can be a tendency to link aversive stimuli and avoidance to emotion and to see them as distinct from appetitive stimuli and an approach that just involves motivation. This tendency results from the fact that the usual way to study aversive stimuli in the laboratory is to deliver electric shock (which requires no prior deprivation of some need for it to be effective); while the usual way to study appetitive stimuli is to deliver food to a hungry animal or water to a thirsty animal. It is common to see the eliciting stimulus of shock as creating the motivational state that drives behavior in the aversive case but to see deprivation, rather than the appetitive stimulus, as driving the motivational state in the appetitive case. Positive motivation does not, however, require a state of deprivation of some basic need. Female rats can often appear relatively passive during copulation—albeit showing receptive behavior linked to the phase of their ovarian cycle. However, not only does their receptive phase involve permitting the male to mount, it turns out that it involves more active tendencies when appropriate. Male (rats) normally pause for a while after intromissions, and for a longer time after intromissions that culminate in ejaculation. . . . Bermant (1961a, 1961b) provided female rats with a lever they could press to produce a male rat. After a mount (regardless of whether it resulted in intromission) the male was removed. 
The females quickly pressed the lever after the male was removed following a mount (without intromission), paused a bit more after an intromission (without ejaculation), and waited the longest time before summoning a male rat after ejaculation. Thus it appears that male and female rats prefer the same frequency of sexual contact. (Carlson, 1980, p. 333)
Here the reaction of the female rat to the male (albeit approach) is essentially the same class of reaction as that of a rat to a cat (albeit avoidance). The availability of the motivationally significant stimulus (and interactions with it) drives the behavior. One could argue that there is a background level of preparedness on the part of the female rat driven by the ovarian cycle, but there are also variations in fearfulness within rats from time to time and between rats, and the same is true for humans, especially with sexual receptivity.
Even with hunger, it should be noted that the normal experience of hunger is linked as much or more to the availability of palatable food, or some other external or temporal cue for eating, than it is related to tissue need or level of deprivation (Pinel, 1997). For example, if a rat is provided for some time with six meals a day that are spaced irregularly but signaled by a buzzer and light stimulus and is then placed on free food so that it is satiated, presentation of the buzzer and light will elicit eating of as much as 20% of its daily food intake (Weingarten, 1983). (The total amount eaten over a day was not changed, as later free feeding adjusted for the extra meal.) Likewise, if we see hunger as an essentially emotional rather than homeostatic state, we can understand its links with emotional disorder: the life-threatening reductions in weight that can occur in anorexia nervosa and the health-threatening increases in weight that can occur in depression. Likewise, simple rewarded responding can depend, like simple fear conditioning, on stimulus substitution. As we noted earlier, in experiments with autoshaped responses, pigeons produce stereotyped responses that show they are effectively drinking the key when they are thirsty and eating the key when they are hungry (Jenkins & Moore, 1973). Further, if the autoshaping schedule (which pairs a lit key with the reward) has added to it an instrumental omission contingency (so that pecking cancels food), the pigeon goes through cycles of responding and nonresponding corresponding to the simple associative contingencies in the situation, unlike a rat that ceases responding and reacts to the reinforcement contingencies (see Millenson & Leslie, 1979).
SOME CURRENT CENTRAL THEORIES OF MOTIVATION AND EMOTION

There are many specific hypotheses currently being advanced by neuroscientists in relation to detailed aspects of the control of specific emotional reactions, motivational control systems, and learning and memory. For the behavioral scientist wanting to enter this field (which can appear like a minefield of novel jargon and mind-boggling detail), it is probably most important to note that the many detailed issues can be dealt with one at a time. You can focus on the detail that pertains only to the current issue. In essence, one is dealing with the neural specifics of particular ROT. Provided one has been warned about the capacity of ROT to be nested both in serial and parallel, it is not difficult to accept the bits of the jigsaw piecemeal and leave integration until sufficient bits have been obtained to make the overall puzzle worth solving. The most important thing is to not believe that the solution to the puzzle is obvious
and to wait for a sufficient number of the pieces to become available. Partly because they deal with the neural instantiation of ROT, neuroscientists seldom integrate their findings on emotion and motivation into grand overall theoretical schemes. They do use global, apparently integrative, concepts. But these concepts are usually taken directly from behavior analysis and so subsume ROT within what are effectively clusters (such as the PREE and instrumental learning) based on overall evolutionary function. This may give the impression that they are ascribing to ROT a specific source of integrated control but, as we have seen, this need not be the case. Instead, the value of this approach is to gather together phenomena that may have some, albeit loose, integrated control, or that may have the appearance of control as an emergent property of the interaction of multiple ROT. There are, nonetheless, neuroscientifically grounded theories that attempt to provide more holistic, integrated perspectives. In this section, we briefly describe some of these and show how the architecture of each maps to the basic ideas we have presented above.

Gray and McNaughton

We have based a number of the concepts we have already presented on one such theory: the idea, originally proposed by Jeffrey Gray (1982), that behavior is primarily controlled by a Fight-Flight-Freeze System (FFFS) and a Behavioral Approach System (BAS) with, linked to these, and controlling conflict between approach and avoidance, a Behavioral Inhibition System (BIS). This theory has clear links with the idea of multiple ROT, especially in its more recent development (Gray & McNaughton, 2000; McNaughton & Corr, 2004). Multiple ROT are instantiated in the mixture of levels and streams of Figure 36.2, which shows the FFFS and BIS, and in the matching levels of the separate stream of structures controlling the BAS (not shown). 
It also has at its core the idea that, in general, approach and avoidance behavior are each controlled in fundamentally the same way independent of the specific source of motivation for that approach or avoidance.

Rolls

This latter perspective is presented in perhaps an even stronger way by Edmund Rolls (1990, 2000) in his general theory of the control of emotion and motivation by the brain. He sees evolution as starting with simple ROT in the form of taxes that attract simple animals (including those with no nervous system) toward items that promote
survival and reproduction and that drive them away from items with the opposite consequences. He argues that:

brains are designed around reward- and punishment-evaluation systems, because this is how genes can build a complex system that will produce appropriate but flexible behavior to increase fitness. . . . If arbitrary responses are to be made by the animals, rather than just preprogrammed movements such as tropisms and taxes, [is] there any alternative to such a reward/punishment based systems in this evolution by natural selection situation? It is not clear that there is, if the genes are efficiently to control behavior. The argument is that genes can specify actions that will increase fitness if they specify the goals for action. It would be very difficult for them in general to specify in advance the particular response to be made to each of a myriad of different stimuli. . . . Outputs of the reward and punishment system must be treated by the action systems as being the goals for action. (Rolls, 2000, pp. 190, 183, 191)
Rolls could, at first blush, appear to be taking an excessively binary view. He states, for example, that “emotions can usefully be defined as states elicited by rewards and punishments, including changes in rewards and punishments” (Rolls, 2000, p. 178). He also argues that “the amygdala and orbitofrontal cortex . . . [are] of great importance for emotions, in that they are involved, respectively in the elicitation of learned emotional responses and in the correction or adjustment of these emotional responses as the reinforcing value of the environmental stimuli alters” (Rolls, 1990, p. 161). This perspective seems to force all emotion into either a reward or a punishment box with variation in behavior simply being the results of the learning of arbitrary responses. However, on closer inspection of the details of Rolls’ theory, it is clear that he allows not only for multiple ROT in terms of elements of behavior but also in terms of the separation of, for example, autonomic from behavioral aspects of emotional response. In his view, there are three major, neurally separate, classes of output available for any emotion: There are autonomic and endocrine outputs that optimize the state of the animal for particular types of action; there are implicit behavioral responses; and there are explicit behavioral responses. Implicit behavioral responses are controlled “via brain systems that have been present . . . for millions of years and can operate without conscious control. These systems include the amygdala and, particularly well developed in primates, the orbitofrontal cortex. They provide information about the possible goals for action based on their decoding of primary reinforcers taking into account the current motivational state, and on their decoding of whether stimuli have been associated by previous learning with reinforcement.” This clearly encompasses a wide range of emotion-specific and innately elicited responses. 
The control of explicit behavioral responses, by contrast,
“involves a computation with many ‘if . . . then’ statements, to implement a plan to obtain a reward or to avoid a punisher.” Here the behavior controlled is clearly general in its form and largely based on strategies for simple approach or avoidance. He locates the highest levels of this control in the dorsolateral prefrontal cortex, where they are strongly related to the processing of short-term (or “active”) memory. Despite Rolls’ somewhat different perspective compared to Gray and McNaughton, he is like them in seeing orbitofrontal cortex as, in essence, coding “what” a stimulus is. “What” here has the sense of what specific class of reinforcer such as food, drink, or sex it is that the stimulus represents, and it compounds “sensory integration, emotional processing, and hedonic experience” (see Chapter 41). Dorsolateral frontal cortex, by contrast, codes “where” a stimulus is. Thus both theories see a distinction between a reactive and excitatory orbital system and a prospective and inhibitory dorsolateral system. Critically, in the context of ROT, Rolls (2000) warns that “these three systems do not necessarily act as an integrated whole. Indeed, insofar as the implicit system may be for immediate goals, and the explicit system is computationally appropriate for deferred longer terms goal, they will not always indicate the same action. Similarly, the autonomic system does not use entirely the same neural systems . . . and will not always be an excellent guide to the emotional state of the animal, which the above argument in any case indicates is not unitary” (pp. 188–189). There is a strong link between emotion and motivation for Rolls, in both their more innate and more conditioned forms. 
While starting from the position that “emotions can usefully be considered as states produced by reinforcing stimuli” (Rolls, 1990), he sees the particular value of those states as involving elicitation of autonomic and hormonal responses and, in learning experiments, in the production of various conditioned emotional responses. Emotion, viewed in this light, provides a basis for the facilitation of memory storage and for the immediate elicitation of flexible responding when conditions change. The blocking of a learned response by new circumstances leaves intact the conditioned emotional response, which then provides the basis for the development of new behavior. Background autonomic and hormonal reactions provide the basis for the storage of such strategies as then prove successful.

LeDoux

For many years, Joe LeDoux has been developing a theory of fear and, consequently, anxiety that is more limited in terms of the emotions analyzed but potentially deeper in the picture it presents of the details of the emotional systems. This can be seen as dovetailing to some extent with both the theoretical positions we have described so far.
While Gray and McNaughton focus on hierarchy in terms of the specific elicited behaviors associated with specific defensive distances, LeDoux can be thought of as focusing more on hierarchies of stimulus analysis that are to some extent also selected by defensive distance. He has contrasted “quick and dirty” threat detection systems (operating via the thalamus) with slower and more sophisticated ones operating through sensory cortex (LeDoux, 1994) and more recently (LeDoux, 2002) has laid emphasis on the even slower, and potentially more sophisticated, mechanisms that reside in frontal cortex and are linked to working memory and that form of planning that we can call “worry.” At first, his theory appears to be at total variance with that of Gray and McNaughton. However, when we look at the neural details, we discover that the discrepancy is not great; and we demonstrate a major advantage of a central/neural approach to emotion and motivation. The details of the theories are linked to neural reality very tightly and this allows one to resolve, relatively easily, issues that depend much more on arbitrary linguistic definitions than scientific facts. LeDoux (2002) argues to some extent that anxiety is really fear but represented differently in consciousness. Thus:

anxiety, in my view, is a cognitive state in which working memory is monopolized by fretful, worrying thoughts. The difference between an ordinary state of mind (of working memory) and an anxious one is that, in the latter case, systems involved in emotional processing, such as the amygdala, have detected a threatening situation, and are influencing what working memory attends to and processes. This in turn will affect the manner in which executive functions select information from other cortical networks and from memory systems and make decisions about the course of action to take. . . . 
I believe that the hippocampus is involved in anxiety not because it processes threat, as Gray suggests, but instead because it supplies working memory with information about stimulus relations in the current environmental context, and about past relations stored in explicit memory. . . . When the organism, through working memory, conceives that it is facing a threatening situation and is uncertain about what is going to happen or about the best course of action to take, anxiety occurs. (p. 288)
LeDoux’s very influential theory of the neural processing of fear was incorporated into Gray’s (1982) original, essentially amygdala-free, theory in its revision by Gray and McNaughton (2000). So, as far as fear goes, there is general agreement among the theories of LeDoux, Gray and McNaughton, and Rolls. It is in dealing with anxiety that he sees the Gray and McNaughton theory as underemphasizing working memory and worry: “in my opinion, it still gives the septum and hippocampus too prominent a role, at the expense of the amygdala and prefrontal cortex” (p. 288).
In resolving the differences, let us first note that Gray and McNaughton’s theory is anchored primarily in the effects of anxiolytic drugs. The link between anxiolytic action and effects on hippocampal electrical activity has been ever more firmly established (McNaughton, Kocsis, & Hajós, 2007). However, as Gray and McNaughton noted in their introduction:

“psychosurgery” (lesions of the cingulate or prefrontal cortex) has been used as a treatment with some degree of success. So these cortical areas could well mediate extreme (Marks, Birley, & Gelder, 1966) or complex forms of anxiety, especially . . . in the case of obsessive-compulsive disorder (Rapoport, 1989). (Gray & McNaughton, 2000, p. 5)
Gray and McNaughton (2000) have a theory of “anxiolytic-sensitive anxiety” that necessarily separates this from the processes of anxiety (or fear or obsession) that are controlled by frontal cortex. What of their view of frontal and cingulate cortex, on which LeDoux focuses?

We view them . . . as being hierarchically organized areas which deal (in their successively “higher” layers) with progressively higher levels of anticipation of action. . . . In the same way, then, that we distinguished the role of the hippocampus (in resolving concurrent goal-goal conflict) from the role of the defense system and other motor systems in resolving motor program conflicts without goal conflict, so we must distinguish its role from that of prefrontal and cingulate cortex. In our view these cortical areas are involved, quite independently of the hippocampus, in the resolution (i.e., ordering) of conflicts between successive sub-goals in a task. In the case of prefrontal cortex this amounts to saying that it is concerned with plans more than goals as such. However, where (as is common in certain types of working memory task) there is concurrent goal conflict within such a task, both the septo-hippocampal system and the prefrontal cortex are likely to be involved. (p. 5)
This view is not far from that expressed by LeDoux 2 years later, if we do not try to force the word “anxiety” to mean the same thing in the two cases. Gray and McNaughton focus on approach-avoidance conflict, something that can occur as a result of the apposition of two classes of innate releasing stimulus, with no requirement for learning or working memory. LeDoux focuses on “worry,” the maintaining of a perception of threat in working memory (with no necessary requirement for anything other than pure avoidance). The two theories are talking about different processes in different structures, and Gray and McNaughton have much the same view of the operations of frontal cortex and of the amygdala as LeDoux. Both theories agree that “the amygdala and hippocampus normally cooperate in the intact brain to store
Some Current Central Theories of Motivation and Emotion
While it is not a full theory of emotion, mention should also be made here of Damasio’s somatic marker hypothesis (Damasio, 1995, 1996). This is a partial theory of how emotion or motivation can interact with cognition. It is
intended to be an account of only one of several ways that affect can influence decision making and focuses primarily on the operation of the ventromedial prefrontal cortex, to the exclusion of other frontal areas. It is of interest here for two reasons: First, its view of emotional influence is different from the theories we have discussed so far. Second, its view of somatic phenomena is broader ranging. Damasio’s theory (Damasio, 1995, 1996; for a critical review, see Dunn, Dalgleish, & Lawrence, 2006; also see Chapter 38) originated in an attempt to account for the effects of ventromedial prefrontal damage. His patients showed severe impairments in decision making and in social choices but had intact IQ, learning, and retention of knowledge (including social knowledge) and skills, logical analysis, and language skills. They also performed normally on the Wisconsin Card Sorting Test, which is normally affected by frontal damage. The abnormal decision making and social choices of these patients were accompanied by abnormalities in emotion and feeling, and Damasio postulated that these emotional changes were the cause of the abnormal decision making. “The somatic marker hypothesis proposes that ‘somatic marker’ biasing signals from the body are represented and regulated in the emotion circuitry of the brain . . . to help regulate decision-making in situations of complexity and uncertainty” (Dunn et al., 2006, p. 240). The presence of what is, in effect, a somatic image called up by a situation constrains decision making and limits the amount of processing required of cognitive mechanisms, either by explicitly labeling a scenario as negative or positive or by implicitly biasing decision mechanisms in a positive or negative direction.
The somatic marker hypothesis differs from the other theories we have discussed in that it keeps the encoding of emotion (or strictly soma, see discussion that follows) distinct from the encoding of the information on which cognitive processes act, even at the prefrontal level. That is, emotional information can supplant, or bias, the processing of other types of information and is only integrated with them by altering their processing. The other theories, by and large, operate in terms of goals—compounds of cognitive (situational) and affective (affordance) information. It remains to be seen (Dunn et al., 2006) how far a somatic marker system in the ventromedial prefrontal cortex can be distinguished from some specific aspect of goal processing and how far it is qualitatively distinct from the other classes of processing that the hypothesis allows to occur in other areas of frontal cortex. The somatic marker hypothesis is also broader ranging than conventional positive/negative valence approaches. Here it departs both from the other theories and from more conventional behaviorist perspectives. Damasio (1996) holds:
* We thank Rama Ganesan for bringing this literature to our notice.
that the results of emotion are primarily represented in the brain in the form of transient changes in the activity patterns
different components of the fear learning experience” (see Chapter 39). There is perhaps one area where discrepancy may remain and where further experiment (or theoretical analysis) may be required to integrate the theories. Gray and McNaughton see the personality factor of neuroticism as being linked to frontal cortex, and as predisposing to both fear (threat avoidance) and anxiety (threat approach) disorders. Although they do not explicitly do so, they should link this personality factor to worry. For them worry is something that, if excessive, can lead to both pathological fear and pathological anxiety. These two states would appear to not only be conflated in LeDoux’s analysis but also to be consequences not causes. LeDoux sees threat, detected in the amygdala, as infecting working memory processes and resulting in worry. There is evidence that worry is not directly aligned with anxiety as measured by standard anxiety scales (Meyer, Miller, Metzger, & Borkovec, 1999) and that worry can result in intrusive negative thoughts (Borkovec, Robinson, Pruzinsky, & DePree, 1983).* This suggests that, provided we use the words “worry” and “anxiety” with sufficiently restricted definitions, we can see LeDoux’s theory as being more focused on a cause of pathological anxiety (and fear and depression), and their etiology, and Gray and McNaughton’s as providing a view of state fear and state anxiety that encompasses both normal and pathological examples of these emotions but distinguishes between them. Many of the differences between these three theories of central emotional and motivational states are more apparent (through variations in the use and meanings of words) than real. Critically, when what each theorist says of the mechanisms and psychological constructs associated with a particular neural structure is compared with the others, their fundamental message is very similar.
They all believe that central states are fundamental to emotion and motivation, either in its normal or pathological form. We would also agree with LeDoux (2002) when he states, “I don’t study behavior to understand behavior so much as to understand how processes in the brain work” (p. 209). To this we, personally, would add the coda that we want to understand the processes in the brain because these anchor our understanding of the workings of the mind.
Damasio
Central Theories of Motivation and Emotion
of somato-sensory structures. I designated the emotional changes under the umbrella term ‘somatic state’. Note that by somatic I refer to the musculoskeletal, visceral and internal milieu components of the soma rather than just to the musculoskeletal aspect; and note also that a somatic signal or process, although related to structures which represent the body and its states does not need to originate in the body in every instance. (p. 1412, italics added for emphasis)
Thus, somatic markers are not the simple assignation of valence or even of specific motivation to a stimulus. They are the perception or recall of a quite specific and detailed somatic image. There is no question that we can encode such images, and rehearse in our “mind’s eye” the somatic experience of, say, a competition dive. However, to see this image as the basis of a background biasing of a cognitive decision about whether to make a particular bet in Damasio’s paradigm task, the Iowa Gambling Task, is a radical departure from most other views of decision making and goal processing.
Central Theories of Emotion and Motivation—Some Broad Conclusions
The details, perspectives, and specific assignment of functions to structures by the theories we have considered differ. However, they all share a picture of the control of behavior by multiple serial and parallel ROT by hierarchically organized systems in the brain. They thus account for (without producing a complete explanation of) the apparent theoretical impenetrability of emotion. No two emotions need be constructed or controlled in the same way as each other. No single emotion need have a unitary control. Rather, an emotion, as normally identified, may be an emergent structure deriving much of its superficial unity from the evolutionary path that has shaped the various component reactions. That said, the adaptive requirements facing, for example, the autonomic nervous system are sufficiently similar across the different emotions that at the general, as opposed to specific, level they can be seen to have many common features. Critically, neural analysis can determine the similarities and differences in the control of both superficially similar and superficially different reactions. The theories also share a common picture of a variety of emotions being linked to two broad classes of general behavioral tendency: approach and avoidance. These have their origin, as emphasized by Rolls, in the fundamental properties of taxes—which are defined in terms of their being the result of the simplest stimuli generating, in the simplest way, either approach or avoidance—these ideas follow from Gray’s early articulation of the same basic
principles. Thus, while affective stimuli will define specific goals (and, with the possible exception of Damasio, the theories all see behavior as goal directed), a behavior such as a lever press can result in food, delivery of a mate, safety from shock, or a variety of other specific results—but in all cases (including relief from nonpunishment) it is reinforced in the same basic way, by the release of dopamine. The control of distal behavior, then, depends on systems fundamentally devoted to approach, in general, and avoidance, in general. To these basic systems, Gray and McNaughton add an additional system that resolves conflict between the approach and avoidance systems—but their view of the basic approach and avoidance systems is essentially similar to that of Rolls and their view of the basic control of avoidance is much the same as that of LeDoux.
FUTURE DIRECTIONS
So far, it might be thought that our analysis has not produced much of an advance, from a behaviorist perspective, beyond confirming the unsurprising conclusions that different stimuli elicit different proximal behaviors, and that behavioral plasticity can be understood in terms of positive and negative reinforcement. However, there are a number of points where neural analysis provides specific departures from any simple form of these conclusions and where it leads, in extreme cases, to unexpected conclusions.
Beyond the Basics—The Potential for Unexpected Conclusions
Perhaps the most important conclusion that neural analysis allows is that what is paradigmatically conditioning does not necessarily require explicit reinforcement. As we noted, sensory preconditioning and Pavlovian fear conditioning both involve the same basic form of stimulus-stimulus association in which simple long-term potentiation is all that is required for the strengthening of connections. The specific site of this potentiation, for fear conditioning, has been identified as the input from the thalamus (which encodes the conditional stimulus) to the amygdala (which generates the unconditioned, and then conditioned, responses). We have also argued that this purely associative, nonreinforced, type of learning underlies what often appears to be instrumental learning in cases, such as a pigeon pressing a lit key, where the behavior is autoshaping in disguise—although it has not yet been proved that this involves long-term potentiation. Following on from this conclusion is the fact that true reinforcement in the classic sense intended by Pavlov, while strengthening neural connections, need not reinforce
a previously occurring response. This provides a simple explanation of the fact that, for example, the conditioned response to a stimulus that predicts shock (usually freezing) is not simply the unconditioned response to the shock (vocalization, movement) moved forward in time. Indeed, while there will not be a perfect match between dependence on association rather than reinforcement and the occurrence of stimulus substitution, the neural data suggest that association rather than reinforcement should be suspected whenever the conditioned response (including those that are superficially the result of instrumental conditioning) can be accounted for by stimulus substitution. A related issue, with instrumental reinforcement, is the demonstration that punishers release dopamine. The broad two-dimensional affective model we presented is, admittedly, derived originally from learning theoretic analysis (Gray, 1975). In this analysis, the omission of expected, or termination of, punishment is functionally equivalent to the presentation of rewarding stimuli; and in a symmetrical manner, the omission of expected, or termination of, reward is functionally equivalent to the presentation of punishment. But the demonstration of a link between punishment and dopamine, and of the role of dopamine in controlling instrumental reinforcement (Reynolds et al., 2001), has two important consequences for this model. First, it means we can be sure that, at a mechanistic level, the effects of reward and punishment omission are identical—they both change behavior by releasing dopamine. It is not the case that they happen to coincidentally produce the same superficial effects on behavior through independent mechanisms. Second, we can link both normal reward and normal punishment omission directly to explanations of addiction—where all addictive drugs (and some addictive-like behavior) have been shown to support continued behavior by the release of dopamine (but see also Chapter 40).
We use this fact to provide potential explanations of some behaviors that might not be expected from the perspective of a simple reinforcement theory. A final point we need to consider before moving on to some specific scenarios is the nature of the interaction between reward and punishment—where we again need to take into account the tendency of evolution to select multiple ROT rather than producing integrated control systems. In terms of simple decision making, for example, reward and punishment systems suppress each other. However, with respect to arousal, and so sometimes the vigor of production of responses, they can summate (Gray & Smith, 1969). These are quite distinct computations and, in terms of the effect of anxiolytic drugs on approach-avoidance conflict, can be shown to be processed in quite different parts of the brain. The inhibitory effect of punishment on rewarded behavior is mediated via the hippocampus,
while the excitatory effect of punishment on reward-elicited arousal is mediated via the amygdala and not, in either case, vice versa (Gray & McNaughton, 2000). As a result, the addition of negative reinforcement can increase the levels of behavior generated by a positive reinforcer (e.g., in behavioral contrast). More peripheral theories of emotion and motivation would struggle to account for such findings. In the sections that follow, we speculatively consider the possible insight that these features of the reward and punishment systems can offer into some of the more perplexing behaviors shown by human beings. (For a higher level view of apparently irrational behavior patterns, see Chapter 37.)
Relief of Nonpunishment: Gambling
We have already considered the complex mechanisms underlying the partial reinforcement extinction effect—where we argued that the phenomena are generally adaptive in that they conform to optimal foraging analyses. Here we consider cases of pathological gambling where persistence in the face of intermittent reinforcement is, in optimal foraging terms, maladaptive. According to standard behavioral accounts, pathological gambling should not develop very easily and should extinguish fast. That is, engaging in a behavior that provides a high ratio of punishment to reward should lead to avoidance behavior, which of course it does in the majority of the population. However, in a significant minority of people, pathological gambling behavior develops. That is, the behavior entails high monetary losses leading to personal, family, and societal problems. We could attempt to explain this maladaptive behavior using standard learning theory. There is intermittent positive reinforcement, and the ratio and pattern of reward to response are carefully chosen to produce robust conditioning and maximum resistance to extinction. To some extent, this can explain gambling.
But it seems not to be a sufficient explanation of normal gambling, far less its pathological form. First, in animals simply subjected to intermittent schedules, as we noted earlier, the level of behavior conforms approximately to optimality—with over-responding being present only while information about a new reward density is being gathered. Second, quite apart from the local preponderance of negative reinforcement for the behavior, there is usually additional negative reinforcement in terms of the effect on other aspects of life, and this should produce robust avoidance behavior. Third, there is the brute fact that the majority of people who engage in recreational gambling do not develop pathological gambling behavior. These facts suggest that we must look elsewhere for
a sufficient explanation of this form of counterproductive behavior. One alternative theory is to assume that people prone to pathological gambling have biased cognitions (e.g., “The more I lose, the more chance I have of winning”). We may suppose that such biases are important in maintaining pathological gambling, but such explanations are high on description but low on powers of explanation, and specifically fail to reveal why such cognitive biases exist, let alone how they relate to reinforcement sensitivity (which we know is important in gambling behavior). Nor do they explain the intensity of the behavior. A possibility suggested by our current analysis is that pathological gambling develops as a form of self-defeating dopamine-mediated approach behavior. On this view, punishment summates with the expectation of rare, large rewards, to create a high level of arousal. It thus energizes and invigorates behavior. Even if the schedule of reinforcement were net positive for the player (as it can be with games such as “21”) it involves a background of fairly steady punishment, in the form of loss of the stake and reward omission. This means that when a reward occurs its effect is supercharged by the positive effects of relieving nonpunishment. The resultant physiological arousal acts in the same way as a drug, such as amphetamine, to create an emotional high that produces rapid and resistant conditioning (e.g., to the paraphernalia of the gambling context). These emotional “highs,” which are predicted by the higher density of punishments, can become associated with it and so, through counterconditioning, reduce its negative reinforcing value (which is weak at the level of the individual response). The overall picture, as with chemical addiction, is an overriding of background negative stimuli by occasional powerful stimulation of the dopamine system.
As yet these behavioral processes, and the apparently paradoxical fact that punishment in gambling seems to maintain pathological gambling itself, do not make much sense in traditional Skinnerian terms, but they find a natural explanation within the context of the known properties of the dopamine system—and with the low level of genuine “pleasure” in those addicted to drugs.
Reward-Punishment Mutual Inhibition: Romantic Partner Abuse
A similar process to that seen in pathological gambling may also operate in romantic partners who suffer long-term abuse but who are reluctant to escape their abusing partner (i.e., are reluctant to engage in FFFS-mediated avoidance of the threat stimulus). Putting aside other relevant factors involved in such situations (e.g., children and financial dependence), some abused partners (both males as well
as females—here the forms of abuse may differ) repeatedly fail to leave partners who, on the one hand, they openly declare are abusing them but from whom, on the other hand, they find it difficult to break away (even where there is no financial, or other, objective reason for doing so). Partner abuse should be expected to activate the FFFS (as well as the BIS due to the likelihood of conflict) leading to punishment-mediated behaviors (in this case fear, tension, attempts to avoid/escape abuse). When the abusive partner reconciles, the abused partner will not only experience an absence of punishment (itself a good thing in terms of reduced FFFS activity), but also a strong boost to the BAS in the form of release of suppression of the reward system by the punishment (FFFS/BIS) system. As in the case of gambling (see above), relief of nonpunishment processes may also be assumed to operate. This release, and the subsequent rebound effects, would be expected to lead to a heightened BAS activity and an emotional high, which would stamp in, via conditioning, behaviors immediately preceding it, namely the partner’s reconciliation behavior and associated stimuli—Konorski (1967) made a similar claim about the rebound effects in romantic “making-up” behavior. Once again, the FFFS/BIS-induced arousal would serve further to augment the rebound of the BAS, increasing the subjective intensity of the positive emotional high. (Rebound effects are also suggested by anti-anxiety drugs that are traded illegally for the highs they produce in some people.) There is a further theoretical twist that would make an additional contribution to this BAS-mediated emotional high and resulting approach behavior (e.g., making up).
The mutual inhibition of the reward and punishment systems would mean that the previous negative emotion and behavior associated with the punishment system would now itself be suppressed, causing the abused partner, emotionally speaking, to forget (or, at least, attenuate the strength of) the previous punishment delivered by the partner. Thus, we may predict that one of the major factors contributing to the continuation of abusive relationships is that the abused partner has a strong mutual inhibition between their reward and punishment systems, rendering a supercharged BAS input from the abusive partner’s reconciliation behavior. It might be the case that the abusive partner learns how to manipulate the emotions of the abused partner, and this would contribute to the cycle of abuse.
SUMMARY
In our discussion of central states and theories of emotion and motivation, we ranged freely from the exotic, but fairly
well established, theories of the partial reinforcement extinction effect (PREE) to the prosaic, but not clearly understood, behavior of pathological gambling and romantic partner relations. We attempted to show that neural analysis can, and has, generated quite distinct theories that not only have the advantage of being tied to neural and pharmacological reality (and so are less subject to the whims of verbal definition) but also have the advantage of throwing into strong relief some of the less obvious properties of emotional and motivational systems. These properties derive from the fact that emotion and motivation involve multiple serial and parallel ROT, each of which has evolved separately but nonetheless regularly co-occurs with and is often seamlessly integrated with others. The existence of multiple ROT itself creates an environment in which higher order control mechanisms can evolve. The addition of later, complex, ROT to sets of simpler ones has also tended to produce hierarchical structures with the quickest, dirtiest, and phylogenetically earliest mechanisms located at lower levels of the neuraxis and progressively slower and more sophisticated mechanisms located at progressively higher levels. We considered a number of current central theories of emotion and motivation. These differ in detail and even in their use of terms. But they can all be seen as sharing a fundamentally Hebbian (purely associative, as opposed to reinforced) view of basic memory processes; a picture of two fundamental reinforcement systems—with dopaminergic systems reinforcing specific responses whether these produce reward or relieving nonpunishment; a distinction between ventral (“what”) and dorsal (“where”) processing streams; a view that behavior results from neural processing of goals (stimulus/response or, better, occasion/affordance compounds); and a view of prefrontal cortex as holding potential or intended goals in mind (i.e., in “working” or “active” memory).
The take-home message is that emotion and motivation are intertwined and each is multifaceted. This is often blindingly obvious at the neural level—but still goes against the grain of our normal use of emotional terms. As we have seen, what is meant by “anxiety” can differ even among neurally driven theorists—making it unclear how far disagreements are about real facts or arbitrary definitions. What is needed, then, is recursive processing of neural and behavioral information. When the resultant “psychological” constructs are also firmly tied down to particular neural instantiations then we will be in a position to say that we truly understand the resultant structure of the behaviors emitted by the organism—and will be on the way to understanding our own minds from an objective standpoint.
REFERENCES
American Psychiatric Association. (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: Author.
Amsel, A. (1992). Frustration theory: An analysis of dispositional learning and memory. Cambridge: Cambridge University Press.
Bandler, R., Keay, K. A., Floyd, N., & Price, J. (2000). Central circuits mediating patterned autonomic activity during active vs. passive emotional coping. Brain Research Bulletin, 53(1), 95–104.
Bandler, R., Price, J. L., & Keay, K. A. (2000). Brain mediation of active and passive emotional coping. Progress in Brain Research, 122, 331–347.
Bindra, D. (1969). A unified interpretation of emotion and motivation. Annals of the New York Academy of Sciences, 159, 1071–1083.
Blanchard, D. C., & Blanchard, R. J. (1988). Ethoexperimental approaches to the biology of emotion. Annual Review of Psychology, 39, 43–68.
Blanchard, D. C., & Blanchard, R. J. (1990). Effects of ethanol, benzodiazepines and serotonin compounds on ethopharmacological models of anxiety. In N. McNaughton & G. Andrews (Eds.), Anxiety (pp. 188–199). Dunedin: Otago University Press.
Blanchard, D. C., Blanchard, R. J., Tom, P., & Rodgers, R. J. (1990). Diazepam changes risk assessment in an anxiety/defense test battery. Psychopharmacology (Berl), 101, 511–518.
Blanchard, R. J., & Blanchard, D. C. (1989). Antipredator defensive behaviors in a visible burrow system. Journal of Comparative Psychology, 103(1), 70–82.
Blanchard, R. J., & Blanchard, D. C. (1990a). Anti-predator defense as models of animal fear and anxiety. In P. F. Brain, S. Parmigiani, R. J. Blanchard, & D. Mainardi (Eds.), Fear and defence (pp. 89–108). Chur: Harwood.
Blanchard, R. J., & Blanchard, D. C. (1990b). An ethoexperimental analysis of defense, fear and anxiety. In N. McNaughton & G. Andrews (Eds.), Anxiety (pp. 124–133). Dunedin: Otago University Press.
Blanchard, R. J., Griebel, G., Henrie, J. A., & Blanchard, D. C. (1997). Differentiation of anxiolytic and panicolytic drugs by effects on rat and mouse defense test batteries. Neuroscience and Biobehavioral Reviews, 21(6), 783–789.
Bliss, T. V. P., Gardner-Medwin, A. R., & Lomo, T. (1973). Synaptic plasticity in the hippocampus. In G. B. Ansell & P. B. Bradley (Eds.), Macromolecules and behaviour (pp. 193–203). London: Macmillan.
Bliss, T. V. P., & Lomo, T. (1973). Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetised rabbit following stimulation of the perforant path. Journal of Physiology (Lond), 232, 331–356.
Borkovec, T. D., Robinson, E., Pruzinsky, T., & DePree, J. A. (1983). Preliminary exploration of worry: Some characteristics and processes. Behaviour Research and Therapy, 21(1), 9–16.
Brady, J. V. (1975a). Conditioning and emotion. In L. Levi (Ed.), Emotions: Their parameters and measurement (pp. 309–340). New York: Raven Press.
Brady, J. V. (1975b). Toward a behavioural biology of emotion. In L. Levi (Ed.), Emotions: Their parameters and measurement (pp. 17–46). New York: Raven Press.
Capaldi, E. J. (1967). A sequential hypothesis of instrumental learning. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (pp. 67–156). New York: Academic Press.
Carlson, N. R. (1980). Physiology of behavior. Boston: Allyn & Bacon.
Damasio, A. R. (1995). On some functions of the human prefrontal cortex. Annals of the New York Academy of Sciences, 769, 241–251.
Damasio, A. R. (1996). The somatic marker hypothesis and the possible functions of the prefrontal cortex. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 351, 1413–1420.
Dickinson, A. (1980). Contemporary animal learning theory. Cambridge: Cambridge University Press.
Dunn, B. D., Dalgleish, T., & Lawrence, A. D. (2006). The somatic marker hypothesis: A critical evaluation. Neuroscience and Biobehavioral Reviews, 30(2), 239–271.
Fanselow, M. S., & LeDoux, J. E. (1999). Why we think plasticity underlying pavlovian fear conditioning occurs in the basolateral amygdala. Neuron, 23(2), 229–232.
Feldon, J., Guillamon, A., Gray, J. A., De Wit, H., & McNaughton, N. (1979). Sodium amylobarbitone and responses to nonreward. Quarterly Journal of Experimental Psychology, 31, 19–50.
Gray, J. A. (1975). Elements of a two-process theory of learning. London: Academic Press.
Gray, J. A. (1977). Drug effects on fear and frustration: Possible limbic site of action of minor tranquilizers. In L. L. Iversen, S. D. Iversen, & S. H. Snyder (Eds.), Handbook of psychopharmacology, drugs, neurotransmitters and behaviour (Vol. 8, pp. 433–529). New York: Plenum Press.
Gray, J. A. (1982). The neuropsychology of anxiety: An enquiry into the functions of the septo-hippocampal system. Oxford: Oxford University Press.
Gray, J. A. (1987). The psychology of fear and stress. London: Cambridge University Press.
Gray, J. A., & McNaughton, N. (2000). The neuropsychology of anxiety: An enquiry into the functions of the septo-hippocampal system. Oxford: Oxford University Press.
Gray, J. A., & Smith, P. T. (1969). An arousal-decision model for partial reinforcement and discrimination learning. In R. Gilbert & N. S. Sutherland (Eds.), Animal discrimination learning (pp. 243–272). London: Academic Press.
Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory. New York: Wiley-Interscience.
Hinde, R. A. (1998). Animal behaviour. New York: McGraw-Hill.
Hofer, M. A. (1972). Physiological and behavioural processes in early maternal deprivation. In R. Porter & J. Knight (Eds.), Physiology, emotion and psychosomatic illness (pp. 175–186). Ciba symposium No. 8 (new series). Elsevier.
Hull, C. L. (1952). A behavior system. Yale University Press.
James, W. (1884). What is an emotion? Mind, 9, 188–205.
Jenkins, H. M., & Moore, B. R. (1973). The form of the auto-shaped response with food or water reinforcers. Journal of the Experimental Analysis of Behavior, 20, 163–181.
Kandel, E. R., & Hawkins, R. D. (1992). The biological basis of learning and individuality. Scientific American, 267, 79–86.
Kelly, J. F., & Hake, D. F. (1970). An extinction-induced increase in an aggressive response with humans. Journal of the Experimental Analysis of Behavior, 14, 153–164.
Konorski, J. (1967). Integrative activity of the brain: An interdisciplinary approach. Chicago: University of Chicago Press.
Krebs, J. R., Stephens, D. W., & Sutherland, W. J. (1983). Perspectives in optimal foraging. In G. A. Clark & A. H. Brush (Eds.), Perspectives in ornithology (pp. 165–221). Cambridge: Cambridge University Press.
LeDoux, J. E. (1993). Emotional memory systems in the brain. Behavioural Brain Research, 58, 69–79.
LeDoux, J. E. (1994). Emotion, memory and the brain. Scientific American, 270, 50–59.
LeDoux, J. E. (2002). Synaptic self. Harmondsworth: Viking Penguin.
McNamara, J., & Houston, A. (1980). The application of statistical decision theory to animal behaviour. Journal of Theoretical Biology, 85, 673–690.
McNaughton, N. (1985). Chlordiazepoxide and successive discrimination: Different effects on acquisition and performance. Pharmacology, Biochemistry and Behavior, 23, 487–494.
McNaughton, N. (1989a). Anxiety: One label for many processes. New Zealand Journal of Psychology, 18, 51–59.
McNaughton, N. (1989b). Biology and emotion. Cambridge: Cambridge University Press.
McNaughton, N., & Corr, P. J. (2004). A two-dimensional neuropsychology of defense: Fear/anxiety and defensive distance. Neuroscience and Biobehavioral Reviews, 28, 285–305.
McNaughton, N., & Gray, J. A. (1983). Pavlovian counterconditioning is unchanged by chlordiazepoxide or by septal lesions. Quarterly Journal of Experimental Psychology, 35B, 221–233.
McNaughton, N., Kocsis, B., & Hajós, M. (2007). Elicited hippocampal theta rhythm: A screen for anxiolytic and pro-cognitive drugs through changes in hippocampal function? Behavioural Pharmacology, 18, 329–346.
Meyer, T. J., Miller, M. L., Metzger, R. L., & Borkovec, T. D. (1999). Development and validation of the Penn State Worry Questionnaire. Behaviour Research and Therapy, 28, 487–495.
Millenson, J. R., & Leslie, J. C. (1979). Principles of behavior analysis. New York: MacMillan.
Panksepp, J. (1998). Affective neuroscience: The foundations of human and animal emotions. New York: Oxford University Press.
Pinel, J. P. J. (1997). Biopsychology. Boston: Allyn & Bacon.
Renfrew, J. W., & Hutchinson, R. R. (1983). The motivation of aggression. In E. Satinoff & P. Teitelbaum (Eds.), Handbook of behavioural neurobiology (Vol. 6, pp. 511–541). Plenum Press.
Reynolds, J. N. J., Hyland, B. I., & Wickens, J. R. (2001, August 23). A cellular mechanism of reward-related learning. Nature, 413(6851), 67–70.
Rolls, E. T. (1990). A theory of emotion, and its application to understanding the neural basis of emotion. Cognition and Emotion, 4, 161–190.
Rolls, E. T. (2000). On the brain and emotion. Behavioral and Brain Sciences, 23(2), 219–228.
Sartory, G., MacDonald, R., & Gray, J. A. (1990). Effects of diazepam on approach, self-reported fear and psychophysiological responses in snake phobics. Behaviour Research and Therapy, 28(4), 273–282.
Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.
Smits, D. J. M., & Boeck, P. D. (2006). From BIS/BAS to the big five. European Journal of Personality, 20, 255–270.
Sokolowski, J. D., McCullough, L. D., & Salamone, J. D. (1994). Effects of dopamine depletions in the medial prefrontal cortex on active avoidance and escape in the rat. Brain Research, 651, 293–299.
Stark, H., Bischof, A., & Scheich, H. (1999). Increase of extracellular dopamine in prefrontal cortex of gerbils during acquisition of the avoidance strategy in the shuttle-box. Neuroscience Letters, 264(1–3), 77–80.
Sutherland, N. S. (1966). Partial reinforcement and the breadth of learning. Journal of Experimental Psychology, 18, 289–301.
Weingarten, H. P. (1983). Conditioned cues elicit feeding in sated rats: A role for learning in meal initiation. Science, 220(4595), 431–433.
Ziff, D. R., & Capaldi, E. J. (1971). Amytal and the small trial partial reinforcement effect: Stimulus properties of early trial nonrewards. Journal of Experimental Child Psychology, 87, 263–269.
Chapter 37
The Affective Neuroscience of Emotion: Automatic Activation, Interoception, and Emotion Regulation
ANDREAS OLSSON AND ARNE ÖHMAN
Emotions sometimes appear mysterious. They can appear unmistakably clear, yet at the same time elusive. In our daily lives we often define the basic goals of human striving in terms of emotion: We yearn for happiness and do our best to avoid misery. But making the distinction between positive and negative emotions is not as simple as saying that we seek the former and shun the latter. Peace Corps workers, parachute jumpers, and snake handlers might seek situations that most of us fear. Likewise, we may indulge in behaviors such as passionate love, overeating, drinking too much, and substance abuse even when they bring misery. At times we may simultaneously experience two conflicting emotions about another person, pulling us in opposite directions, and after a separation we may find that our current emotions are not as abyssal as we forecast them to be some time ago. In many of the most common psychological disorders (depression, anxiety disorders, and phobias), individuals find that their emotional lives defy reason and rational thought. Still, for most of us, life without emotion would not be worth living.
Conflicts are not only abundant in our everyday experience of emotions. The landscape of scientific theories aspiring to describe and explain emotion is also riddled with conflicts. However, these often opposing theories of emotion have over the past century inspired various research paradigms in the behavioral sciences and neurosciences that have contributed importantly to what we know about emotions today. In this chapter we revisit a selection of the most influential approaches to emotion to discuss findings that are enlightening in terms of the link between behavior and its neural bases. A greater appreciation of the link between emotional behavior and its biological bases is critical for a more complete understanding of emotion and its conflicting nature.
The authors are also affiliated with the NIMH Center for the Study of Emotion and Attention at the University of Florida–Gainesville and the NOS-HS Center of Excellence on Cognitive Control. Arne Öhman is the recipient of a grant for Long-Term Support of Leading Investigators from the Swedish Science Research Council. Address for both authors: Section of Psychology, Department of Clinical Neuroscience, Karolinska University Hospital, Solna.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
UNDERSTANDING EMOTION THROUGH THE BRAIN–BEHAVIOR LINK
The Neuroscience of Emotion
Psychological research on emotion is basically a multivariate enterprise seeking to develop theories that relate emotion-provoking circumstances to verbal reports of emotion, psychophysiological data, and behavioral responses. An important contribution of this research is that it has made sophisticated methodologies available for manipulating and measuring emotion in the psychological laboratory (see Coan & Allen, 2007). Neuroscience offers unique prospects for a deeper understanding of emotion by uncovering the neural underpinnings of the relationships observed in the psychological laboratory.
Imaging Emotion
Animal research over the past century laid the foundation for understanding some of the basic neural mechanisms of emotion. However, the recent development of techniques to image the healthy human brain in vivo has provided an unprecedented opportunity to directly study the neural elements of emotional responses and experiences. Through imaging, the workings of the emotional brain can now be observed independently of people’s introspective accounts of their emotional states. Together with our already quite sophisticated knowledge of the peripheral psychophysiology of emotional responses, descriptions of neural processes are helping us to understand the specific mechanisms
underlying emotions. In turn, mechanistic accounts provide a biological grounding that can be used to constrain alternative models and theories of emotion. Moreover, knowledge about which psychological processes involve similar and different neurobiological processes allows us to use neurobiology to carve the nature of emotion at its joints. For example, brain imaging research has confirmed that negative attitudes toward members of a racial group other than one’s own have an immediate, nonconscious emotional component, which is modulated by prefrontal influences (Cunningham, Johnson, Gatenby, Gore, & Banaji, 2004; Phelps et al., 2000).
The rapidly growing body of knowledge about the workings of the human brain has made it possible to compare and integrate this knowledge with what we know about the brain–behavior link in other animals. Comparisons across species make it possible to use experimental paradigms in nonhuman animals, which would not be ethically feasible in humans, to learn more about human emotions. As a result, evolutionary theory has gained increasing influence as a unifying theory in psychology, solidifying its place within the realm of biological sciences. This development has facilitated scientific accounts of emotional behavior comprising both its proximate mechanisms and evolutionary functions (Damasio, 1994; LeDoux, 1996; Öhman, 1986; Rolls, 1999).
Limitations of Imaging
It should be remembered that neuroimaging has inherent limitations because of its correlational nature. For example, it can tell us which brain areas are involved in an emotional response, but it remains silent on what neural processes are necessary for a specific response, which is a critical step in inferring causality. To learn about the causal links between brain functions and behavior we need observations of the effects on behavior when the functioning of well-defined neural circuitry is impaired. To this end, systematic studies of patients with localized brain damage have proven invaluable. Lately a new technique, transcranial magnetic stimulation, has gained popularity. This technique has been developed to induce temporary and completely reversible brain lesions in healthy volunteers through the application of strong magnetic fields to specific cortical areas. Although there are several limitations to this technique, such as its being applicable only to cortical regions, it has been successfully used to examine the causal link between brain and behavior in humans. Nevertheless, to overcome the limitations associated with experimentation in humans, the study of other animals will remain imperative in our quest for a full understanding of emotion.
Implicit and Explicit Measures of Emotion
As stated earlier, neural activation, whether central or peripheral, can be assessed independently of the individual’s verbal accounts of emotional experiences. Measures of emotional responses that are nonverbal in nature are often referred to as “implicit,” in contrast to “explicit,” verbal accounts. Whereas implicit measures assess responses of which the individual often lacks introspective awareness, explicit measures tap responses dependent on linguistic processes, such as reflective reasoning. Because of their relative insensitivity to demand characteristics, implicit responses are critical in cross-species comparisons of emotion. Particularly informative are implicit responses that can be dissociated from the emoter’s explicit reports. There are good reasons for assuming that these responses provide a window into the workings of phylogenetically older mechanisms, drawing on brain systems that are partially independent from more recently evolved systems that are responsible for linguistic processing (LeDoux, 1996; Öhman, 1986; Öhman & Wiens, 2003). Although our current understanding of implicit emotional responses relies on recent technological advancements, the interest in different, separable nuances of emotional responses is ancient.
TWO EMOTIONAL MOVEMENTS
More than 2,000 years ago Greek philosophers belonging to the Stoic tradition proposed an interesting separation between what they called the first and second “movements” of an emotion (Oatley, 2004). This would turn out to be prophetic about an important distinction in modern emotion research: automatic versus controlled emotional processing, or primary versus secondary appraisal. The first movement is rapid and reflexive, such as when we freeze when we are confronted with a snake or a fearful other whose fear expression might be informative about an imminent danger (Figure 37.1). The second movement is what we make of this instinctive response and how we appraise the current situation: Is the snake poisonous? Why is the other expressing fear? How does this fit into the surrounding context (Figure 37.1)? What are his or her intentions? This evaluative or reflective process depends mainly on voluntary mental activity.
Reflecting about the properties of a stimulus is likely to change our emotional experience without our intending to do so. However, reflection can also be intentionally used to change our emotional responses. For example, you might want to curb the empathic response you feel for a terrified child who is about to receive an injection (Figure 37.1), especially if you are the doctor who has to administer the injection with acute precision. An emotional response can be down-regulated in various ways, such as by shifting one’s attention to something less evocative or by reinterpreting the meaning of the situation. For the purpose of this chapter, and to maintain the Stoic terminology, we have tentatively called this intentional regulation the “third movement.”
[Figure 37.1. Schematic of the temporal flow of emotion: attended external stimuli pass through three stages, with feedback from the later stages to the earlier ones.
(A) 1st movement. Emotional response: orienting. Response mode: reflexive. Behavior: freezing response, information seeking. Physiology: SCR, heart rate. Selected functional neuroanatomy: amygdala, visual cortex, hypothalamus, brain stem nuclei.
(B) 2nd movement. Emotional response: appraisal (e.g., empathic pain). Response mode: reflective. Behavior: widened information seeking. Physiology: SCR, heart rate. Selected functional neuroanatomy: ACC, AI, hippocampus, MPFC, PFC.
(C) 3rd movement. Emotional response: emotion regulation (e.g., down-regulation). Response mode: reflective. Behavior: confirmatory information seeking. Physiology: SCR, heart rate. Selected functional neuroanatomy: MPFC and LPFC, amygdala.]
Figure 37.1 The temporal flow of emotion. Note. This figure illustrates how an emotional situation (a fearful child anticipating an injection) gives rise to the unfolding of a series of emotional responses in a perceiver. The temporal flow of these responses and a selection of the associated functional neuroanatomy are divided into three hypothetical stages, approximating the first (A), second (B), and third (C) emotional movements. (A) Initially, the fearful face of the child is encountered, giving rise to a reflexive orienting response, which mobilizes the perceiver to search the environment for potential threats that could have caused the child’s fear expression. The attentional focus on the fearful face produces heart rate deceleration and an increased arousal response (SCR). These primary responses are predominantly mediated by a subcortical neural circuitry centered on the amygdala, which affects behavior through hypothalamic and brain stem nuclei. (B) As additional information becomes available to the perceiver, the situation is appraised in terms of its context and memories of similar situations. For example, facilitated by an automatically triggered mirroring of the child’s emotional responses, the reflective attribution of pain might produce an empathic response in the perceiver, which leads to an increase in physiological arousal. Apart from the subcortical circuitry described in (A), this process is likely to draw both on regions involved in the retrieval of episodic memories, such as the hippocampus, and on sensory and frontal regions responsible for the empathic experience of and reflection about the child’s emotional state. (C) The empathic response can be intentionally down-regulated through reappraisal of the situation. For example, reminding oneself of the long-term benefits to the child from receiving a life-saving vaccination or attending to his or her soothing parent might mitigate the perceiver’s emotional response. This regulatory process can be facilitated by selectively attending to information that confirms one’s reappraisal of the situation. Emotion regulation has been shown to involve prefrontal influences on the subcortical circuitry described in (A). As illustrated by the arrows at the bottom of the figure, the reflective appraisals of the situation described in (B) and (C) are likely to affect the initial response to new stimuli through top-down modulation of primary appraisals, supported by bidirectional neural connections. ACC = Anterior cingulate cortex; AI = Anterior insula; LPFC = Lateral prefrontal cortex; MPFC = Medial prefrontal cortex; PFC = Prefrontal cortex; SCR = Skin conductance response.
By making the second movement the essence of emotion, the Stoics changed the meaning of emotion from an automatic and involuntary response to something individuals could consciously control and take responsibility for. The distinction between the first and second movements is reminiscent of today’s dual-process accounts of emotion, which come in many guises. Common to all of them is the suggestion that two distinct kinds of processes, resembling the two movements, are responsible for different aspects of emotional responses. Recent neuroscientific research has substantiated dual processing by describing two general types of processes, engaging partially separate but, under normal circumstances, interacting neural systems. The first system comprises reflexive, automatic responses that are implicitly expressed; the second comprises more reflective processes that can be affected by voluntary control.
The First Movement of Emotion
Automatic Appraisal
In 1980 Robert Zajonc and colleagues proposed a conceptualization of the first movement of emotion in terms of an automatic affective response that was primary to, and independent of, cognitive responses to the stimulus (Kunst-Wilson & Zajonc, 1980). Succinctly summarized, Zajonc claimed that we are sure of what we like, even though we may not know why we like it, or even what it is that we like. This proposal was backed up by an extensive series of studies examining the liking of stimuli as a function of repeated exposure (see Zajonc, 2004, for a concise summary). For example, Kunst-Wilson and Zajonc presented Chinese ideograms at durations as short as a few milliseconds to make them unrecognizable. Even though the data confirmed their unrecognizability, the participants liked previously shown (but unseen) ideograms more than new ones.
The Neural Bases of Automatic Appraisal
LeDoux (1996) suggested that the automatic affective response described by Zajonc (1980) was related to the very fast brain activation he and his colleagues had demonstrated through a direct route between sensory thalamic nuclei and the amygdala, a conglomerate of subnuclei bilaterally located in the medial temporal lobes, which has long been considered a central substrate of emotion (Klüver & Bucy, 1939; Weiskrantz, 1956). It has been argued that this neural system evolved to cope with rapidly emerging dangers, quickly relaying crude representations of potentially threatening stimuli to the amygdala, bypassing the cortex (Adolphs, 2001; de Gelder, Snyder, Greve, Gerard, & Hadjikhani, 2004; LeDoux, 1996, 2000; Morris, Öhman, & Dolan, 1999; Öhman & Mineka, 2001; Whalen et al., 1998).
In an extensive research program, LeDoux (1996, 2000) and others (see Davis & Whalen, 2001) have demonstrated that the amygdala is the hub of this automatic and phylogenetically ancient neural circuitry centered on subcortical regions, controlling fear responses to threatening stimuli. The lateral nucleus of the amygdala receives input from the external world via the direct thalamic route and via the classical sensory pathways through sensory thalamic nuclei and the primary and secondary sensory cortices. This information is forwarded to the central nucleus, which has efferent connections to areas involved in emotional expression, such as the hypothalamus (sympathetic branch of the autonomic nervous system), and midbrain and brain stem nuclei related to fear behavior (periaqueductal gray), defensive reflexes (such as startle), and facial expression (facial nucleus).
Although involved in many emotionally relevant processes, the primary function of this network is to quickly enhance attention to evaluate a potential threat by means of ramping up activation in a distributed network of functional brain systems (Davis & Whalen, 2001). It is also critical to the acquisition, retention, and expression of basic emotional forms of learning (Phelps & LeDoux, 2005) and certain social functions (Adolphs, 2001; Adolphs et al., 2005). Consistent with the role of this neural system in assembling perceptual and attentional resources and preparing for action, the processing of emotionally significant stimuli that is known to involve the amygdala has been reported to influence early visual and attentional processing (Anderson & Phelps, 2001; Morris, Friston, et al., 1998; Phelps, Ling, & Carrasco, 2006; Vuilleumier, Richardson, Armony, Driver, & Dolan, 2004; see also Volume II, Section V, Chapter 11 of this handbook [D'Stanley, Ferneyhough & Phelps]) and action representations (de Gelder et al., 2004).
However, the precise mechanisms underlying the amygdala’s influence on a widely distributed network of cortical areas have not yet been specified. These issues are currently attracting much research and debate (McGaugh, 2004; Phelps, 2006).
Data Supporting the Concept of Automatic Appraisal
Supporting the concept of a first automatic stage in emotion activation are both behavioral and neuroimaging indicators of immediate nonconscious emotional responses to unseen images presented very briefly and effectively masked from conscious perception by an immediate masking picture. For example, participants fearful of snakes and spiders responded with enhanced skin conductance responses (SCRs) to pictorial representations of the feared animals even when pictures of the animals were unseen (masked; Öhman & Soares, 1994). These behavioral data have been supported by brain imaging studies showing amygdala activations to masked snakes and spiders in fearful individuals (Carlsson et al., 2004; Figure 37.2), to masked fearful faces (Whalen et al., 1998), and to aversively conditioned masked angry faces (Morris, Öhman, & Dolan, 1998). Furthermore, two studies have reported reliable amygdala activation to suppressed pictures of fearful faces in a binocular rivalry paradigm (Pasley, Mayes, & Schultz, 2004; Williams, Morris, McGlone, Abbott, & Mattingley, 2004), thus providing converging evidence that the amygdala can be activated by “unseen” visual stimuli.
[Figure 37.2. Four activation maps: (A) P-unseen vs. N-unseen; (B) F-unseen vs. N-unseen; (C) P-seen vs. N-seen; (D) P-seen vs. F-seen.]
Figure 37.2 Amygdala activations to masked and nonmasked pictures of snakes and spiders in individuals fearful of snakes or spiders. Note. Upper panels (A–B) show coronal views (y = −4) of activation maps for unseen (masked) phobic stimuli (e.g., picture of a snake) contrasted with an unseen neutral stimulus (a picture of a mushroom) (P-unseen versus N-unseen) and a fear-relevant but nonfeared (e.g., a spider) stimulus (F-unseen versus N-unseen). Note the left-lateralized amygdala activations in both conditions. Lower panels (C–D) show contrasts between seen (nonmasked) phobic and neutral stimuli (P-seen versus N-seen, y = −8) and between phobic and fear-relevant but nonfeared stimuli (P-seen versus F-seen, y = −4). Note the bilateral amygdala activations seen in both contrasts. From “Fear and the Amygdala: Manipulation of Awareness Generates Differential Cerebral Responses to Phobic and Fear-Relevant (but Nonfeared) Stimuli,” by Carlsson et al., 2004, Emotion, 4, p. 345. Reprinted with permission.
There is also suggestive evidence that visual information may take a direct route to the amygdala, bypassing the cortex. Morris, Öhman, & Dolan (1999) examined the neural connectivity between the amygdala and other brain regions when the amygdala was activated by masked stimuli. They reported that such activation could be reliably predicted from activation of subcortical way stations in the visual systems, such as the superior colliculus and the right pulvinar nucleus of the thalamus, but not from any cortical regions. The superior colliculus and the pulvinar are both involved in attentional processes, the superior colliculus
in eye movement control and shifts of attention and the pulvinar in monitoring attentional salience. Liddell et al. (2005) examined the effect of masked fearful versus masked neutral faces on anatomically defined regions of interest. Confirming the connectivity data reported by Morris, Öhman, et al. (1999), they found reliable activation to masked fearful faces in the left superior colliculi, the left pulvinar, and the bilateral amygdala. In addition they found activation in the locus coeruleus and the anterior cingulate cortex.
The visual resolution that can be achieved by this route is likely to be quite restricted. Vuilleumier, Armony, Driver, and Dolan (2003) suggested that it operates primarily on gross, low-frequency information. Accordingly, they filtered the spatial frequency of pictures of faces to produce facial stimuli that retained only high- or low-frequency spatial information. Their results showed that amygdala responses were larger for low-frequency faces provided that they showed expressions of fear. Moreover, these researchers demonstrated activation of the pulvinar and superior colliculus by low-frequency but not high-frequency fearful faces. Thus these results suggest that there is a distinct superior colliculus-pulvinar pathway to the amygdala that operates primarily on low-frequency information.
Appraising Reward
Whereas the amygdala has been primarily implicated in the processing of aversive information (although evidence for its involvement in appetitive processing is accumulating; see, e.g., Murray, 2007; Sabatinelli, Bradley, Fitzsimmons, & Lang, 2005), the striatum has been singled out as critical for the automatic processing of appetitive stimuli and the generation of implicit positive emotional responses.
Research has shown that the striatum can be engaged in a variety of different, but often related, tasks, such as the processing of reward stimuli (Cromwell & Schultz, 2003; Knutson, Adams, Fong, & Hommer, 2001), error-driven learning (O’Doherty et al., 2004; Schultz, 2002), novelty seeking (Bevins et al., 2002), and the initiation of instrumental behaviors (Rolls, 1999). With regard to appetitive emotional stimuli, Berridge and Winkielman (2003) showed that perceptually masked pictures of happy faces facilitated fluid consumption in thirsty participants who remained completely unaware of the emotional faces and any relationship between them and the drinking task. Angry faces had the opposite effect. Pessiglione et al. (2007) demonstrated that the striatum was critical in mediating motivation to perform in a task reinforced by monetary incentives. They provided masked or nonmasked information to their participants about how much money they could earn by exerting maximal handgrip force. Even when pictures of expected monetary
gain (a pound or a penny) were blocked from awareness by masking stimuli, activation in the ventral striatum was larger and the force exerted higher when the potential gain was large rather than small. Thus, whereas the amygdala can be nonconsciously activated to mediate fear, the striatum can be nonconsciously activated to mediate reward and positive emotion. There is converging evidence, therefore, that rapid, automatic processes operate both in aversive and appetitive circumstances to result in immediate and unconscious emotional activation.
At this point a caveat is in order. Although we have focused on two core functional regions (the amygdala and the striatum), both aversive and reward processes normally draw on a widely distributed network of neural systems. This is discussed in greater detail later.
Automatic Appraisal in Social Contexts
Designed by Evolution
Evolutionary scientists commonly assume that the pressure of complex social organization catalyzed the rapid enlargement of the human brain during the past million years. Robinson Crusoe, as Nicholas Humphrey once remarked, illustrates this model of human evolution: The real challenge for Crusoe was not to survive alone on the island but to survive the relationship with his man, Friday (Humphrey, 1983). Compared to the physical environment, our social milieu is more complex, less predictable, and, importantly, more responsive to our behavior. These characteristics, especially the benefits associated with social reciprocity and the dangers linked to interpersonal aggression, are thought to have driven the evolution of emotional responses to social cues.
In light of this, it is not surprising that emotional and social tasks to a large extent recruit overlapping regions of the brain, engaging both subcortical and more posterior regions of the brain that traditionally have been ascribed a role in more reflexive emotional functions, as well as more anterior-medial cortices supporting reflective emotional functions (Ochsner et al., 2004a; Olsson & Ochsner, 2008). Indeed, there are both ontogenetic and phylogenetic reasons why emotional and social processes should be intimately intertwined at all levels of explanation: the behavioral, the experiential, and the neural. Next we consider emotions in the social domain to illustrate the unfolding of the two movements and the dynamic interplay between them.
Automatic Appraisal of Social Threat
Our social environment is the source of our happiest moments, but also our greatest fears, and the emotional value of social stimuli can change within fractions of a second. Indeed, the swift alternation of facial expressions in a
conspecific, signaling a shift from benevolent to adversarial intentions, illustrates rapidly occurring changes that may have fatal implications for the individual if not adaptively responded to immediately. Under these circumstances it is critical to make a rapid evaluation of emotionally relevant cues around us, as well as of the current social context (Figure 37.1). As discussed earlier, the amygdala has been shown to play a role in the automatic appraisal of biologically relevant cues, such as fearful and angry faces (Morris et al., 1998). Adding importantly to our knowledge about the role of the amygdala in social emotions, a study on brain-damaged patients suggested that the amygdala not only is sensitive to fear and other expressions, but may also contribute to the generation of actions favoring the detection of fear (Adolphs et al., 2005). This study reported that the inability to recognize fear expressions in a patient with bilateral amygdala lesions was due to the patient’s inability to make spontaneous use of information in the eye region, a region that is particularly diagnostic for fear expressions. These results remind us of the importance of complementing imaging work with research on patients with specific brain damage.
Naturally our social environment contains significant cues other than emotional expressions, which can trigger a fast emotional response in the perceiver. Given the significance of others’ sex and group membership, which can signal a potential mate, friend, or foe, it is reasonable that these categories are evaluated quickly. Indeed, research has shown that these categories are coded instantly, giving rise to a reflexive response. For example, there are now numerous demonstrations that unknown racial outgroup members, that is, individuals not belonging to one’s own racial group, can elicit a rapid threat response associated with the amygdala (Cunningham et al., 2004; Phelps et al., 2000).
Pointing to the possibility of a hard-wired disposition to develop xenophobic responses, when associated with something aversive, male outgroup members elicit a stronger implicit fear response that is more persistent than that produced by male ingroup members. Interestingly, this effect has been demonstrated to be independent of explicit attitudes as well as previous exposure to the racial outgroup, with the exception of having outgroup dating experience, which abolishes the effect (Navarrete et al., 2009; Olsson, Ebert, Banaji, & Phelps, 2005).
Automatic Appraisal of Traits
People’s evaluations of each other often involve more complex judgments that are detached from the physical features of the target individual. Research has shown that even thin temporal slices of visual information can result in social evaluations that are quite sophisticated and surprisingly reliable. For example, a recent study shows that viewing times as short as 100 ms allow a person to form an impression about a target person’s likeability, trustworthiness, competence, and aggressiveness that are highly correlated with impressions made in the absence of time constraints (Willis & Todorov, 2006). This resonates well with previous imaging (Winston, Strange, O’Doherty, & Dolan, 2002) and neuropsychological (Adolphs, 2001) studies reporting that the amygdala is involved in automatic evaluation of trustworthiness. That some of these immediate evaluations can have real and important behavioral implications outside the laboratory is evidenced by the fact that impressions of competence formed about a face within 1 second have been shown to predict U.S. congressional elections better than chance (Todorov, Mandisodza, Goren, & Hall, 2005).
Automatic Appraisal of Social Rewards
Because danger in an evolutionary perspective was defined by agile, hungry predators and attacking conspecifics, with potentially deadly consequences, there was a high premium on responding quickly to these kinds of stimuli. Missing out on positive resources such as mates and meals, on the other hand, typically involved a missed opportunity rather than the end of a genetic lineage. In this perspective, it is hardly surprising that “negative information weighs more heavily on the brain” (Ito, Larsen, Smith, & Cacioppo, 1998, p. 887). Naturally, social stimuli can also trigger rapid reward-related emotional responses, although the heavy research bias toward studying aversive responses might suggest otherwise. For example, attractive faces are instantly responded to even when the perceiver’s attentional resources are consumed by some other explicit task. These emotional responses are tracked by the striatum (Childress et al., 2008; Cloutier, Heatherton, Whalen, & Kelley, 2008). Not only facial displays are instantly registered. Other physical traits signaling fertility and biological aptness, such as symmetry and body proportions, are also registered by the striatum even when the perceiver is lacking awareness of these cues (Cornwell et al., 2004; Schützwohl, 2006). Research has shown that unseen social rewards (erotica) can also affect our behavior (Jiang, Costello, Fang, Huang, & He, 2006).
Mirroring Emotions
To predict the potential threat or reward value of another individual, information about his or her emotional state and intentions is critical. To this end people might benefit from an experiential sharing of the other’s emotional state. Apart from enhancing the predictability of others’ behavior, this might also facilitate our social interaction.
Supporting these assumptions, it has been shown that humans spontaneously imitate emotional facial expressions. Experimenters using masking stimuli have even observed unconscious imitation responses (as assessed by electrophysiological measurements of facial muscles) to both angry and happy faces (Dimberg, Thunberg, & Elmehed, 2000). Without knowing it, we respond with minuscule facial gestures, which add emotional color to our social interactions. In addition, social psychologists have demonstrated that whether we feel relaxed or uncomfortable with some people is affected by nonconscious emotional cues (Chartrand & Bargh, 1999).
Mirror Neurons
The discovery of so-called mirror neurons in the motor cortex that represent both one’s own actions and the corresponding actions of others has led to the suggestion that these reflexive responses play a key role in the understanding of actions and intentions (Gallese, Keysers, & Rizzolatti, 2004; Iacoboni & Dapretto, 2006). Inspired by this development, it has been proposed that shared representations of one’s own and others’ emotional experiences provide the reflexive basis for emotional empathy. This logic has guided subsequent studies of the direct experience and observation of pain or emotion showing activation of overlapping neural systems, mainly two cortical regions receiving ascending viscerosensory inputs: the anterior insula (AI) and the mid portion of the anterior cingulate cortex (ACC; Carr, Iacoboni, Dubeau, Mazziotta, & Lenzi, 2003; Decety & Jackson, 2007; Gallese et al., 2004; Iacoboni & Dapretto, 2006; Morrison, Lloyd, di Pellegrino, & Roberts, 2004; Singer et al., 2004; Zaki, Ochsner, Hanelin, Wager, & Mackey, 2007).
The AI is thought to support affective experience in part by providing awareness of these body state inputs (Craig, 2002; Critchley, Wiens, Rotshtein, Öhman, & Dolan, 2004), and the ACC is believed to code affective attributes of pain, such as perceived unpleasantness as opposed to sensory-discriminative properties, such as location and intensity (Eisenberger, Lieberman, & Williams, 2003; Hutchison, Davis, Lozano, Tasker, & Dostrovsky, 1999; Wager et al., 2004), and motivate appropriate behavior via projections to brain regions supporting motor and autonomic output (Critchley et al., 2004). The engagement of the AI, ACC, and other regions supporting the automatic sharing of, and hence an experiential understanding of, the intentions behind others’ emotional behaviors may in turn provide a substrate for the empathic responses underlying prosocial behaviors (Decety & Jackson, 2007; Lamm, Batson, & Decety, 2007; Singer et al., 2004). Of course, emotion understanding isn’t always so simple. Nonverbal emotional cues are often ambiguous, and additional information is needed to constrain attributions
about someone’s emotional state and intentions. Our prior experience with that person and knowledge about antecedent events and the wider surrounding context are important sources of information (Figure 37.1). For example, wide-open eyes could mean someone is either afraid or surprised, triggering amygdala activity accordingly, depending on our knowledge of what has just happened to him or her (Kim, Somerville, Johnstone, Alexander, & Whalen, 2003). Similarly, activation of shared representations, and presumably empathic responses, may be blocked if one perceives another to be a past or potential competitor (Batson, Thompson, & Chen, 2002; Singer et al., 2006) or the situation makes an emotional response inappropriate (Figure 37.1). Indeed, our understanding of the context and our expectations about the future (e.g., others’ behavior) are important determinants in our experience of a situation, which brings us to the next step of emotional processing.
The Second Movement of Emotion
Secondary Appraisal
Automatic emotional activation provides input to the next stage of emotional processing, which we have called the second movement of emotion (Oatley, 2004). This is a more sluggish activity, which is responsive to reflection and explicit mental processes. It depends on more flexible neural mechanisms, primarily drawing on regions in sensory and prefrontal cortices as well as regions responsible for episodic memories, such as the hippocampus (Tulving, 2002). Of course, in our daily lives, which for most people are predominantly spent outside the laboratory, distant from experimenters’ surreptitious attempts to artificially tease apart different kinds of emotional processes, there is a constant interplay between rapid, reflexive processes on the one hand and slower reflective processes on the other. Providing the basis for this cross-talk, the amygdala and the striatum are reciprocally connected with cortical brain regions with a more recent evolutionary past.
Constructing Emotions
The second movement is the driving force behind the fact that people tend to construct their emotions. Depending on our previous experiences, we may respond in vastly different ways to the same emotional stimulus (Figure 37.1). Following this, some researchers have argued that the perceived meaning of the situation is the central determinant of the emotional response. And emotional meaning, it is claimed, results from an appraisal process (Scherer, Schorr, & Johnstone, 2001). To distinguish this from the automatic processes that provide an immediate evaluation of a stimulus, these appraisal processes should be called secondary appraisal.
738 The Affective Neuroscience of Emotion: Automatic Activation, Interoception, and Emotion Regulation
Input from Automatic to Secondary Appraisal There are different kinds of input from automatic to secondary appraisal. A major part concerns outputs from preliminary sensory processing. For example, these outputs may segregate the perceptual field into central figures and background, provide preliminary identification of the figures, and command attention to them for further processing. More important in the present context, the automatic processes of the amygdala and the striatum provide emotional coloring of the stimulus even before it is identified. The primary result of this evaluation is an estimate of the “goodness” and the “badness” of the stimulus, relating to a behavioral posture of approach or avoidance (e.g., Lang, Bradley, & Cuthbert, 1998). Interoceptive Input to Emotion What is the medium for conveying the emotional tone between the automatic and secondary appraisal stages? Conceptualizing the interface between the two processes in this way invites resurrecting the basic idea of the James-Lange theory of emotion: Feedback from the emotional response is a central stimulus source for emotion (and emotional experience). Concisely put, we do not cry because we feel sad; we feel sad because we cry. This idea has been in disrepute for close to a century after Cannon’s (1927) devastating critique. Cannon’s basic argument was that the physiological activation seen in intense emotion is too crude and too slow to account for the richness and nuance of emotional experience. Indeed, Cannon himself demonstrated that the patterns of physiological responses observed in anger and fear are indistinguishable and that it takes several seconds (and sometimes even tens of seconds) for the autonomic response to reach its maximum after an emotional provocation. Given what we know today, this critique is not as damaging as once perceived. 
Emotional information may reach the amygdala and start activating the bodily response within some 10 ms after it reaches sensory receptors and before it reaches the adequate cortical areas for identification. Feedback from facial responses is highly patterned and available within a few hundred milliseconds. Furthermore, facial feedback remains available even after surgery that blocks information from the body from reaching the brain. This may help explain why animals with such surgery (and humans with spinal cord damage that blocks feedback from the body) still appear to have emotion (thus rebutting another of Cannon’s [1927] critiques of the James-Lange theory). Feedback from the slow autonomic responses may not have to await the full-blown peripheral responses but may start coming in as soon as the relevant brain nuclei are activated. As Damasio (1999) pointed out, actual bodily
feedback may not always be necessary, because there may be “as-if body loops” that provide central simulations of previously experienced “real” emotional body loops in compressed time scales. Thus the brain may have quite specific information from the body early enough to make it a factor in shaping emotional experience and behavior. Indeed, Damasio and coworkers (2000) showed that simple instructions to recall different emotional episodes activated distinct patterns of activity in brain structures that regulate and represent bodily states. Such patterns provide “neural maps” that differ among emotions. According to Damasio, feelings can be understood as mental images arising from changes in such neural maps representing bodily activations brought about by emotional stimuli. He further speculated that these neural maps may be stored in the anterior insula, which is located in the convoluted cortex between the temporal and frontal lobes and which contains topographical maps of the viscera (see also Craig, 2002, 2004). Critchley et al. (2004) provided experimental data consistent with these conjectures when participants “listened for their heartbeats.” Participants listened to a sequence of tones that were either directly elicited by their heartbeats or were delayed by half a second from the heartbeats, which perceptually disconnected them from the heart. Their task was to decide whether the tones were synchronous with or independent from their heartbeats. In a control task, the participants listened for a tone that was fainter than the others. Compared to the control task, listening for the heartbeats produced activations in the insula, somatomotor, and cingulate cortex (Figure 37.3). Activity in the anterior insula predicted accuracy of heartbeat perception, and self-rated anxiety was related both to insula activation and to accurate heartbeat detection.
Gray matter volume in the insula was correlated with heartbeat perception and with self-reported awareness of bodily changes. These data provide a quite compelling case for giving the insula a central role in perception and awareness of bodily changes (Craig, 2004). In addition, there are masking studies suggesting that the anterior insula is one of the brain areas that is exclusively correlated with conscious recognition of emotional stimuli (Critchley, Mathias, & Dolan, 2002). The exact role of bodily input to emotional processing remains to be determined. Because automatic appraisal decides that a stimulus is relevant for well-being, the associated bodily activation provides a critical signal that something should be done about the situation. In most cases, the reason for this activation should be readily appreciated from available contextual information, which also will support appropriate action. However, as we have seen, conscious perception of the eliciting stimulus is not necessary, which means that there might be instances when an emotional arousal is activated in the absence of a readily
[Figure 37.3, panels A–D. Axis labels: activity of right anterior insular/opercular region; performance on heartbeat detection task (relative to note task); Hamilton anxiety scale (HAMA); T value. Correlations shown in the plots: R = 0.64, 0.62, 0.65.]
Figure 37.3 Functional neural correlates of interoceptive sensitivity. Note. A: Activation of the right anterior insula/opercular area is illustrated as a contrast between activities in the heartbeat auditory detection tasks. The anatomical location is mapped on orthogonal sections of a template brain, with coordinates in mm from anterior commissure. B: Activity within right insular/opercular cortex during interoceptive trials is plotted against interoceptive accuracy (relative to exteroceptive accuracy, to control for nonspecific detection difficulty in the noisy scanning environment). The Pearson correlation coefficient (R) is given in the plot. C: Subject scores on the Hamilton Anxiety Scale are plotted against relative interoceptive
awareness to illustrate the correlation in these subjects between sensitivity to bodily responses and subjective emotional experience, particularly of negative emotions. D: Activity in right anterior insula/opercular activity during interoception also correlated with anxiety score, suggesting that emotional feeling states are supported by explicit interoceptive representations within the right insula cortex. From “Neural Systems Supporting Interoceptive Awareness,” by Critchley et al., 2004, Nature Neuroscience, 7, p. 192. Reprinted with permission.
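The relative-accuracy measure and the Pearson correlation mentioned in the note to Figure 37.3 can be illustrated with a minimal Python sketch. It assumes that "relative" performance is a simple difference between heartbeat-task and note-task accuracy, which the text does not state explicitly. The function names and all numbers are hypothetical and are not data from Critchley et al. (2004).

```python
from math import sqrt


def relative_accuracy(heartbeat_acc, note_acc):
    """Interoceptive accuracy relative to exteroceptive (note-task) accuracy.

    Assumption: 'relative' means a per-participant difference score.
    """
    return [h - n for h, n in zip(heartbeat_acc, note_acc)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-participant values, for illustration only.
heartbeat_acc = [0.55, 0.60, 0.70, 0.65, 0.80]    # heartbeat detection accuracy
note_acc = [0.60, 0.58, 0.62, 0.66, 0.64]         # control (note) task accuracy
insula_activity = [0.10, 0.30, 0.60, 0.20, 0.90]  # arbitrary activation units

rel = relative_accuracy(heartbeat_acc, note_acc)
r = pearson_r(rel, insula_activity)
print(f"relative accuracy: {[round(v, 2) for v in rel]}")
print(f"Pearson r with insula activity: {r:.2f}")
```

Subtracting the control-task score is one simple way to express the caption's point about controlling for nonspecific detection difficulty; the original study may have computed the measure differently.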
available explanation. Such unexplained arousal is a powerful motivation to search for explanations (Schachter & Singer, 1962), which will affect how the situation is interpreted. Thus, there may be spillover effects that influence the way the situation is emotionally interpreted. For example, running up stairs triggers an activation of the cardiovascular system. However, the emotional ramification of this activation is very different if we do it for exercise, to meet a lover waiting at the top of the stairs, or to escape a man chasing us with an axe. Although we discuss appraisal processes in terms of explicit mental activity, they need not be conscious. Although originally conscious activities, appraisals, particularly immediate ones, may become automatized to work outside of awareness (Lieberman, 2007).
Secondary Appraisal in Social Contexts
The range of reflective emotional processes is, if anything, both larger and more complex than the first movement of emotional responses. People use conscious reflection to understand both their own and others’ emotional states. As discussed earlier, interoception provides a key component in reflecting about one’s own emotional state. Because we have already discussed interoceptive input to the understanding of one’s own emotions quite extensively, we focus next on reflection about others’ emotions.
Appraising Others’ Emotions The ability to understand another individual’s emotional states is essential for virtually all aspects of social behavior and is likely to depend on both the reflexive emotional empathic responses discussed here and the attribution of mental states. Indeed, emotion understanding by definition requires a causal attribution about the intentions behind an action. As we have seen, people understand others’ emotions partly through the operation of rapid reflexive processes, such as the automatic appraisal of facial expressions and the sharing of emotional states. However, when
[Figure 37.4, panels A and B. Scatter plots show the conditioned response against betas; correlations shown in the plots: r = 0.61, 0.67, 0.49.]
Figure 37.4 Functional activation during an observational fear-learning task.
A: A coronal view of activation in the right anterior insula, AI (–28, 15, –4)a when observing the pain response of a learning model to a shock. The adjacent graph shows that the magnitude of this activation predicts the strength of the conditioned response (indexed by the SCR) at a later time to a cue associated with the pain of the learning model. B: A sagittal view of activation in the (1) MPFC (1,46,24)a and (2) the ACC (3,27,32)a during the observation of the pain response of a learning model to a
shock. As in A, adjacent graphs display the positive relationship between the magnitude of activation during observation and the subsequent conditioned response. From “The Role of Social Cognition in Emotion,” by Olsson and Ochsner, 2008, Trends in Cognitive Sciences, 12, pp. 65–71. Reprinted with permission. a x, y, z coordinates for local maxima in Talairach space.
stimulus-driven processing of information is not sufficient, more reflective mental state attributions may be needed to understand another individual’s emotional state. These controlled attributions allow us to actively take other people’s perspectives and make judgments about their emotions or diagnostic elements of their emotional dispositions, thereby changing empathic responding (Batson et al., 2002) and recruitment of the anterior insula and the ACC (Lamm et al., 2007). These reflective processes have been shown to depend on a network of regions, including the right temporal parietal junction and dorsal regions of the medial prefrontal cortex (MPFC), including Brodmann area 10 (BA 10; Mitchell, Macrae, & Banaji, 2006; Ochsner et al., 2004a; Saxe, Moran, Scholz, & Gabrieli, 2006). Interestingly, a recent meta-analysis singles out BA 10, which is especially developed in humans, as particularly sensitive to tasks involving both emotions and mental state attributions (Gilbert et al., 2006). Taken together, these lines of work suggest that if the ACC and AI support direct experiential awareness of intentional states, the MPFC network might support the metacognitive reflective awareness of these experiences. The function of appropriately attributing emotional states to others is not limited to understanding, and thus responding to, their emotional expressions in the present moment, but additionally helps us to learn about the events causing others’ emotional responses. Previous behavioral research across primates has suggested that learning through observation draws on overlapping neural processes as classical conditioning (Mineka & Cook, 1993; Olsson & Phelps, 2004). Indeed, these findings were corroborated in
an imaging study demonstrating that overlapping regions of the amygdala, AI, and ACC were active during both observational learning and subsequent expression of fear responses (Olsson, Nearing, & Phelps, 2007). In contrast, the dorsal MPFC was active only during observation of another’s distress. Importantly, the magnitude of the conditioned response was predicted by activity in the AI, ACC, and dorsal MPFC, suggesting that shared representations supporting experiential understanding of emotion, as well as regions supporting reflective mental state attributions, together support social-emotional learning (Figure 37.4).
REGULATING OUR EMOTIONAL RESPONSES: A “THIRD MOVEMENT”?
An important role of reflective emotional processes is to regulate one’s own emotional responses. Once an emotional response has arisen, there are several ways it can be affected through top-down control (Ochsner & Gross, 2005). It has been argued that primitive regulatory processes that do not need voluntary effort, such as the extinction of a conditioned response, involve the down-regulation of amygdala activity by means of ventral and medial regions of the prefrontal cortex (PFC) across species (Quirk, Garcia, & Gonzalez-Lima, 2006). In contrast, depending on the strategy, voluntary up- or down-regulation of amygdala-based emotions has been shown to draw on more dorsal medial and lateral regions, which have been implicated in executive control (Kalisch et al., 2005; Ochsner & Gross, 2005; Ochsner et al., 2004b).
The most basic way of intentionally regulating one’s emotional response is to divert one’s attention away from the emotion-provoking stimuli. However, the flexibility of the human mind allows people to reappraise, or reinterpret, the situation in light of other knowledge. For example, if an emotion is evoked by another individual’s emotional expression, one strategy is to reinterpret the situational meaning of the other’s intentions or feelings, as when thinking positively or negatively about the dispositions (“He is hearty [or weak]”) and future emotions (“Receiving the injection will inflict pain [or make him healthy]”; Figure 37.1). Interestingly, recent work suggests that simply making an attribution about the feelings of another person can have the unintended consequence of disrupting amygdala-mediated negative evaluations of him or her (Harris, Todorov, & Fiske, 2005; Lieberman, 2007). One reason for this may be that the attribution of emotional states can direct attention to the nonthreatening intentions (e.g., thinking about his or her food preferences) of a social target, thereby disambiguating the individual as a potential source of threat. This regulatory strategy can modulate amygdala activity through primarily ventrolateral PFC regions used to select from memory information that helps interpret another’s feelings (Ochsner et al., 2004a). It is suggested that similar areas inhibit the enhanced amygdala response of nonprejudiced White participants exposed to masked Black faces (indicating an implicit racial bias) when the masking interval was extended to allow conscious recognition (Amodio, Devine, & Harmon-Jones, 2008).
The Power of Language
Humans are prone to retrospective justifications. As dramatically stated by V. S. Ramachandran (2004, p. 1), “Your conscious life . . .
is nothing but an elaborate posthoc rationalization of things you really do for other reasons.” Famous examples of this process were inspired by Leon Festinger’s (1957) theory of cognitive dissonance, which stated that humans seek balance and consistency in their belief systems. As a consequence, we are motivated to restore balance when there are conflicts between beliefs or between belief and action. For example, when persuaded by shrewd social psychologists to publicly express a view that was inconsistent with their beliefs, research participants were more likely to actually change their beliefs if paid a small rather than a large sum of money. With a big reward, participants could explain away the dissonant action as “I only did it for the money,” whereas with a trivial reward, justifying the action was more likely to require a change in conviction (Festinger & Carlsmith, 1959). Similar processes may be at work in emotion. Even though specific stimuli automatically activate emotions, this
automatic response often merely sets the constructive mind to work. We feel pressed to understand and to justify our emotions (“The man was so scary, what could I do but try to escape?” or “As adorable as she was, I just fell helplessly in love”), and we retrospectively manipulate emotion to justify our action (“I hit him because he made me so mad” or “I certainly must be madly in love to act this stupidly”). This was illustrated in a choice experiment in which subjects had to choose the most attractive face from two presented alternatives. Unbeknown to the subjects, a card trick made them believe that they had chosen the face they actually had not chosen. Not only did the surreptitious manipulation go unnoticed, but the vast majority of subjects also provided quite elaborated motivations for what they thought were their choices (Johansson, Hall, Sikstrom, & Olsson, 2005). Largely based on his research on split-brain patients, who have had their two cerebral hemispheres surgically disconnected from each other as a treatment for epilepsy, Michael Gazzaniga (1998, 2000) argued that the pressure to justify one’s actions reflects the operation of “an interpreter system” housed in the left frontal cerebral hemisphere. According to this view, the brain automatically takes care of most of the exigencies raised by the interaction of person and environment. The fundamental interpretive component of the human mind comes in late to make sense of the unfolding scenario mindlessly managed by the brain, to fit it into one’s worldview and self-image, and to keep constructing the narrative that we take to be our lives. Unlike all other creatures, humans can, by their access to language, keep a running commentary on their lives. As a consequence, we are prone to mixing up the commentary and the commented-on events in our memories, which may explain the unreliability of our memories (Loftus, 2005). But the commentary is not merely epiphenomenal activity. 
Rather, it gives consistency to the world and to our actions in it, and it helps us to cope with new situations by time-proven (and largely culturally and socially determined) formulas. In doing its work, the interpreter tries hard to be rational. Indeed, Gazzaniga claims that the interpreter is behind the human adoration of reason. This mechanism might also explain the way emotions sometimes appear disconnected from our rational thinking.
SUMMARY AND CONCLUSIONS Our aim in this chapter has been to provide a selective overview of research that illuminates the link between emotional behavior and its neural substrate. To this end we have drawn on various streams of work within the realms of affective and cognitive neuroscience. Providing a rough framework for this endeavor, we have revisited the ancient
division of emotion into two separable movements. To this division, which was originally proposed by the Stoics more than 2,000 years ago (Oatley, 2004), we have added another stage that we have called the third movement. Recent behavioral, psychophysiological, and neural data have validated these distinctions. The first movement describes the initial surge of rapid, automatic, often implicit emotional responses to a stimulus or event, whereas the second movement captures the slower unfolding of a more nuanced repertoire of emotional responses that are dependent on conscious appraisals. Unlike the initial response, these secondary appraisals are shaped by situational factors, our explicit memories, and reflective reasoning. The amygdala has been identified as a core region in neural circuitry responsible for the automatic appraisal of emotionally relevant, especially potentially threatening stimuli. Providing further rapid information and connecting us with our social environment, another network of cortical regions, including the so-called mirror neurons system and the AI, is believed to support rapid mimicking of others’ emotional expressions and experiential sharing of their emotions. These neural systems supporting the first emotional movement contribute important input to a more widely distributed neural network underlying the second emotional movement. For example, the AI provides interoceptive information about one’s own emotional state, and together with prefrontal regions, such as the ACC and medial region of the PFC, it provides the basis for reflections about both one’s own and others’ emotional states. Thanks to the reciprocal connectivity between these more recently evolved prefrontal circuits and evolutionarily older circuits supporting automatic appraisal, emotional responses can not only be spontaneously regulated, but can be shaped by volitional strategies. 
The reinterpretation of the emotional value of a situation is one such strategy that has received much attention lately and has been shown to involve medial and lateral prefrontal regions. In a way, this prefrontally driven regulation extends the ancient Stoic description of emotions by adding a third movement to the two existing ones. A third movement closes the circle and provides for a loop of neural activation that can be continuously calibrated to produce the currently most adaptive emotional response. In humans language brings about a uniquely flexible tool kit for up or down-regulation of emotional responses. Indeed, the flexibility of symbols can sometimes play tricks on us, such as when our verbal reports of emotions or reflective reasoning stand in conflict with our own emotional experience or behavior. Although much work remains before we can fully understand the nature of emotion, this chapter
has shown that we have made good progress so far. With the rapid development of techniques to map the functional activity of the human brain, it will continue to be imperative to link these data with their physiological and behavioral correlates. Only this way will emotion, as we know it, appear less mysterious.
REFERENCES Adolphs, R. (2001). The neurobiology of social cognition. Current Opinion in Neurobiology, 11, 231–239. Adolphs, R., Gosselin, F., Buchanan, T. W., Tranel, D., Schyns, P., & Damasio, A. R. (2005, January 20). A mechanism for impaired fear recognition after amygdala damage. Nature, 433, 68–72. Amodio, D. M., Devine, P. G., & Harmon-Jones, E. (2008). Individual differences in the regulation of intergroup bias: The role of conflict monitoring and neural signals for control. Journal of Personality and Social Psychology, 94, 60–74. Anderson, A. K., & Phelps, E. A. (2001, May 17). Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature, 411, 305–309. Batson, C. D., Thompson, E. R., & Chen, H. (2002). Moral hypocrisy: Addressing some alternatives. Journal of Personality and Social Psychology, 83, 330–339. Berridge, K. C., & Winkielman, P. (2003). What is an unconscious emotion? (The case for unconscious “liking”). Cognition and Emotion, 17, 181–211. Bevins, R. A., Besheer, J., Palmatier, M. I., Jensen, H. C., Pickett, K. S., & Eurek, S. (2002). Novel-object place conditioning: Behavioral and dopaminergic processes in expression of novelty reward. Behavioral Brain Research, 129, 41–50. Cannon, W. (1927). The James-Lange theory of emotions: A critical examination and an alternative theory. American Journal of Psychology, 39, 106–124. Carlsson, K., Petersson, K. M., Lundqvist, D., Karlsson, A., Ingvar, M., & Öhman, A. (2004). Fear and the amygdala: Manipulation of awareness generates differential cerebral responses to phobic and fear-relevant (but nonfeared) stimuli. Emotion, 4, 340–353. Carr, L., Iacoboni, M., Dubeau, M. C., Mazziotta, J. C., & Lenzi, G. L. (2003). Neural mechanisms of empathy in humans: A relay from neural systems for imitation to limbic areas. Proceedings of the National Academy of Sciences, USA, 100, 5497–5502. Chartrand, T. L., & Bargh, J. A. (1999). 
The chameleon effect: The perception-behavior link and social interaction. Journal of Personality and Social Psychology, 76, 893–910. Childress, A. R., Ehrman, R. N., Wang, Z., Li, Y., Sciortino, N., Hakun, J., et al. (2008). Prelude to passion: Limbic activation by “unseen” drug and sexual cues. PLoS ONE, 3, 1506. Cloutier, J., Heatherton, T. F., Whalen, P. J., & Kelley, W. M. (2008). Are attractive people rewarding? Sex differences in the neural substrates of facial attractiveness. Journal of Cognitive Neuroscience, 20, 941–951. Coan, J. A., & Allen, J. J. B. (Eds.). (2007). Handbook of emotion elicitation and assessment. New York: Oxford University Press. Cornwell, R. E., Boothroyd, L., Burt, D. M., Feinberg, D. R., Jones, B. C., Little, A. C., et al. (2004). Concordant preferences for opposite-sex signals? Human pheromones and facial characteristics. Proceedings of the Royal Society: Biological Science, 271, 635–640. Craig, A. D. (2002). How do you feel? Interoception: The sense of the physiological condition of the body. Nature Reviews, 3, 655–666. Craig, A. D. (2004). Human feelings: Why are some more aware than others? Trends in Cognitive Science, 8, 239–241. Critchley, H., Mathias, C., & Dolan, R. J. (2002). Fear conditioning in humans: The influence of awareness and autonomic arousal on functional neuroanatomy. Neuron, 33, 653–663.
Critchley, H. D., Wiens, S., Rotshtein, P., Öhman, A., & Dolan, R. J. (2004). Neural systems supporting interoceptive awareness. Nature Neuroscience, 7, 189–195. Cromwell, H. C., & Schultz, W. (2003). Effects of expectations for different reward magnitudes on neuronal activity in primate striatum. Journal of Neurophysiology, 89, 2823–2838. Cunningham, W. A., Johnson, M. K., Gatenby, J. C., Gore, J. C., & Banaji, M. R. (2003). Neural components of social evaluation. Journal of Personality and Social Psychology, 85, 639–649. Cunningham, W. A., Johnson, M. K., Raye, C. L., Gatenby, J., Gore, J. C., & Banaji, M. R. (2004). Separable neural components in the processing of Black and White faces. Psychological Science, 15, 806–813. Damasio, A. (1994). Descartes’ error: Emotion, reason, and the human brain. New York: Avon Books. Damasio, A. (1999). The feeling of what happens: Body and emotion in the making of consciousness. New York: Harcourt Brace. Damasio, A. R., Grabowski, T. J., Bechara, A., Damasio, H., Ponto, L. L., Parvizi, J., & Hichwa, R. D. (2000). Subcortical and cortical brain activity during the feeling of self-generated emotions. Nature Neuroscience, 3, 1049–1056. Davis, M., & Whalen, P. J. (2001). The amygdala: Vigilance and emotion. Molecular Psychiatry, 6, 13–34. Decety, J., & Jackson, P. L. (2007). The functional architecture of human empathy. Behavioral and Cognitive Neuroscience Reviews, 3, 71–100. de Gelder, B., Snyder, J., Greve, D., Gerard, G., & Hadjikhani, N. (2004). Fear fosters flight: A mechanism for fear contagion when perceiving emotion expressed by a whole body. Proceedings of the National Academy of Sciences, USA, 101, 16701–16706. Dimberg, U., Thunberg, M., & Elmehed, K. (2000). Unconscious facial reactions to emotional facial expressions. Psychological Science, 11, 86–89. Eisenberger, N. I., Lieberman, M. D., & Williams, K. D. (2003, October 10). Does rejection hurt? An fMRI study of social exclusion. Science, 302, 290–292. 
Festinger, L. (1957). A theory of cognitive dissonance. Evanston, IL: Row, Peterson. Festinger, L., & Carlsmith, C. (1959). Cognitive consequences of forced compliance. Journal of Abnormal and Social Psychology, 58, 203–210. Gallese, V., Keysers, C., & Rizzolatti, G. (2004). A unifying view of the basis of social cognition. Trends in Cognitive Science, 8, 396–403. Gazzaniga, M. (1998). The mind’s past. Berkeley: University of California Press. Gazzaniga, M. (2000). Cerebral specialization and interhemispheric communication: Does the corpus callosum enable the human condition? Brain, 123, 1293–1326. Gilbert, S. J., Spengler, S., Simons, J. S., Steele, J. D., Lawrie, S. M., Frith, C. D., & Burgess, P. W. (2006). Functional specialization within rostral prefrontal cortex (area 10): A meta-analysis. Journal of Cognitive Neuroscience, 18, 932–948. Harris, L. T., Todorov, A., & Fiske, S. T. (2005). Attributions on the brain: Neuro-imaging dispositional inferences, beyond theory of mind. NeuroImage, 28, 763–769. Humphrey, N. (1983). Consciousness regained: Chapters in the development of mind. Oxford University Press. Hutchison, W. D., Davis, K. D., Lozano, A. M., Tasker, R. R., & Dostrovsky, J. O. (1999). Pain-related neurons in the human cingulate cortex. Nature Neuroscience, 2, 403–405. Iacoboni, M., & Dapretto, M. (2006). The mirror neuron system and the consequences of its dysfunction. Nature Reviews Neuroscience, 7, 942–951. Ito, T. A., Larsen, J. T., Smith, N. K., & Cacioppo, J. T. (1998). Negative information weighs more heavily on the brain: The negativity bias in evaluative categorizations. Journal of Personality and Social Psychology, 75, 887–900. Jiang, Y., Costello, P., Fang, F., Huang, M., & He, S. A. (2006). A gender- and sexual orientation-dependent spatial attentional effect of invisible images. Proceedings of the National Academy of Sciences, USA, 103, 1748–1752.
Johansson, P., Hall, L., Sikstrom, S., & Olsson, A. (2005, October 7). Failure to detect mismatches between intention and outcome in a simple decision task. Science, 310, 116–119. Kalisch, R., Wiech, K., Critchley, H. D., Seymour, B., O’Doherty, J. P., Oakley, D. A., Allen, P., & Dolan, R. J. (2005). Anxiety reduction through detachment: Subjective, physiological, and neural effects. Journal of Cognitive Neuroscience, 17, 874–883. Kim, H., Somerville, L. H., Johnstone, T., Alexander, A. L., & Whalen, P. J. (2003). Inverse amygdala and medial prefrontal cortex responses to surprised faces. NeuroReport, 14, 2317–2322. Klüver, H., & Bucy, P. (1939). Preliminary analysis of functioning of the temporal lobes in monkeys. Archives of Neurological Psychiatry, 42, 979–1000. Knutson, B., Adams, C. M., Fong, G. W., & Hommer, D. (2001). Anticipation of increasing monetary reward selectively recruits nucleus accumbens. Journal of Neuroscience, 21, 1–5. Kunst-Wilson, W. R., & Zajonc, R. B. (1980, February 1). Affective discrimination of stimuli that cannot be recognized. Science, 207, 557–558. Lamm, C., Batson, C. D., & Decety, J. (2007). The neural substrate of human empathy: Effects of perspective-taking and cognitive appraisal. Journal of Cognitive Neuroscience, 19, 42–58. Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (1998). Emotion, motivation, and anxiety: Brain mechanisms and psychophysiology. Biological Psychiatry, 44, 1248–1263. LeDoux, J. E. (1996). The emotional brain. New York: Simon & Schuster. LeDoux, J. E. (2000). Emotion circuits in the brain. Annual Review of Neuroscience, 23, 155–184. Liddell, B. J., Brown, K. J., Kemp, A. H., Barton, M. J., Das, P., Peduto, A., et al. (2005). A direct brainstem-amygdala-cortical “alarm” system for subliminal signals of fear. Neuroimage, 24, 235–243. Lieberman, M. D. (2007). Social cognitive neuroscience: A review of core processes. Annual Review of Psychology, 58, 259–289. Loftus, E. (2005). 
Planting misinformation in the human mind: A 30-year investigation of the malleability of memory. Learning and Memory, 12, 361–366. McGaugh, J. L. (2004). The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annual Review of Neuroscience, 27, 1–28. Mineka, S., & Cook M. (1993). Mechanisms involved in the observational conditioning of fear. Journal of Experimental Psychology, 122, 23–38. Mitchell, J. P., Macrae, C. N., & Banaji, M. R. (2006). Dissociable medial prefrontal contributions to judgments of similar and dissimilar others. Neuron, 50, 655–663. Morris, J. S., Friston, K. J., Buchel, C., Frith, C. D., Young, A. W., Calder, A. J., et al. (1998). A neuromodulatory role for the human amygdala in processing emotional facial expressions. Brain, 121, 47–57. Morris, J. S., Öhman, A., & Dolan, R. J. (1998, June 4). Conscious and unconscious emotional learning in the amygdala. Nature, 393, 467–470. Morris, J. S., Öhman, A., & Dolan, R. J. (1999). A subcortical pathway to the right amygdala mediating “unseen” fear. Proceedings of the National Academy of Sciences, USA, 96, 1680–1685. Morrison, I., Lloyd D., di Pellegrino, G., & Roberts, N. (2004). Vicarious responses to pain in anterior cingulate cortex: Is empathy a multisensory issue? Cognitive, Affective and Behavioral Neuroscience, 4, 270–278. Murray, E. A. (2007). The amygdala, reward and emotion. Trends in Cognitive Sciences, 11, 489–497. Navarrete, C. D., Olsson, A., Ho, A., Mendes, W., Thomsen, L., & Sidanius, J. (2009). The roles of race and gender in the persistence of learned fear. Psychological Science, 20, 155–158. Oatley, K. (2004). Emotions: A brief history. Toronto: University of Toronto Press. Ochsner, K. N., & Gross, J. J. (2005). The cognitive control of emotion. Trends in Cognitive Sciences, 9, 242–249. Ochsner, K. N., Ray, R. D., Cooper, J. C., Robertson, E. R., Chopra, S., Gabrieli, J. D., et al. (2004a). 
For better or for worse: Neural systems supporting the cognitive down- and up-regulation of negative emotion. NeuroImage, 23, 483–499.
Ochsner, K. N., Ray, R. D., Cooper, J. C., Robertson, E. R., Chopra, S., Gabrieli, J. D., et al. (2004b). Reflecting upon feelings: An fMRI study of neural systems supporting the attribution of emotion to self and other. Journal of Cognitive Neuroscience, 16, 1746–1772. O’Doherty, J., Dayan, P., Schultz, J., Deichmann, R., Friston, K., & Dolan, R. J. (2004, April 16). Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science, 304, 452–454. Öhman, A. (1986). Face the beast and fear the face: Animal and social fears as prototypes for evolutionary analyses of emotion. Psychophysiology, 23, 123–145. Öhman, A., & Mineka, S. (2001). Fears, phobias, and preparedness: Toward an evolved module of fear and fear learning. Psychological Review, 108, 483–522. Öhman, A., & Soares, J. J. (1994). “Unconscious anxiety”: Phobic responses to masked stimuli. Journal of Abnormal Psychology, 103, 231–240. Öhman, A., & Soares, J. J. F. (1998). Emotional conditioning to masked stimuli: Expectancies for aversive outcomes following non-recognized fear-relevant stimuli. Journal of Experimental Psychology: General, 127, 69–82. Öhman, A., & Wiens, S. (2003). On the automaticity of autonomic responses in emotion: An evolutionary perspective. In R. J. Davidson, K. Scherer, & H. H. Goldsmith (Eds.), Handbook of affective sciences (pp. 256–275). New York: Oxford University Press. Olsson, A., Ebert, J. P., Banaji, M. R., & Phelps, E. A. (2005, July 29). The role of social groups in the persistence of learned fear. Science, 309, 785–787. Olsson, A., Nearing, K. I., & Phelps, E. A. (2007). Learning fears by observing others: The neural systems of social fear transmission. Social Cognitive and Affective Neuroscience, 2, 3–11. Olsson, A., & Ochsner, K. N. (2008). The role of social cognition in emotion. Trends in Cognitive Sciences, 12, 65–71. Olsson, A., & Phelps, E. A. 
(2004). Learned fear of “unseen” faces after Pavlovian, observational, and instructed fear. Psychological Science, 15, 822–828. Pasley, B. N., Mayes, L. C., & Schultz, R. T. (2004). Subcortical discrimination of unperceived objects during binocular rivalry. Neuron, 42, 163–172. Pessiglione, M., Schmidt, L., Draganski, B., Kalisch, R., Lau, H., Dolan, R. J., et al. (2007, May 11). How the brain translates money into force: A neuroimaging study of subliminal motivation. Science, 316, 904–906. Phelps, E. A. (2006). Emotion and cognition: Insights from studies of the human amygdala. Annual Review of Psychology, 24, 27–53. Phelps, E. A., & LeDoux, J. E. (2005). Contributions of the amygdala to emotion processing: From animal models to human behavior. Neuron, 48, 175–187. Phelps, E. A., Ling, S., & Carrasco, M. (2006). Emotion facilitates perception and potentiates the perceptual benefit of attention. Psychological Science, 17, 292–299. Phelps, E. A., O’Connor, K. J., Cunningham, W. A., Funayma, E. S., Gatenby, J. C., Gore, J. C., et al. (2000). Performance on indirect measures of race evaluation predicts amygdala activity. Journal of Cognitive Neuroscience, 12, 1–10. Quirk, G. J., Garcia, R., & Gonzalez-Lima, F. (2006). Prefrontal mechanisms in extinction of conditioned fear. Biological Psychiatry, 60, 337–643. Ramachandran, V. S. (2004). A brief tour of human consciousness. New York: Pi Press. Rolls, E. T. (1999). The brain and emotion. New York: Oxford University Press. Sabatinelli, D., Bradley, M. M., Fitzsimmons, J. R., & Lang, P. J. (2005). Parallel amygdala and inferotemporal activation reflect emotional intensity and fear relevance. NeuroImage, 24, 1265–1270.
Saxe, R., Moran, J. M., Scholz, J. K., & Gabrieli, J. D. E. (2006). Overlapping and non-overlapping brain regions for theory of mind and self reflection in individual subjects. Social Cognitive Affective Neuroscience, 1, 229–234. Schachter, S., & Singer, J. E. (1962). Cognitive, social, and physiological determinants of emotional state. Psychology Review, 69, 379–399. Scherer, K. R., Schorr, A., & Johnstone, T. (Eds.). (2001). Appraisal processes in emotion: Theory, methods, and research. New York: Oxford University Press. Schultz, W. (2002). Getting formal with dopamine and reward. Neuron, 36, 241–263. Schützwohl, A. (2006). Judging female figures: A new methodological approach to male attractiveness judgments of female waist-to-hip ratio. Biological Psychology, 71, 223–229. Singer, T., Seymour, B., O’Doherty, J., Kaube, H., Dolan, R. J., & Frith, C. D. (2004, February 20). Empathy for pain involves the affective but not sensory components of pain. Science, 303, 1157–1162. Singer, T., Seymour, B., O’Doherty, J., Stephan, K. E., Dolan, R. J., & Frith, C. D. (2006, January 26). Empathic neural responses are modulated by the perceived fairness of others. Nature, 439, 466–469. Todorov, A., Mandisodza, A. N., Goren, A., & Hall, C. C. (2005, June 10). Inferences of competence from faces predict election outcomes. Science, 308, 1623–1626. Tulving, E. (2002). Episodic memory: From mind to brain. Annual Review of Psychology, 53, 1–25. Vuilleumier, P., Armony, J. L., Driver, J., & Dolan, R. J. (2003). Distinct spatial frequency sensitivities for processing faces and emotional expressions. Nature Neuroscience, 6, 624–631. Vuilleumier, P., Richardson, M. P., Armony, J. L., Driver, J., & Dolan, R. J. (2004). Distant influences of amygdala lesion on visual cortical activation during emotional face processing. Nature Neuroscience, 7, 1271–1278. Wager, T. D., Rilling, J. K., Smith, E. E., Sokolik, A., Casey, K. L., Davidson, R. J., Kosslyn, S. M., Rose, R. M, Cohen, J. D. 
(2004, February 20). Placebo-induced changes in fMRI in the anticipation and experience of pain. Science, 303, 1162–1167. Weiskrantz, L. (1956). Behavioral changes associated with ablation of the amygdaloid complex in monkeys. Journal of Comparative Physiological Psychology, 49, 381–391. Whalen, P. J., Rauch, S. L., Etcoff, N. L., McInerney, S. C., Lee, M. B., & Jenike, M. A. (1998). Masked presentations of emotional facial expressions modulate amygdala activity without explicit knowledge. Journal of Neuroscience, 18, 411–418. Williams, M. A., Morris, A. P., McGlone, F., Abbott, D. F., & Mattingley, J. B. (2004). Amygdala responses to fearful and happy facial expressions under conditions of binocular suppression. Journal of Neuroscience, 24, 2898–2904. Willis, J., & Todorov, A. (2006). First impressions: Making up your mind after a 100-ms exposure to a face. Psychological Science, 17, 592–598. Winston, J. S., Strange, B. A., O’Doherty, J., & Dolan, R. J. (2002). Automatic and intentional brain responses during evaluation of trustworthiness of faces. Nature Neuroscience, 5, 277–283. Zajonc, R. B. (2004). Exposure effects: An unmediated phenomenon. In A. S. R. Manstead, N. Frijda, & A. Fischer (Eds.), Feelings and emotions, (pp. 194–203). Amsterdam Symposium: Cambridge: Cambridge University Press. Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 1980, 35, 151–175. Zaki, J., Ochsner, K. N., Hanelin, H., Wager, T. D., & Mackey, S. (2007). Different circuits for different pain: Patterns of functional connectivity reveal distinct networks for processing pain in self and others. Social Neuroscience, 2, 276–291.
Chapter 38
The Somatovisceral Components of Emotions and Their Role in Decision Making: Specific Attention to the Ventromedial Prefrontal Cortex
ANTOINE BECHARA AND NASIR NAQVI
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
The orbital and mesial prefrontal cortices have been implicated in a range of affective processes, including hedonic and anticipatory responses to reward and punishment, subjective states of desire, and basic as well as social emotions. Changes in the visceral state may be considered a form of anticipation of the bodily impact of objects and events in the world. Visceral responses to biologically relevant stimuli allow an organism to maximize the survival value of situations that may impact the state of the internal milieu. These include events that promote homeostasis, such as an opportunity to feed or engage in social interaction, as well as events that disrupt homeostasis, such as a physical threat or a signal of social rejection. Visceral responses are just one component of a broader emotional response system that also includes changes in the endocrine and skeletomotor systems, as well as changes within the brain that alter the perceptual processing of biologically relevant stimuli (A. R. Damasio, 1994).
William James (1884) initially proposed that visceral responses to biologically relevant stimuli are a necessary component of the subjective experience of emotion. More specifically, suppose you saw the person you love bringing you flowers. The encounter may cause your heart to race, your skin to flush, and your facial muscles to contract with a happy expression. The encounter may also be accompanied by some body sensations, such as hearing your heartbeat and sensing “butterflies” in your stomach. However, there is also another kind of sensation: the emotional feeling of love, ecstasy, and elation directed toward your loved one. Since James’s initial proposal, neuroscientists and philosophers have debated whether these two sensations are fundamentally the same. The psychological view of James-Lange (James, 1884) implied that the two were the same. However, philosophers argued that emotions are not just bodily sensations; the two have different objects. Body sensations are about awareness of the internal state of the body. Emotional feelings are directed toward objects in the external world.
Neuroscientific evidence based on functional magnetic resonance imaging (fMRI) tends to provide important validation of the theoretical view of James-Lange that neural systems supporting the perception of body states provide a fundamental ingredient for the subjective experience of emotions. This is consistent with contemporary neuroscientific views (e.g., see Craig, 2002), which suggest that the anterior insular cortex, especially on the right side of the brain, plays an important role in the mapping of bodily states and their translation into emotional feelings. The view of A. R. Damasio (1999, 2003) is consistent with this notion, but it suggests further that emotional feelings are not just about the body, but they are also about things in the world. In other words, sensing changes in the body requires neural systems, in which the anterior insular cortex is a critical substrate. However, the feelings that accompany emotions require additional brain regions. In Damasio’s view, feelings arise in conscious awareness through the representation of bodily changes in relation to the object or event that incited the bodily changes. This second-order mapping of the relationship between organism and object occurs in brain regions that can integrate information about the body with information about the world. Such regions include the anterior cingulate cortex (Figure 38.1), especially its dorsal part.
[Figure 38.1 diagram labels: The World; Emotionally competent stimulus; The Brain; Sensory systems; Amygdala; Orbitofrontal cortex; Insular cortex; Anterior cingulate cortex; Brain stem/hypothalamus; Visceral motor; Visceral sensory; The Body; The viscera; Autonomic responses.]
Figure 38.1 Information related to the emotionally competent object is represented in one or more of the brain’s sensory processing systems. Note: This information, which can be derived from the environment or recalled from memory, is made available to the amygdala and the orbitofrontal cortex, which are trigger sites for emotion. The emotion execution sites include the hypothalamus, the basal forebrain, and nuclei in the brain stem tegmentum. Only the visceral response is represented, although emotion comprises endocrine and somatomotor responses as well. Visceral sensations reach the anterior insular cortex by passing through the brain stem. Feelings result from the re-representation of changes in the viscera in relation to the object or event that incited them. The anterior cingulate cortex is a site where this second-order map is realized.
According to A. R. Damasio (1994, 1999, 2003), there is an important distinction between emotions and feelings. Emotions are a collection of changes in body and brain states triggered by a dedicated brain system that responds to the content of one’s perceptions of a particular entity or event. The responses toward the body proper enacted in a body state involve physiological modifications that range from changes that are hidden from an external observer (e.g., changes in heart rate, smooth muscle contraction, endocrine release) to changes that are perceptible to an external observer (e.g., skin color, body posture, facial expression). The signals generated by these changes toward the brain itself produce changes that are mostly perceptible to the individual in whom they were enacted, which then provide the essential ingredients for what is ultimately perceived as a feeling. Thus, emotions are what an outside observer can see, or at least can measure through neuroscientific tools. Feelings are what the individual senses or subjectively experiences.
An emotion begins with a stimulus (imagined or perceived), such as a snake, a speaking engagement, or the person you are in love with. In neural terms, images related to the emotional stimulus are represented in one or more of the brain’s sensory processing systems. Regardless of how short this presentation is, signals related to the presence of that stimulus are made available to a number of emotion-triggering sites elsewhere in the brain. Two of these emotion-triggering sites are the amygdala and the orbitofrontal cortex (Figure 38.1). Evidence suggests that there may be some difference in the way the amygdala and the orbitofrontal cortex process emotional information: The amygdala is more engaged in the triggering of emotions when the emotional stimulus is present in the environment; the orbitofrontal cortex is more important when the emotional stimulus is recalled from memory (Bechara, Damasio, & Damasio, 2003).
To create an emotional state, the activity in triggering sites must be propagated to execution sites by means of neural connections. The emotion execution sites are visceral motor structures that include the hypothalamus, the basal forebrain, and some nuclei in the brain stem tegmentum (Figure 38.1). Feelings result from neural patterns that represent changes in the body’s response to an emotional object. Signals from body states are relayed back to the brain, and representations of these body states are formed at the level of
visceral sensory nuclei in the brain stem. Representations of these body signals also form at the level of the insular cortex and lateral somatosensory cortex (Figure 38.1). It is most likely that the reception of body signals at the level of the brain stem does not give rise to conscious feeling as we know it, but the reception of these signals at the level of the cortex does so. The anterior insular cortex plays a special role in mapping visceral states and in bringing interoceptive signals to conscious perception. It is less clear whether the anterior insular cortex also plays a special role in translating the visceral states into subjective feeling and self-awareness. In A. R. Damasio’s (1999) view, feelings arise in conscious awareness through the representation of bodily changes in relation to the emotional object (present or recalled) that incited the bodily changes. A first-order mapping of self is supported by structures in the brain stem, insular cortex, and somatosensory cortex. However, additional regions, such as the anterior cingulate cortex, are required for a second-order mapping of the relationship between organism and emotional object, and the integration of information about the body with information about the world. According to the somatic marker hypothesis (A. R. Damasio, 1994), the sensory mapping of visceral responses not only contributes to feelings, but is also important for the execution of highly complex, goal-oriented behavior. In this view, visceral responses function to mark potential choices as advantageous or disadvantageous. This process aids in decision making in which there is a need to weigh positive and negative outcomes that may not be predicted decisively through cold rationality alone. Both the Jamesian view and the somatic marker hypothesis hold that the brain contains a system that translates the sensory properties of external stimuli into changes in the visceral state that reflect their biological relevance. 
We propose that this is the essential function of the ventromedial prefrontal cortex (vmPFC), a function that ties control of the visceral state to decision making and affect. In this chapter we review evidence that the vmPFC plays a role in eliciting visceral responses that are related to the value of objects and events in the world. We start by discussing anatomical and physiological evidence that the vmPFC can both influence the state of the viscera and also register changes in the viscera elicited by biologically relevant stimuli. We then review the results of lesion studies in humans showing that the vmPFC is necessary for eliciting visceral responses to certain forms of emotional stimuli. Finally, we review evidence supporting the somatic marker hypothesis, showing that the visceral responses that are controlled by the vmPFC play a role in guiding decision making in the face of uncertain reward and punishment.
It is important to clarify up front that although the Jamesian and the Damasio views have insinuated that different emotions and bodily states (or somatic states) are characterized by a unique signature of visceral responses (see Rainville et al., 2006, for an example), the fact remains that the preponderance of evidence does not seem to support this depiction (for reviews, see, e.g., Cacioppo, Berntson, Larsen, Poehlmann, & Ito, 2000; Cacioppo, Klein, Berntson, & Hatfield, 1993). However, the cumulative evidence suggesting that there are no physiological response profiles that differentiate discrete emotions is not necessarily fatal to the concept of the somatic marker hypothesis, as interoception from visceral responses is still likely to be playing a role (see the discussion of the somatovisceral afference model of emotion, SAME, which was first proposed in 1992 to explain precisely how the undifferentiated visceral responses might produce immediate, discrete, and indubitable emotions; Cacioppo et al., 1993, 2000). Thus, somatic markers can be viewed as becoming differentiated at the level of the central nervous system and not necessarily in the periphery, although peripheral visceral input still plays a key role.
VISCERAL FUNCTIONS OF THE vmPFC
The terms ventromedial prefrontal cortex and orbitofrontal cortex (OFC) are often used interchangeably in the literature, even though they do not refer to identical regions. For this reason it is necessary to clarify exactly what we mean when we use these terms. The OFC is the entire cortex occupying the ventral surface of the frontal lobe, dorsal to the orbital plate of the frontal bone. We have used the term vmPFC to designate a region that encompasses medial portions of the OFC along with ventral portions of the medial prefrontal cortex. The vmPFC is an anatomical designation that has arisen because lesions that occur in the basal portions of the anterior fossa, which include meningiomas of the cribriform plate and falx cerebri, and aneurysms of the anterior communicating and anterior cerebral arteries, frequently lead to damage in this area (Figure 38.2). Often this damage is bilateral. With respect to the cytoarchitectonic fields identified in the human orbitofrontal and medial prefrontal cortices by Price and colleagues (Ongur & Price, 2000), the vmPFC comprises Brodmann area (BA) 14 and medial portions of BA 11 and 13 on the orbital surface and BA 25 and 32 and caudal portions of BA 10 on the mesial surface. The vmPFC excludes lateral portions of the OFC, namely BA 47/12, as well as more dorsal and posterior regions of BA 24 and 32 of the medial prefrontal cortex. The vmPFC is thus a relatively large and heterogeneous area.
Figure 38.2 (Figure C.36 in color section) A: The orbitofrontal cortex. B: The location of the vmPFC as defined in our lesion studies and in this chapter. Note: (A) On the sagittal section (left), the medial sector of the orbitofrontal cortex is depicted on the area of the brain highlighted in yellow. On the coronal slice (right), both the medial and lateral areas of the orbitofrontal cortex are depicted on the area of the brain highlighted in yellow. (B) A map showing areas of the brain that are damaged in patients who show impairments in visceral response and decision making. The colors reflect the number of subjects with damage in a given voxel. The region of greatest overlap is the vmPFC. Note the involvement of medial wall and medial orbitofrontal areas and the relative absence of involvement of the lateral orbitofrontal areas.
Viewing the vmPFC as a single region may blur the distinction between functions subserved by the OFC on the one hand and the ventral portion of the medial prefrontal cortex on the other. Recent evidence in both rodents (Chudasama & Robbins, 2003) and nonhuman primates (Pears, Parkinson, Hopewell, Everitt, & Roberts, 2003) suggests that these regions subserve distinct motivational and learning functions. Thus, lesions of the vmPFC in humans may disrupt more than one process. This is important to keep in mind when inconsistencies arise between animal studies and human studies. These differences are also important when comparing human lesion studies, which tend to examine the functions of relatively large regions, and functional imaging studies, which reveal more focused patterns of activity. The vmPFC encompasses different regions that have been identified in functional imaging studies. The vmPFC includes the medial prefrontal area identified as being deactivated by a broad range of cognitive tasks that require focused attention, reflecting a high level of resting activity that is suspended during goal-directed behavior (Raichle et al., 2001). Other investigators have argued that the
vmPFC encompasses the medial orbitofrontal area, in which activity is consistently related to the “reward value” of hedonically positive stimuli (Kringelbach & Rolls, 2004). In addition, the vmPFC includes the subgenual cingulate cortex (Figure 38.2), an area that has been implicated through functional imaging studies in the pathogenesis of mood disorders (Drevets et al., 1997). It remains to be seen whether these seemingly distinct functions reflect the operation of a single brain area that can be called the vmPFC, or instead are due to separate mental processes mediated by three functionally distinct areas encompassed by the vmPFC. There are two points that need to be considered when interpreting functional imaging studies of the vmPFC. First, as with all cognitive functions, activation or deactivation may show that this region is engaged by a particular function, but does not mean that it is necessary for the function to be performed. Thus, even though activity in the vmPFC can be shown to correlate with the reward value of hedonically positive stimuli, it still remains to be seen whether lesions in the human vmPFC disrupt the subjective experience of, for example, pleasure. Second, in fMRI studies the vmPFC undergoes significant BOLD signal dropout due to its location near an air–tissue interface. For this reason, the failure to detect activation or deactivation of the vmPFC using fMRI should not be taken as evidence that the vmPFC is not involved in the function under investigation, unless special procedures have been implemented to overcome signal dropout. For example, an fMRI study of decision making (Fukui, Murai, Fukuyama, Hayashi, & Hanakawa, 2005) using a task (the Iowa Gambling Task) on which subjects with vmPFC damage are impaired did not show activation of the vmPFC. An earlier positron emission tomography (PET) study (Ernst et al., 2002) using the same task did find activation in the vmPFC. 
Although one may cite differences in the experimental conditions or data analysis to explain this discrepancy, perhaps the most parsimonious explanation is that the Fukui et al. study did not use procedures to counteract BOLD signal dropout in the vmPFC, which was not an issue in the Ernst et al. study. This points to the larger possibility that the vmPFC is involved in a broader set of functions than would be indicated by fMRI studies alone. Nauta (1971) and then Neafsey (1990) proposed that the ventral prefrontal cortex represents a distinct visceromotor output region. Price and colleagues (Ongur & Price, 2000) later refined this concept, synthesizing the previous anatomic literature on the prefrontal cortex with their own anatomical studies in macaques. In their conception, the ventral prefrontal cortex (a region they term “the orbitomedial prefrontal cortex”) is composed of functionally
distinct orbital and medial networks. The orbital network is essentially a sensory input area that receives afferents from late sensory cortices for vision, audition, olfaction, taste, and visceral sensation and has reciprocal connections with the dorsolateral prefrontal cortex. The medial network is essentially a visceromotor output area that sends projections to subcortical structures that are involved in emotional and motivational processes, such as the amygdala and the nucleus accumbens, as well as regions of the brain stem and hypothalamus that directly govern the state of the viscera. In between the orbital and medial networks lies a transitional zone that is interconnected with both the orbital and medial networks that may function to transfer information between those networks. According to this scheme, the vmPFC, as we define it, corresponds largely to the medial and intermediate networks, areas that are largely concerned with translating highly processed sensory inputs into visceromotor output. Much of the early evidence for the role of the vmPFC in the control of visceral functions came from studies examining the cardiorespiratory effects of stimulation in this area in cats and macaques, but further support for the role of the vmPFC in visceral motor functions also comes from lesion studies. Early studies in humans (Luria, Pribram, & Homskaya, 1964) examined the effects of relatively large lesions of the frontal lobes on visceral functions. Our own laboratory has performed studies examining the effects of lesions in an array of cortical areas on visceral responses (Tranel & Damasio, 1994). These studies demonstrated that a number of cortical regions, including the vmPFC but also the anterior cingulate cortex and the right inferior parietal cortex, are necessary for the generation of visceral responses to sensory stimuli. 
These studies also showed that the role of the vmPFC in governing visceral responses is specific to stimuli with emotional or social content. More recent evidence for the visceral functions of the vmPFC comes from functional imaging studies. One study showed that neural activity in the vmPFC covaries with visceral responses, this time with skin conductance response during anticipation and receipt of monetary rewards (Critchley, Elliott, Mathias, & Dolan, 2000). In this study, skin conductance responses were modeled as discrete events, which allowed for the correlation with brain activity both preceding and following the responses. Using this approach it was possible to show that activity in the vmPFC was related to both the generation of skin conductance responses and the afferent mapping of skin conductance responses, indicating both visceral motor and visceral sensory functions for the vmPFC. Using fMRI it has also been shown that activity in vmPFC is correlated with skin conductance response across a variety of cognitive
states, including the resting state (Nagai, Critchley, Featherstone, Trimble, & Dolan, 2004; Patterson, Ungerleider, & Bandettini, 2002). These studies suggest that the visceral functions of the vmPFC are not specific to emotional stimuli, contrary to the results of the lesion studies described earlier. However, just because activity in the vmPFC is related to visceral responses in a given context does not mean that it is necessary for the generation of visceral responses in that context. Both the somatic marker hypothesis and the Jamesian view of emotion place importance on the sensory representations of the visceral responses that are elicited by biologically relevant stimuli. An important question, therefore, regards how visceral responses, once generated by the vmPFC and deployed in the body, are mapped in the brain. Visceral sensations are represented at multiple levels of the neuraxis, including the spinal cord, brain stem, hypothalamus, thalamus, and cortex (Craig, 2002). Each of these stages of visceral representation may have a specific role to play in affective and executive processes. The sensory representation of the viscera within the insular cortex, in particular the right anterior insular cortex, has been proposed to play a special role in conscious emotional feelings (Craig, 2002; Damasio et al., 2000). The right insular cortex, along with the right somatosensory cortices, has also been proposed (A. R. Damasio, 1994) to be a component of the somatic marker network for decision making. The anterior (agranular) insular cortex projects to a number of areas involved in emotion and motivation, including the amygdala and the nucleus accumbens. The anterior insular cortex also projects to the vmPFC, both via the orbital network and through direct projections to medial network areas (Flynn, Benson, & Ardila, 1999).
In addition, recent anatomical evidence suggests that the right anterior insular cortex has evolved special functions in higher primates (Craig, 2002), consistent with a role in conscious feelings. Indeed, the right anterior insular cortex has been shown to be active during a number of subjective feeling states (Critchley, Wiens, Rotshtein, Ohman, & Dolan, 2004; A. R. Damasio et al., 2000; Lane, Reiman, Ahern, Schwartz, & Davidson, 1997). Thus, the visceral sensory representation within the right anterior insular cortex may play a role in the feelings that accompany decision making, such as hunches and gut feelings that may guide decision making in the face of uncertainty. According to the somatic marker hypothesis (Bechara & Damasio, 2005), visceral sensory signals can also influence decision making by acting on brain stem nuclei for ascending neurotransmitter systems, including dopaminergic, serotonergic, and noradrenergic systems. These neurotransmitter
systems exert widespread influence on the function of the prefrontal cortex, including the dorsolateral, medial, and orbital prefrontal cortices, and on subcortical structures, including the amygdala and the ventral and dorsal striata. Through these projections, ascending neurotransmitter systems play a role in multiple attentional, executive, and motivational processes (Berridge & Robinson, 1998; Rahman, Sahakian, Cardinal, Rogers, & Robbins, 2001). There is evidence for direct visceral sensory inputs to these nuclei and that visceral states can influence neurotransmitter release from these nuclei (Berntson, Sarter, & Cacioppo, 2003). The somatic marker hypothesis holds that visceral states, via their influence on ascending neurotransmitter systems, can influence decision making both by promoting the maintenance of specific goals in working memory and by biasing of behavior toward these goals. In this framework, brain stem neurotransmitter nuclei may also be engaged by “as if” loops. Here, areas such as the vmPFC, instead of triggering visceral responses in the body that feed back to brain stem neurotransmitter nuclei, facilitate neurotransmitter release via direct brain stem projections that bypass the body. This triggers neurotransmitter release as if a visceral response had been expressed in the body. As discussed earlier, functional imaging evidence (Critchley et al., 2000) indicates that the vmPFC, in addition to its role in the generation of visceral responses, plays a role in the sensory representation of the visceral state. The vmPFC may receive visceral sensory information from the insular cortex or via ascending neurotransmitter systems. One function of the visceral sensory inputs to the vmPFC may be to represent the visceral responses that are themselves induced within the vmPFC, amygdala, or other areas that trigger visceral responses to biologically relevant stimuli. 
This function may allow the vmPFC to compare the sensory representations of the visceral state with an efferent copy of the visceral response evoked by a biologically relevant stimulus. Differences between these two inputs may signal that a goal has been achieved; that is, a consummatory event has occurred that has altered the state of the viscera. The vmPFC has available to it information regarding the sensory consequences of innately pleasurable consummatory behaviors that impinge directly on the viscera, such as feeding, as well as innately aversive bodily states that result from actual or potential tissue damage (nociception). For example, it has been shown that activity in the vmPFC is correlated with the subjective pleasantness of stimuli such as taste (Kringelbach & Rolls, 2004), the oral sensations elicited by water (Denton et al., 1999a, 1999b), and pleasant touch (Rolls, 2000). In addition, activity in the vmPFC is correlated with subjective ratings of the intensity of thermal pain (Craig, 2002). All of these
stimuli have direct homeostatic relevance and are signaled through a distinct sensory channel that includes the insular cortex (Craig, 2002). This suggests that the vmPFC represents the hedonic value of the visceral sensory signals generated by innately rewarding or punishing consummatory behavior (so-called primary reinforcers; Kringelbach & Rolls, 2004). These representations may be important for the feelings of pleasure and pain that accompany good and bad outcomes. However, there is a lack of human lesion evidence that indicates that the vmPFC mediates the subjective hedonic impact of interoceptive stimuli. Another possibility is that interoceptive or visceral signals within the vmPFC may be important for learning to associate the hedonic consequences of behavior with the particular courses of action that precede them. This is supported by the findings from functional imaging studies that the vmPFC is activated when feedback indicating a correct choice is signaled specifically via an interoceptive route (Hurliman, Nagode, & Pardo, 2005). In this way, interoceptive signals generated by primary reinforcers may form the basis for representations of more abstract rewards and punishments that are elicited within the vmPFC.

DECISION-MAKING FUNCTIONS OF THE vmPFC

Our laboratory’s interest in the functions of the vmPFC was fueled by observations in neurological patients that lesions in this area led to profound impairments in personality and real-life decision-making capabilities. One of the first and most famous cases of the so-called frontal lobe syndrome was the patient Phineas Gage, a railroad construction worker who survived an explosion that blasted an iron tamping bar through the front of his head (Harlow, 1848). Before the accident Gage was a man of normal intelligence, energetic and persistent in executing his plans of operation. He was responsible, sociable, and popular among peers and friends. After the accident his medical recovery was remarkable. He survived the accident with normal intelligence, memory, speech, sensation, and movement. However, his behavior changed completely. He became irresponsible, untrustworthy, and impatient of restraint or advice when it conflicted with his desires. Using modern neuroimaging techniques, H. Damasio and colleagues (H. Damasio, Grabowski, Frank, Galaburda, & Damasio, 1994) have reconstituted the accident by relying on measurements taken from Gage’s skull. The key finding of this neuroimaging study was that the most likely placement of Gage’s lesion included the vmPFC region, bilaterally.
The case of Phineas Gage paved the way for the notion that the frontal lobes were linked to social conduct, judgment, decision making, and personality. A number of instances similar to Gage’s case have since appeared in the literature (Ackerly & Benton, 1948; Brickner, 1932; Welt, 1888). Interestingly, these cases received little attention for many years. Over the years, we have studied numerous patients with this type of lesion. Such patients with damage to the vmPFC develop severe impairments in personal and social decision making, in spite of otherwise largely preserved intellectual abilities. These patients had normal intelligence and creativity before their brain damage. After the damage they begin to have difficulties planning their workday and future and difficulties in choosing friends, partners, and activities. The actions they elect to pursue often lead to losses of diverse order, for example, financial losses, losses in social standing, losses of family and friends. The choices they make are no longer advantageous and are remarkably different from the kinds of choices they were known to make in the premorbid period. These patients often decide against their best interests. They are unable to learn from previous mistakes, as reflected by repeated engagement in decisions that lead to negative consequences. In striking contrast to this real-life decision-making impairment, problem-solving abilities in laboratory settings remain largely normal. As noted, the patients have normal intellect, as measured by a variety of conventional neuropsychological tests (Bechara, Damasio, Tranel, & Anderson, 1998; A. R. Damasio, Tranel, & Damasio, 1990; Eslinger & Damasio, 1985; Saver & Damasio, 1991).

The Genesis of the Somatic Marker Hypothesis

When we first observed the real-life decision-making deficits of patients with vmPFC damage, a good deal of evidence for the visceral motor functions of the vmPFC, described earlier, had already accumulated.
The question then arose as to whether the decision-making deficits caused by vmPFC damage were related to its visceromotor functions. Nauta (1971) had by then proposed that the guidance of behavior by the frontal lobes was linked to the interoceptive and visceromotor functions of this area. Specifically, he proposed that the prefrontal cortex, broadly defined, functioned to compare the affective responses evoked by the various choices for behavior and to select the option that “passed censure by an interoceptive sensorium” (p. 172). According to Nauta, the “interoceptive agnosia” suffered by patients with frontal lobe damage could explain their impairments in real life, as well as their poor performance on various tests of executive function, including the Wisconsin card sort task. This model was
meant to explain the function of the prefrontal cortex as a whole. Furthermore, it was meant as a broad explanation of executive function deficits, not of a specific deficit in decision making within the social and personal domains. However, the deficits of patients with damage in the vmPFC were limited to the personal and social domains; patients with focal vmPFC damage showed marked impairments in their real-life personal and social functioning but had intact intelligence. Indeed, these patients performed normally on standard laboratory tests of executive function such as the Wisconsin card sort task. This background helped to shape a more specific formulation, deemed the “somatic marker hypothesis” (A. R. Damasio, 1994; A. R. Damasio, Tranel, & A. R. Damasio, 1991). According to this hypothesis, patients with damage in the vmPFC make poor decisions in part because they are unable to elicit somatic (visceral) responses that mark the consequences of their actions as positive or negative. In this framework, the vmPFC functions to elicit visceral responses that reflect the anticipated value of the choices. Though this function is specific to the vmPFC, it draws on information about the external world that is represented in multiple higher order sensory cortices. Furthermore, this function is limited to specific types of decision making, in particular those situations where the meaning of events is implied and the consequences of behavior are uncertain. These are situations, such as social interactions and decisions about one’s personal and financial life, where the consequences of behavior have emotional value; that is, they can be experienced as subjective feelings and can also increase or decrease the likelihood of similar behavior in the future (they are rewarding or punishing). Furthermore, these are situations where the rules of behavior are not explicit but yet require some form of mental deliberation in real time in order to navigate them successfully. 
This form of reasoning is distinct from reasoning that does not require the weighing of positive and negative consequences, or in which the outcomes of decisions are known with a high degree of certainty. In addition to explaining the specificity of the impairments in patients with vmPFC damage, the somatic marker framework leads to testable hypotheses about the kinds of information represented within the vmPFC and the relationship of this information to the state of the viscera.

Lesions of the vmPFC Impair Visceral Responses to Emotional Stimuli

One of the first empirical tests of the somatic marker hypothesis came from studies examining the effects of vmPFC lesions on the visceral response to complex visual
stimuli (A. R. Damasio et al., 1990). Though it was known that the vmPFC played a role in the elicitation of visceral responses, the precise behavioral context in which visceral functions were engaged by the vmPFC was not known. In other words, it was possible that the vmPFC functioned to elicit visceral responses to all forms of stimuli, or only certain types of stimuli, such as those that, like social signals, possess emotional value that is often largely implicit. According to the somatic marker hypothesis, the visceral responses that were mediated by the vmPFC were especially related to the implied meaning of social stimuli. To address this hypothesis, one study examined the visceral responses of a group of patients with damage in the vmPFC, all of whom showed a pattern of behavior that reflected an inability to make advantageous decisions in the personal and social realms despite intact intellectual functioning. The subjects were shown a series of affectively charged pictures, including pictures of mutilations, disasters, and sexual images, along with a series of neutral pictures. The patients were tested in two conditions, one in which they watched the stimuli passively and another in which they were asked to describe the pictures in terms of their emotional content. After each stimulus, the skin conductance response (SCR), an index of sympathetic arousal, was measured. In addition, SCRs to orienting stimuli, including loud noises and deep breaths, were measured. The responses of patients with vmPFC damage were compared to the responses of patients with damage to regions outside the vmPFC as well as the responses of neurologically intact comparison subjects. It was found that patients with vmPFC damage were significantly impaired in their visceral responses to emotional pictures when required to view them passively, compared to both neurologically intact and brain-damaged comparison subjects. 
However, when required to comment on the content of the pictures, the visceral responses of the vmPFC patients to the emotional pictures were largely intact. In addition, the vmPFC patients showed intact SCRs to orienting stimuli. These results indicate that the vmPFC plays a role in the elicitation of visceral responses to biologically relevant stimuli. This impairment is specific to stimuli for which emotional meaning must be decoded through cognitive processes and does not extend to stimuli that elicit visceral responses because they are innately aversive or arousing (e.g., a loud noise) or to stimuli that are physiological elicitors of visceral responses (e.g., a deep breath). The results also imply that the vmPFC mediates the visceral response to emotional stimuli when the evaluation of these stimuli does not require verbal mediation, that is, when it is implicit.

Lesions in the vmPFC Lead to Impairments in the Iowa Gambling Task

Once it was known that patients with vmPFC damage were abnormal both in their capacity to make decisions and in their ability to respond viscerally to the emotional meaning of certain stimuli, it still remained to be shown that these two abnormalities were linked. Up to this point, the behavioral abnormalities of these patients, which were striking in real life, had largely eluded conventional neuropsychological and laboratory tests. Thus, it was important to develop a laboratory test that simulated the real-life decisions in which patients with vmPFC damage failed. This test factored in reward and punishment, as well as the uncertainty and risk that accompany many real-life decisions. In addition, this test required participants to reason and deliberate the outcome of choices in real time.

The Iowa Gambling Task

The Iowa Gambling Task (IGT; Bechara, Damasio, Damasio, & Anderson, 1994; Bechara, Tranel, & Damasio, 2000) uses four decks of cards, named A, B, C, and D. The goal in the task is to maximize profit on a loan of play money. Subjects are required to make a series of 100 card selections. However, they are not told ahead of time how many card selections they are going to make. Subjects can select one card at a time from any deck they choose, and they are free to switch from any deck to another at any time and as often as they wish. However, the subject’s decision to select from one deck versus another is largely influenced by various schedules of immediate reward and future punishment. These schedules are preprogrammed and known to the examiner but not to the subject, and they entail the following principles. Every time the subject selects a card from deck A or deck B, the subject gets $100. Every time the subject selects a card from deck C or deck D, the subject gets $50. However, in each of the four decks, subjects encounter unpredictable punishments (money loss).
The punishment is set to be higher in the high-paying decks A and B and lower in the low-paying decks C and D. For example, if one picks 10 cards from deck A, one would earn $1,000. However, in those 10 card picks, five unpredictable punishments would be encountered, ranging from $150 to $350, bringing a total cost of $1,250. Deck B is similar: Every 10 cards picked from deck B would earn $1,000; however, these 10 card picks would encounter one high punishment of $1,250. On the other hand, every 10 cards from deck C or D earn only $500, but they cost only $250 in punishment. Hence, decks A and B are disadvantageous because they cost more in the long run; that is, one loses $250 every 10 cards. Decks C and D are advantageous because they
result in an overall gain in the long run; that is, one wins $250 every 10 cards. We investigated the performance of normal controls and patients with vmPFC lesions on this task. Normal subjects avoided the bad decks A and B and preferred the good decks C and D. In sharp contrast, the vmPFC patients did not avoid the bad decks A and B; indeed, they preferred those decks (Figure 38.3). From these results we suggested that the patients’ performance profile is comparable to their real-life inability to decide advantageously. This is especially true in personal and social matters, a domain for which, in life, as in the task, an exact calculation of the future outcomes is not possible and choices must be based on hunches and gut feelings.

Lesions in the vmPFC Disrupt Visceral Responses during the Iowa Gambling Task

In light of the finding that the IGT is an instrument that detects the decision-making impairment of vmPFC patients in the laboratory, we went on to address the next question: whether the impairment is linked to a failure in somatic signaling (Bechara, Tranel, Damasio, & Damasio, 1996). To address this question, we added a physiological measure to the IGT. The goal was to assess somatic state activation while subjects were making decisions during performance of the task. We studied two groups: normal subjects and vmPFC patients. We had them perform the IGT while we recorded their electrodermal activity (SCRs). As the body begins to change after a thought, and as a given somatic state begins to be enacted, the autonomic nervous system begins to increase the activity in the skin’s sweat glands. Although this sweating activity is relatively small and not observable by the naked eye, it can be amplified and recorded by a polygraph as a wave. The amplitude of this wave can be measured and thus provide an indirect measure of the somatic state experienced by the subject.
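The deck arithmetic described for the Iowa Gambling Task can be checked with a short sketch. It encodes only the per-10-card totals given in the text (reward per card and total punishment per 10 cards); the dictionary layout and function names are illustrative assumptions, and the sketch averages punishments over cards rather than reproducing the actual unpredictable punishment schedule.

```python
# Per-deck totals as stated in the text (play money, in dollars).
DECKS = {
    "A": {"reward_per_card": 100, "loss_per_10_cards": 1250},
    "B": {"reward_per_card": 100, "loss_per_10_cards": 1250},
    "C": {"reward_per_card": 50, "loss_per_10_cards": 250},
    "D": {"reward_per_card": 50, "loss_per_10_cards": 250},
}


def net_per_10_cards(deck):
    """Net outcome of 10 picks from one deck: total reward minus total punishment."""
    schedule = DECKS[deck]
    return 10 * schedule["reward_per_card"] - schedule["loss_per_10_cards"]


def expected_total(selections):
    """Long-run expected outcome of a sequence of deck choices.

    Simplification: punishment is spread evenly over cards, so this gives
    the average result, not the outcome of any single session.
    """
    return sum(net_per_10_cards(deck) / 10 for deck in selections)


net_by_deck = {deck: net_per_10_cards(deck) for deck in DECKS}
all_advantageous = expected_total(["C"] * 50 + ["D"] * 50)
all_disadvantageous = expected_total(["A"] * 50 + ["B"] * 50)
```

Under these totals, 100 picks made entirely from decks C and D gain $2,500 on average, whereas 100 picks made entirely from decks A and B lose $2,500, even though each individual card from A or B pays twice as much.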
[Figure 38.3 plot: two panels, Normal Control and vmPFC Patients. Vertical axis: total number of cards selected from decks (0 to 20). Horizontal axis: order of card selection from the 1st to the 100th trial, in blocks of 20 (1–20, 21–40, 41–60, 61–80, 81–100). Series: disadvantageous decks (A & B) and advantageous decks (C & D).]
Both normal subjects and vmPFC patients generated SCRs after they had picked a card and were told that they won or lost money. The most important difference, however, was that normal subjects, as they became experienced with the task, began to generate SCRs prior to the selection of any cards, that is, during the time they were pondering which deck to choose. These anticipatory SCRs were more pronounced before picking a card from the risky decks A and B when compared to the safe decks C and D. In other words, these anticipatory SCRs were like gut feelings that warned the subject against picking from the bad decks. Patients with vmPFC damage failed to generate such SCRs before picking a card. This failure to generate anticipatory SCRs before picking cards from the bad decks correlates with their failure to avoid these bad decks and choose advantageously in this task (Figure 38.4). These results provide strong support for the notion that decision making is guided by emotional signals (gut feelings) that are generated in anticipation of future events. An important question regards the information content of visceral responses that are elicited by the vmPFC. If somatic markers are to be useful in guiding decision-making processes involving uncertain reward and punishment, then they should provide information about both the valence of an anticipated outcome (e.g., whether a choice will result in winning or losing money) and the magnitude of the anticipated outcome (e.g., how much money will be won or lost). Our results using the IGT show that the vmPFC triggers anticipatory visceral responses to both the advantageous and the disadvantageous decks. These responses are larger for disadvantageous decks than for advantageous decks, though they are still deployed for the advantageous decks. 
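The comparison of anticipatory SCRs between deck types amounts to averaging the pre-choice SCR magnitudes separately for the disadvantageous and advantageous decks. The sketch below shows that grouping step under stated assumptions: the trial format, function name, and sample values are hypothetical and are not data from the studies described here.

```python
from statistics import mean

DISADVANTAGEOUS = {"A", "B"}  # decks C and D are treated as advantageous


def mean_anticipatory_scr(trials):
    """Average pre-choice SCR magnitude for each deck type.

    trials: iterable of (deck, scr) pairs, where scr is the magnitude
    recorded while the subject was deliberating, before the card was picked.
    Returns None for a deck type that was never chosen.
    """
    groups = {"disadvantageous": [], "advantageous": []}
    for deck, scr in trials:
        key = "disadvantageous" if deck in DISADVANTAGEOUS else "advantageous"
        groups[key].append(scr)
    return {key: (mean(values) if values else None) for key, values in groups.items()}


# Made-up example values, for illustration only.
example_trials = [("A", 1.0), ("B", 1.5), ("C", 0.5), ("D", 0.25)]
summary = mean_anticipatory_scr(example_trials)
```

A larger mean for the disadvantageous decks than for the advantageous decks would correspond to the pattern reported for normal subjects; a flat profile would correspond to the pattern reported for vmPFC patients.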
Further experiments from our laboratory (Bechara, Dolan, & Hindes, 2002) and others (Tomb, Hauser, Deldin, & Caramazza, 2002) have shown that when the reward-punishment contingencies are reversed, with the disadvantageous decks paying out a lower quantity of reward rather than doling out a higher punishment, the SCRs are now greater to the advantageous decks than to the disadvantageous decks. This suggests that SCR is not merely an index of the potential “badness” of choices. Rather, SCR can index the magnitude of both the anticipated negative and the anticipated positive outcomes of a choice. It seems, however, that SCR does not differentiate the anticipated valence of the outcomes. This is consistent with work by others (Lang, Bradley, Cuthbert, & Patrick, 1993) showing that SCR does not differentiate the hedonic valence of emotional stimuli but does index the magnitude of the arousal that they elicit. This would mean that some other signal is required in order to assess the valence of the anticipated outcome. Although our laboratory (Rainville et al., 2006) has provided preliminary evidence that cardiovascular responses, such as changes in heart rate, can provide information that distinguishes between positive and negative emotional states, the fact remains that the preponderance of evidence speaks to the lack of such a distinction at the peripheral visceral level (Cacioppo et al., 1993, 2000). Although it is possible that such signals can combine with those reflected in the SCR to provide information about both the perceived valence and the perceived magnitude of the future outcome of a choice, there is a strong likelihood that this discrimination is not achieved until the signals reach the central nervous system. Indeed, the somatovisceral afference model of emotion does provide an explanation for how undifferentiated visceral responses might produce distinguishable emotions (Cacioppo et al., 1993, 2000). Perhaps somatic markers operate in a fashion that is consistent with that model. Future experiments may examine at what level of the brain the visceral signals reflecting different channels of autonomic outflow become differentiated in such a manner as to exert influence on decision making.

Figure 38.3 Card selection on the Iowa Gambling Task as a function of group (normal control, vmPFC), deck type (disadvantageous versus advantageous), and trial block. Note: Normal control subjects shifted their selection of cards to the advantageous decks. The vmPFC patients did not make a reliable shift and opted for the disadvantageous decks.

[Figure 38.4 plot: two panels, (A) Control Subjects and (B) Ventromedial Prefrontal Patients. Vertical axis: SCR magnitudes (in μS), 0.0 to 1.5. Horizontal axis: card position within each deck, in blocks of five (1–5 through 36–40). Series: Deck A, Deck B, Deck C, Deck D.]
Figure 38.4 Magnitudes of anticipatory SCRs as a function of group (control [A] versus vmPFC [B]), deck, and card position within each deck.
Note: Control subjects gradually began to generate high-amplitude SCRs to the disadvantageous decks. The vmPFC patients failed to do so.
Visceral Responses That Signal the Correct Strategy Do Not Need to Be Conscious
According to the somatic marker hypothesis, the vmPFC mediates an implicit representation of the anticipated value of choices that is distinct from an explicit awareness of the correct strategy. To test this idea, we performed a study (Bechara, Damasio, Tranel, & Damasio, 1997) in which we examined the development of SCRs over time in relation to subjects’ knowledge of the advantageous strategy in the IGT. In this study the IGT was administered as before, but this time the task was interrupted at regular intervals and the subjects were asked to describe their knowledge about what was going on in the task and their feelings about the task. Normal subjects began to choose preferentially from the advantageous decks before they were able to report why these decks were preferred. They then began to form hunches about the correct strategy, which corresponded to their choosing more from the advantageous decks than from the disadvantageous decks. Finally, some subjects reached a conceptual stage where they possessed explicit knowledge about the correct strategy (i.e., to choose from decks C and D because, although they pay less, they result in less punishment). As before, normal subjects developed SCRs preceding their choices that were larger for the disadvantageous decks than for the advantageous decks. This time it was also found that the SCR discrimination between advantageous and disadvantageous decks preceded the development of conceptual knowledge of the correct strategy. In fact, the SCR discrimination between
advantageous and disadvantageous decks even preceded the development of hunches about the correct strategy. In contrast to the normal subjects, subjects with damage in the vmPFC failed to switch from the disadvantageous decks to the advantageous decks, as in the previous study. In addition, subjects in this group again failed to develop anticipatory responses that discriminated between the disadvantageous and advantageous decks. Furthermore, patients with vmPFC damage never developed hunches about the correct strategy. Together, these results suggest that anticipatory visceral responses that are governed by the vmPFC precede emergence of advantageous choice behavior, which itself precedes explicit knowledge of the advantageous strategy. This further suggests that signals generated by the vmPFC, reflected in visceral states, may function as a nonconscious bias toward the advantageous strategy. More recently, other investigators have questioned whether it is necessary to invoke visceral responses as constituting nonconscious biasing signals (Maia & McClelland, 2004). By using more detailed questions to probe subjects’ awareness of the attributes of each of the decks in the IGT, this study showed that subjects possess explicit knowledge of the advantageous strategy at an earlier stage in the task than was shown in the Bechara et al. (1997) study. Furthermore, the Maia and McClelland study found that subjects began to make advantageous choices at around the same time that they reported knowledge of the correct strategy. Based on these findings, it was argued that nonconscious somatic marker processes are not required in order to explain how decision making occurs. A response to this study has been published elsewhere (Bechara, Damasio, Tranel, & Damasio, 2005), along with a rebuttal by Maia and McClelland (2005). Two points bear discussion here. 
First, because this study did not measure visceral responses and did not examine the effects of brain damage, it does not disprove the hypothesis that somatic markers mediated by the vmPFC play a role in decision making; it only shows that conscious awareness of the correct strategy occurs at around the same time as advantageous decision making. Second, both the Bechara et al. (1997) study and the Maia and McClelland (2004) study found that some subjects continue to make disadvantageous choices despite being able to report the correct strategy. This pattern bears an uncanny resemblance to the way subjects with lesions in the vmPFC are able to report the correct strategies for personal and social decision making, despite their severe deficits in the actual execution of personal and social behavior in real life. Indeed, this clinical observation provided the initial impetus to hypothesize a role for covert biasing processes in decision making in the first place. This indicates that,
in both the IGT and in real life, conscious knowledge of the correct strategy may not be enough to guide advantageous decision making. Thus, some process that operates independently of conscious knowledge of the correct strategy (i.e., somatic markers) must be invoked to explain fully how individuals make advantageous decisions. Indeed, it seems likely that this process can sometimes bias behavior that goes against what a person consciously thinks to be the correct strategy. That nonconscious biasing processes may not precede conscious knowledge in time is potentially an important finding, but it does not provide a basis for rejection of the fundamental role of somatic markers as nonconscious biases of behavior.
The Decision-Making Functions of the vmPFC Are Different from the Decision-Making Functions of the Amygdala
Like patients with damage to the vmPFC, patients with bilateral damage to the amygdala also demonstrate impairments in their ability to make advantageous choices in their personal and social lives (A. R. Damasio, 1994). The amygdala, like the vmPFC, has been strongly implicated in emotional and motivational processes (Cardinal, Parkinson, Hall, & Everitt, 2002; LeDoux, 1996). There is much in common between the amygdala and the vmPFC in terms of cortical and subcortical connectivity. In particular, the amygdala receives information from higher order sensory cortices for vision, olfaction, audition, and visceral sensation (Amaral, Price, Pitkanen, & Carmichael, 1992) and sends output to subcortical sites that regulate the state of the viscera, including nuclei of the brain stem and hypothalamus. Thus, like the vmPFC, the amygdala is positioned to receive multiple sensory inputs pertaining to biologically relevant stimuli and to trigger changes in the visceral state. This suggests that the amygdala plays a role similar to that of the vmPFC in decision making. 
However, there are important differences between the amygdala and the vmPFC in terms of their visceral and decision-making functions. One source of evidence regarding these distinct roles comes from an experiment in which we administered the IGT to a group of subjects with bilateral amygdala damage (Bechara, Damasio, Damasio, & Lee, 1999). Their performance on this task and their SCRs were compared with those of a group of subjects with vmPFC damage and a group of neurologically intact subjects. Similar to subjects with vmPFC damage, subjects with damage to the amygdala performed poorly on the IGT, failing to shift toward choosing more frequently from the advantageous decks. When the SCRs were examined, however, there were different
756 The Somatovisceral Components of Emotions and Their Role in Decision Making
patterns of deficit in amygdala-lesioned subjects versus vmPFC-lesioned subjects. As discussed earlier, patients with vmPFC damage fail to deploy anticipatory visceral responses before their choices, responses that, in normal subjects, are larger before choosing from the disadvantageous decks than before choosing from the advantageous decks. As also noted previously, vmPFC-lesioned subjects continue to have normal SCRs in response to receiving reward and punishment. In contrast, patients with amygdala damage fail to deploy SCRs during both the anticipatory period and in response to receiving rewards and punishments. These data are shown in Figure 38.5. This suggests that the decision-making deficit in patients with amygdala damage is due to an inability to respond viscerally to rewards and punishments. This is different from the deficit in patients with vmPFC damage, who possess an inability to viscerally anticipate uncertain rewards and punishments but who are normal in their ability to respond viscerally to rewards and punishments once they are received.
To examine further the distinction between the affective-visceral functions of the amygdala and the vmPFC, the same subjects underwent a Pavlovian conditioning paradigm (Bechara et al., 1999). Here it was found that patients with bilateral amygdala damage failed to acquire conditioned SCRs. In contrast, patients with vmPFC damage were not different from neurologically intact subjects in their ability to produce conditioned SCRs. Both the amygdala and the vmPFC patients were normal in their SCRs in response to the unconditioned stimulus. This indicates that the amygdala, but not the vmPFC, is required for the acquisition of Pavlovian conditioning. In other words, the visceral responses to stimuli that acquire hedonic value through simple associative learning processes are not mediated by the vmPFC but are mediated by the amygdala. This parallels the dissociation between the amygdala and the vmPFC with respect to the visceral response to reward and punishment in the IGT. The distinction between the affective-visceral functions of the amygdala and the vmPFC may be conceptualized in
Figure 38.5 The visceral functions of the vmPFC are different from those of the amygdala. Note: Groups are normal controls (n = 13), amygdala-lesioned subjects (n = 5), and ventromedial prefrontal (VMF) subjects (n = 5). A: Behavioral performance on the IGT, plotted as the total number of cards selected from the disadvantageous decks (A and B) and the advantageous decks (C and D) in each block of 20 trials, in order of card selection from the 1st to the 100th trial. Both vmPFC- and amygdala-lesioned subjects fail to switch to choosing preferentially from the advantageous decks. B: Area under the curve of SCRs (S/sec) prior to the selection of a card from the advantageous and disadvantageous decks. Both vmPFC- and amygdala-lesioned subjects fail to produce anticipatory SCRs. C: Area under the curve of SCRs (S/sec) after receiving a reward or punishment. vmPFC-lesioned subjects produce normal SCRs in response to reward and punishment, but amygdala-lesioned subjects fail to produce SCRs in response to reward and punishment. Error bars represent the standard error of the mean. Not depicted are results showing that vmPFC-lesioned subjects produce normal SCRs to a classically conditioned stimulus, whereas amygdala-lesioned subjects fail to produce classically conditioned SCRs. From “Different Contributions of the Human Amygdala and Ventromedial Prefrontal Cortex to Decision-Making,” by A. Bechara, H. Damasio, A. R. Damasio, and G. P. Lee, 1999, Journal of Neuroscience, 19, pp. 5473–5481. Reprinted with permission.
terms of the demands of processing biologically relevant stimuli on attention and working memory. In this view, the amygdala triggers visceral responses to stimuli whose biological significance can be decoded in relatively automatic fashion, such as winning or losing money, conditioned stimuli that reliably predict aversive and pleasurable events in the future, and stimuli with innate biological significance, such as the sight of spiders and snakes and facial expressions of fear. We refer to this class of stimuli as “primary inducers.” The visceral responses to primary inducers can be elicited quickly and without thought or complex attention. The vmPFC, in contrast, triggers visceral responses to stimuli whose biological significance must be decoded through a deliberative process. We refer to this class of stimuli as “secondary inducers.” This includes thoughts of future loss or gain, particularly when loss or gain is uncertain, as well as the recollection of pleasant and unpleasant events from the past. Secondary inducers, which may not be present within the sensory field, must be brought to mind to elicit a visceral response. Thus, vmPFC functions are related to attention and working memory and also to the recall of episodic memories.
SUMMARY
The Somatic Marker Hypothesis
Somatic Markers as Executive Processes
According to the somatic marker hypothesis (A. R. Damasio, 1994), the visceral responses elicited during decision making, both during the contemplation of the future outcome of a choice and after the outcome of a choice has been signaled, aid in guiding decisions toward advantageous choices and away from disadvantageous choices. The process that is assessed by the IGT is ultimately a learning process, one in which knowledge of the correct strategy evolves over time. In this view, visceral responses to receiving reward and punishment, which are mediated by the amygdala, contribute to the encoding of the predictive value of the sensory cues and actions that preceded reward and punishment. Over time, through this encoding, subjects learn the association between a given choice and its outcome. This learning may precede explicit awareness of the contingencies between specific choices and their outcome. This learning is expressed by the vmPFC, which evokes learned representations of the predictive value of a choice in the period before a choice is made, when the outcomes of various choices are weighed against each other as they are held in mind. The representation of predictive value is based on the visceral response that is triggered within the vmPFC, an emotional response that marks the value of options for behavior based on past experience.
Within the somatic marker framework, then, the vmPFC functions as a system that holds the affective-visceral properties of objects in mind during the planning and organization of behavior that is directed toward courses of action that are in the overall best interests of the organism. This function falls into the broader executive role of the prefrontal cortex, of which the vmPFC is a part. This role is supported by connections between the vmPFC and higher order sensory cortices, as well as connections between the vmPFC and the dorsolateral prefrontal cortex (both of which are mediated by the orbital network). The connections with higher order sensory cortices provide a route for highly processed information about the sensory properties of biologically relevant stimuli to reach the vmPFC. The connections with the dorsolateral prefrontal cortex link the functions of the vmPFC to executive processes that guide attention and prioritize action, allowing the vmPFC to serve as a buffer for the maintenance of information pertaining to the homeostatic value of goal objects (i.e., predictive value). Thus, the vmPFC is not involved in regulating global working memory processes, as indicated by the finding that damage to the vmPFC does not disrupt performance on broad tasks of working memory (Bechara et al., 1998). However, vmPFC function does require intact working memory processes, as indicated by the finding that damage in regions of the prefrontal cortex that play a global role in working memory impairs performance on the IGT (Bechara, 2004; Clark, Cools, & Robbins, 2004). Some tasks that call on representations of predictive reward value but that do not require this information to be held in working memory may also engage the vmPFC. For example, one study has shown that damage to the vmPFC disrupts both reversal learning and IGT performance (Fellows & Farah, 2005). 
In contrast, damage to the dorsolateral prefrontal cortex impairs performance on the IGT but does not impair reversal learning. An important caveat in the comparison of this study with studies from our laboratory is that the Fellows and Farah study examined damage in posterior regions of vmPFC that also impinged on basal forebrain structures, such as the nucleus accumbens. Our studies, in contrast, have found that lesions restricted to more anterior regions of the vmPFC that do not include the basal forebrain can alter performance on the IGT (Bechara et al., 1998). Thus, it is possible that the reversal learning deficits found in the Fellows and Farah study are attributable to damage in the basal forebrain rather than to damage in the vmPFC. Notwithstanding this, it is possible that both reversal learning and the IGT require an ability to register that the predictive reward value of a stimulus has changed, as well as an ability to inhibit a previously rewarded response. However, unlike the IGT, reversal learning does not require that information about the predictive
reward value of a stimulus be held in working memory. Thus, the vmPFC may be engaged by processes that invoke representations of predictive reward value as well as by processes that require inhibition of a previously rewarded response. Such processes, which may themselves rely on somatic markers, could operate independently of working memory under certain experimental situations, such as reversal learning. In real life, however, where decision making usually requires holding representations of predictive reward value in mind over a delay, they are likely to work in concert with working memory processes.
The Role of Feedback of Somatic Markers in Decision Making
According to the somatic marker hypothesis, the afferent feedback of visceral responses is an important component of the decision-making process. In other words, the visceral responses during the contemplation of choices are necessary for biasing behavior in the advantageous direction, as well as for gut feelings and hunches related to choices. The question arises, then, as to whether the visceral responses induced by the vmPFC are actually necessary for decision making or are merely an epiphenomenal bodily reflection of the operation of certain mental processes. One way to address this question is to directly manipulate the sensory feedback of the visceral state during performance of the IGT. A number of studies have attempted to do this. For example, one study has examined how cervical transection of the spinal cord affects performance on the IGT (North & O’Carroll, 2001). This study found no effect of the manipulation on performance on the IGT. Because the spinal cord carries somatosensory and interoceptive information from the body to the brain (Craig, 2002), this may be taken as evidence that the sensory feedback of bodily states does not contribute to decision making. 
However, a great deal of the information about visceral states is conveyed to the central nervous system via the vagus nerve, which is spared by spinal transection. If visceral states play a special role in signaling homeostatic processes, which we believe they do, then it is not surprising that spinal transection has a limited effect on decision making. Another study (Heims, Critchley, Dolan, Mathias, & Cipolotti, 2004) examined more specifically the role of visceral states in decision making. This study showed that patients with pure autonomic failure, a peripheral nervous disorder that broadly disrupts the ability to deploy visceral responses, do not demonstrate impaired performance on the IGT. This can also be taken as evidence that visceral responses are not necessary for decision making. However, this study did not actually measure visceral responses during the IGT, so it is possible that subjects still produced some form of visceral response during the
task. Also, it is possible that, because pure autonomic failure develops slowly and manifests later in life, significant neural reorganization may take place in subjects with this disease, altering the normal mechanism of decision making. Yet another study (Martin, Denburg, Tranel, Granner, & Bechara, 2004) showed that electrical stimulation of the vagus nerve during the IGT, which largely affects visceral-afferent signaling, can actually improve performance. This can be taken as evidence that visceral states do play a role in decision making. However, this study was limited by the fact that most of the subjects suffered long-standing epilepsy, and many of them had lower than normal decision-making ability to begin with. Functional imaging studies have provided circumstantial evidence of the role of visceral states in decision making. As discussed earlier, the insular cortex is a visceral sensory region that has been hypothesized to play a role in decision making by mapping the visceral responses that are induced by the vmPFC and the amygdala. A number of studies (Craig, 2002) have shown that activity in the insular cortex is correlated with changes in the visceral state. The insular cortex is also activated by decision-making tasks that involve uncertain reward and punishment and an evaluation of emotional information. For example, a PET study (Ernst et al., 2002) has shown that performance of the IGT, in addition to activating the vmPFC, also activates the insular cortex. Moreover, this study found that activity in the insular cortex was correlated with performance on the IGT. The insular cortex has also been shown using fMRI to be activated during other decision-making tasks that involve uncertain reward and punishment (Critchley, Mathias, & Dolan, 2001). In addition, one study (Sanfey, Hastie, Colvin, & Grafman, 2003) found that the insular cortex is activated by the evaluation of the fairness of offers of money. 
Here activity in the insular cortex was shown to be correlated with the tendency to reject unfair offers. Although these studies did not examine visceral responses directly, they show that the insular cortex, an area that has previously been established as a visceral sensory representation area, is engaged during decision making, particularly when the decisions require an evaluation of emotional consequences that are uncertain. Thus, on balance, the evidence seems to favor the role of visceral states in decision making; however, more definitive evidence is required to establish exactly how and under what circumstances visceral states contribute to decision making. Though certain forms of decision making may engage somatic marker processes, it may be that not all forms of decision making require the elicitation and sensory mapping of visceral states. Indeed, the somatic marker hypothesis maintains that, under some conditions, as-if representations of the visceral state, mediated by direct
connections between the vmPFC and brain stem neurotransmitter nuclei, may be sufficient to guide decision making in the advantageous direction. Also, decisions that do not require the weighing of rewarding or punishing consequences or in which the outcome is relatively certain may not engage somatic marker processes at all.
Why Somatic Markers?
A strictly computational approach to decision making may not require that the brain represent signals that are expressed within the body in order to compute the anticipated value of options for behavior (Maia & McClelland, 2004, 2005; Rolls, 1999). It is important to keep in mind, however, that human brains differ from computers in many ways, not the least of which is a concern for the promotion of survival through regulation of the internal milieu, which is the regulation of bodily processes. All nervous systems contain representations of basic bodily processes, such as those that regulate energy demands, reproduction, fluid balance, temperature, and the response to sickness and injury. Survival requires precise control over the state of these processes in order to maintain them within the narrow range that is compatible with life (i.e., homeostasis). The autonomic nervous system functions to make relatively rapid adjustments in the visceral state that maximize the survival value of events in the world that have the potential to impact homeostasis. It can be argued that visceral responses operate merely as reflexes, acting independently of higher order cognitive processes. Indeed, visceral reflexes that are implemented at the level of the spinal cord and brain stem do provide some benefit after the fact for reacting to events that challenge homeostasis. However, it is more advantageous to be able to predict the impact of events on the internal milieu before they occur. 
To do this the brain must connect sensory and motor representations of the viscera with processes that govern perception, learning, memory, and goal-directed behavior. It is clear, based on a multitude of studies, many of which are reported in this volume, that the vmPFC plays a role in a number of cognitive processes. It is also clear that the vmPFC plays a role in the control and mapping of visceral states. The most parsimonious explanation would seem to be that the cognitive processes that are mediated by the vmPFC and the visceral functions mediated by this area are somehow linked. According to the somatic marker hypothesis, the integration of visceral states into higher cognitive functions, such as decision making, is the function of the vmPFC. This function has expanded in evolution, allowing for the planning of behaviors that are executed further into the future and for which the outcomes of behavior in terms of rewards and punishments are more abstract. In humans, as
well as in nonhuman primates and rodents, the vmPFC is involved in the planning of behaviors related to the most immediate and basic needs, such as food, water, and sex. In humans the vmPFC also plays a role in guiding behaviors for which choosing advantageously requires a deliberate concern for one’s long-term well-being as well as knowledge of cultural norms and expectations. In this way the vmPFC may function to link highly evolved human faculties, such as moral behavior, altruism, financial reasoning, creativity, and a sense of purpose in one’s work life and social relationships, to the basic mechanisms that govern survival and the maintenance of homeostasis.
REFERENCES Ackerly, S. S., & Benton, A. L. (1948). Report of a case of bilateral frontal lobe defect. Proceedings of the Association for Research in Nervous and Mental Disease (Baltimore), 27, 479–504. Amaral, D. G., Price, J. L., Pitkanen, A., & Carmichael, S. T. (1992). Anatomical organization of the primate amygdaloid complex. In J. P. Aggleton (Ed.), The amygdala: Neurobiological aspects of emotion, memory, and mental dysfunction (pp. 1–66). New York: Wiley-Liss. Bechara, A. (2004). The role of emotion in decision-making: Evidence from neurological patients with orbitofrontal damage. Brain and Cognition, 55, 30–40. Bechara, A., & Damasio, A. R. (2005). The somatic marker hypothesis: A neural theory of economic decision. Games and Economic Behavior, 52, 336–372. Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50, 7–15. Bechara, A., Damasio, H., & Damasio, A. R. (2003). The role of the amygdala in decision-making. In P. Shinnick-Gallagher, A. Pitkanen, A. Shekhar, & L. Cahill (Eds.), The amygdala in brain function: Basic and clinical approaches (Vol. 985, pp. 356–369). New York: Annals of the New York Academy of Sciences. Bechara, A., Damasio, H., Damasio, A. R., & Lee, G. P. (1999). Different contributions of the human amygdala and ventromedial prefrontal cortex to decision-making. Journal of Neuroscience, 19, 5473–5481. Bechara, A., Damasio, H., Tranel, D., & Anderson, S. W. (1998). Dissociation of working memory from decision making within the human prefrontal cortex. Journal of Neuroscience, 18, 428–437. Bechara, A., Damasio, H., Tranel, D., & Damasio, A. R. (1997, February 28). Deciding advantageously before knowing the advantageous strategy. Science, 275, 1293–1295. Bechara, A., Damasio, H., Tranel, D., & Damasio, A. R. (2005). The Iowa Gambling Task (IGT) and the somatic marker hypothesis (SMH): Some questions and answers. 
Trends in Cognitive Sciences, 9, 159–162. Bechara, A., Dolan, S., & Hindes, A. (2002). Decision-making and addiction: Pt. II. Myopia for the future or hypersensitivity to reward? Neuropsychologia, 40, 1690–1705. Bechara, A., Tranel, D., & Damasio, H. (2000). Characterization of the decision-making impairment of patients with bilateral lesions of the ventromedial prefrontal cortex. Brain, 123, 2189–2202. Bechara, A., Tranel, D., Damasio, H., & Damasio, A. R. (1996). Failure to respond autonomically to anticipated future outcomes following damage to prefrontal cortex. Cerebral Cortex, 6, 215–225.
Berntson, G. G., Sarter, M., & Cacioppo, J. T. (2003). Ascending visceral regulation of cortical affective information processing. European Journal of Neuroscience, 18, 2103–2109. Berridge, K. C., & Robinson, T. E. (1998). What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience? Brain Research Reviews, 28, 309–369. Brickner, R. M. (1932). An interpretation of frontal lobe function based upon the study of a case of partial bilateral frontal lobectomy: Localization of function in the cerebral cortex. Proceedings of the Association for Research in Nervous and Mental Disease (Baltimore), 13, 259–351. Cacioppo, J., Berntson, G., Larsen, J., Poehlmann, K., & Ito, T. (2000). The psychophysiology of emotion. In M. Lewis & J. Haviland-Jones (Eds.), The handbook of emotion (2nd ed., pp. 173–191). New York: Guilford Press. Cacioppo, J., Klein, D. J., Berntson, G. G., & Hatfield, E. (1993). The psychophysiology of emotion. In M. Lewis & J. M. Haviland (Eds.), The handbook of emotion (pp. 119–148). New York: Guilford Press. Cardinal, R. N., Parkinson, J. A., Hall, J., & Everitt, B. J. (2002). Emotion and motivation: The role of the amygdala, ventral striatum and prefrontal cortex. Neuroscience and Biobehavioral Reviews, 26, 321–352. Chudasama, Y., & Robbins, T. W. (2003). Dissociable contributions of the orbitofrontal and infralimbic cortex to Pavlovian autoshaping and discrimination reversal learning: Further evidence for the functional heterogeneity of the rodent frontal cortex. Journal of Neuroscience, 23, 8771–8780. Clark, L., Cools, R., & Robbins, T. (2004). The neuropsychology of ventral prefrontal cortex: Decision-making and reversal learning. Brain and Cognition, 55, 41–53. Craig, A. D. (2002). How do you feel? Interoception: The sense of the physiological condition of the body. Nature Reviews: Neuroscience, 3, 655–666. 
Critchley, H., Elliott, R., Mathias, C. J., & Dolan, R. J. (2000). Neural activity relating to generation and representation of galvanic skin conductance responses: A functional magnetic resonance imaging study. Journal of Neuroscience, 20, 3033–3040. Critchley, H., Mathias, C., & Dolan, R. (2001). Neuroanatomical basis for first- and second-order representations of bodily states. Nature Neuroscience, 4, 207–212.
Denton, D., Shade, R., Zamarippa, F., Egan, G., Blair-West, J., McKinley, M., et al. (1999a). Correlation of regional cerebral blood flow and change of plasma sodium concentration during genesis and satiation of thirst. Proceedings of the National Academy of Sciences, USA, 96, 2532–2537. Denton, D., Shade, R., Zamarippa, F., Egan, G., Blair-West, J., McKinley, M., et al. (1999b). Neuroimaging of genesis and satiation of thirst and an interoceptor-driven theory of origins of primary consciousness. Proceedings of the National Academy of Sciences, USA, 96, 5304–5309. Drevets, W. C., Price, J. L., Simpson, J. R., Todd, R. D., Reich, T., Vannier, M., et al. (1997, April 24). Subgenual prefrontal cortex abnormalities in mood disorders. Nature, 386, 824–827. Ernst, M., Bolla, K., Moratidis, M., Contoreggi, C. S., Matochick, J. A., Kurian, V., et al. (2002). Decision-making in a risk taking task. Neuropsychopharmacology, 26, 682–691. Eslinger, P. J., & Damasio, A. R. (1985). Severe disturbance of higher cognition after bilateral frontal lobe ablation: Patient evr. Neurology, 35, 1731–1741. Fellows, L. K., & Farah, M. J. (2005). Different underlying impairments in decision making following ventromedial and dorsolateral frontal lobe damage in humans. Cerebral Cortex, 15, 58–63. Flynn, F. G., Benson, D. F., & Ardila, A. (1999). Anatomy of the insula: Functional and clinical correlates. Aphasiology, 13(1), 55–78. Fukui, H., Murai, T., Fukuyama, H., Hayashi, T., & Hanakawa, T. (2005). Functional activity related to risk anticipation during performance of the Iowa Gambling Task. NeuroImage, 24, 253–259. Harlow, J. M. (1848). Passage of an iron bar through the head. Boston Medical and Surgical Journal, 39, 389–393. Heims, H. C., Critchley, H. D., Dolan, R., Mathias, C. J., & Cipolotti, L. (2004). Social and motivational functioning is not critically dependent on feedback of autonomic responses: Neuropsychological evidence from patients with pure autonomic failure. 
Neuropsychologia, 42, 1979–1988. Hurliman, E., Nagode, J. C., & Pardo, J. V. (2005). Double dissociation of exteroceptive and interoceptive feedback systems in the orbital and ventromedial prefrontal cortex of humans. Journal of Neuroscience, 25, 4641–4648. James, W. (1884). What is an emotion? Mind, 9, 188–205.
Critchley, H., Wiens, S., Rotshtein, P., Ohman, A., & Dolan, R. (2004). Neural systems supporting interoceptive awareness. Nature Neuroscience, 7, 189–195.
Kringelbach, M. L., & Rolls, E. T. (2004). The functional neuroanatomy of the human orbitofrontal cortex: Evidence from neuroimaging and neuropsychology. Progress in Neurobiology, 72(5), 341–372.
Damasio, A. R. (1994). Descartes’ error: Emotion, reason, and the human brain. New York: Grosset/Putnam.
Lane, R., Reiman, E., Ahern, G., Schwartz, G., & Davidson, R. (1997). Neuroanatomical correlates of happiness, sadness, and disgust. American Journal of Psychiatry, 154, 926–933.
Damasio, A. R. (1999). The feeling of what happens: Body and emotion in the making of consciousness. New York: Harcourt. Damasio, A. R. (2003). Looking for Spinoza: Joy, sorrow, and the feeling brain. New York: Harcourt. Damasio, A. R., Grabowski, T. G., Bechara, A., Damasio, H., Ponto, L. L. B., Parvizi, J., et al. (2000). Subcortical and cortical brain activity during the feeling of self-generated emotions. Nature Neuroscience, 3, 1049–1056.
Lang, P. J., Bradley, M. M., Cuthbert, B. N., & Patrick, C. J. (1993). Emotion and psychopathology: A startle probe analysis. In L. J. Chapman, J. P. Chapman, & D. C. Fowles (Eds.), Experimental personality and psychopathology research (Vol. 16, pp. 163–199). New York: Spring Publishing. LeDoux, J. (1996). The emotional brain: The mysterious underpinnings of emotional life. New York: Simon & Schuster.
Damasio, A. R., Tranel, D., & Damasio, H. (1990). Individuals with sociopathic behavior caused by frontal damage fail to respond autonomically to social stimuli. Behavioral Brain Research, 41, 81–94.
Luria, A. R., Pribram, K. H., & Homskaya, E. D. (1964). An experimental analysis of the behavioral disturbance produced by a left frontal arachnoidal endothelioma (meningioma). Neuropsychologia, 2, 257–280.
Damasio, A. R., Tranel, D., & Damasio, H. (1991). Somatic markers and the guidance of behavior: Theory and preliminary testing. In H. S. Levin, H. M. Eisenberg, & A. L. Benton (Eds.), Frontal lobe function and dysfunction (pp. 217–229). New York: Oxford University Press.
Maia, T. V., & McClelland, J. L. (2004). A reexamination of the evidence for the somatic marker hypothesis: What participants really know in the Iowa Gambling Task. Proceedings of the National Academy of Sciences, USA, 101, 16075–16080.
Damasio, H., Grabowski, T., Frank, R., Galburda, A. M., & Damasio, A. R. (1994, May 20). The return of Phineas Gage: Clues about the brain from the skull of a famous patient. Science, 264, 1102–1104.
Maia, T. V., & McClelland, J. L. (2005). The somatic marker hypothesis: Still many questions but no answers: Response to Bechara et al. Trends in Cognitive Sciences, 9, 162–164.
c38.indd 760
8/17/09 3:07:34 PM
References 761 Martin, C., Denburg, N., Tranel, D., Granner, M., & Bechara, A. (2004). The effects of vagal nerve stimulation on decision-making. Cortex, 40, 1–8. Nagai, Y., Critchley, H. D., Featherstone, E., Trimble, M. R., & Dolan, R. J. (2004). Activity in ventromedial prefrontal cortex covaries with sympathetic skin conductance level: A physiological account of a “default mode” of brain function. NeuroImage, 22, 243–251. Nauta, W. J. H. (1971). The problem of the frontal lobes: A reinterpretation. Journal of Psychiatric Research, 8, 167–187. Neafsey, E. J. (1990). Prefrontal cortical control of the autonomic nervous system: Anatomical and physiological observations. In H. B. M. Uylings, C. G. Van Eden, J. P. C. De Bruin, M. A. Corner, & M. G. P. Feenstra (Eds.), Progress in brain research (Vol. 85, pp. 147–166). New York: Elsevier. North, N. T., & O’Carroll, R. E. (2001). Decision making in patients with spinal cord damage: Afferent feedback and the somatic marker hypothesis. Neuropsychologia, 39, 521–524. Ongur, D., & Price, J. L. (2000). The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cerebral Cortex, 10, 206–219. Patterson, J. C., Ungerleider, L. G., & Bandettini, P. A. (2002). Task-independent functional brain activity correlation with skin conductance changes: An fMRI study. NeuroImage, 17, 1797–1806. Pears, A., Parkinson, J. A., Hopewell, L., Everitt, B. J., & Roberts, A. C. (2003). Lesions of the orbitofrontal but not medial prefrontal cortex disrupt conditioned reinforcement in primates. Journal of Neuroscience, 23, 11189–11201.
c38.indd 761
Rahman, S., Sahakian, B. J., Cardinal, R. N., Rogers, R. D., & Robbins, T. W. (2001). Decision making and neuropsychiatry. Trends in Cognitive Sciences, 6, 271–277. Raichle, M. E., MacLeod, A. M., Snyder, A. Z., Powers, W. J., Gusnard, D. A., & Shulman, G. L. (2001). A default mode of brain function. Proceedings of the National Academy of Sciences, USA, 98, 676–682. Rainville, P., Bechara, A., Naqvi, N., Virasith, A., Bilodeau, M., & Damasio, A. R. (2006). Basic emotions are associated with distinct patterns of cardiorespiratory activity. International Journal of Psychophysiology, 61, 5–18. Rolls, E. T. (1999). The brain and emotion. Oxford: Oxford University Press. Rolls, E. T. (2000). The orbitofrontal cortex and reward. Cerebral Cortex, 10, 284–294. Sanfey, A., Hastie, R., Colvin, M., & Grafman, J. (2003). Phineas Gage: Decision-making and the human prefrontal cortex. Neuropsychologia, 41, 1218–1229. Saver, J. L., & Damasio, A. R. (1991). Preserved access and processing of social knowledge in a patient with acquired sociopathy due to ventromedial frontal damage. Neuropsychologia, 29, 1241–1249. Tomb, I., Hauser, M., Deldin, P., & Caramazza, A. (2002). Do somatic markers mediate decisions on the gambling task? Nature Neuroscience, 5, 1103–1104. Tranel, D., & Damasio, H. (1994). Neuroanatomical correlates of electrodermal skin conductance responses. Psychophysiology, 31, 427–438. Welt, L. (1888). Uber charaktervaranderungen des menschen infoldge von lasionen des stirnhirns. Dutsch Archives of Klinical Medicine, 42, 339–390.
8/17/09 3:07:34 PM
Chapter 39
Neural Basis of Fear Conditioning
DAVID E. A. BUSH, GLENN E. SCHAFE, AND JOSEPH E. LEDOUX
Emotion affects behavior in numerous ways. Emotion increases arousal, triggers hormonal stress responses, and alters motivation. Emotion can also alter how well something is learned and the strength of memory storage. But it is important to keep in mind that for these things to occur there needs to be a mechanism that assesses whether a given stimulus should trigger an emotional response in the first place. The problem is not so difficult for biologically significant unconditioned stimuli, such as an electric shock or the taste of a sweet food, because these stimuli are already hardwired to produce emotional responses. However, in everyday life we have emotional responses to all sorts of stimuli that are not innately aversive or appetitive. This is important because organisms need to respond to stimuli that predict dangerous or desirable events. Emotional stimuli thus serve as guides to direct behavior in adaptive directions. Novel stimuli acquire emotional significance through emotional learning. Emotional learning shares many characteristics with other forms of learning, especially at cellular and molecular levels, but there are also important differences. These differences exist essentially because of the distinct brain circuitry that underlies the interpretation and encoding of emotional information: the inputs, outputs, and processing resources in between. Fear conditioning is the best studied example of emotional learning. In this chapter we therefore focus on our current understanding of fear conditioning. However, it is important to keep in mind that fear is just one example of emotion. Emotional learning can also involve stimuli that predict positive or appetitive emotional properties. As we will see, the amygdala has emerged as a structure that is crucial for fear learning, but the role of the amygdala in appetitive emotional memories is still not well understood.
Although studies using appetitive conditioning in animals and humans find involvement of the amygdala, other studies that have examined humans with amygdala damage have found a selective role of the amygdala in negative emotion (Berntson, Bechara, Damasio, Tranel, & Cacioppo, 2007). For an in-depth survey of aspects of emotional learning related to positive or appetitive conditioning, see reviews by Balleine and Dickinson (1998), Cardinal, Parkinson, Hall, and Everitt (2002), and Holland and Gallagher (2004).

AN OVERVIEW OF FEAR CONDITIONING

In fear conditioning, the subject, typically a rat, is placed in an experimental chamber and given paired presentations of an innocuous conditioned stimulus (CS), such as a tone, together with an aversive unconditioned stimulus (US), such as a brief footshock. The CS does not elicit defensive behavior before fear conditioning, but after even a single CS-US pairing the animal begins to exhibit a range of conditioned responses (CRs), both to the CS and to the context (i.e., the conditioning chamber) in which conditioning occurs. In rats these responses include freezing or immobility (a species-typical behavioral response that makes a rodent less easily detected by predators), autonomic and endocrine responses (such as changes in heart rate and blood pressure, defecation, and increased levels of circulating stress hormones), and the potentiation of reflexes, such as the acoustic startle response (Blanchard & Blanchard, 1969; Davis, Walker, & Lee, 1997; Kapp, Frysinger, Gallagher, & Haselton, 1979; LeDoux, Iwata, Cicchetti, & Reis, 1988; Roozendaal, Koolhaas, & Bohus, 1991; Smith, Astley, Devito, Stein, & Walsh, 1980). Thus, as the result of a simple associative pairing, the CS comes to elicit many of the same defensive responses that are elicited by naturally aversive or threatening stimuli (see Figure 39.1).
Similar responses occur in other mammals, including humans, allowing the fear conditioning procedure to be used to compare brain mechanisms across species.
This work was supported in part by National Institutes of Health grants MH 46516, MH 00956, MH 39774, and MH 11902, and a grant from the W. M. Keck Foundation to New York University.
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
[Figure 39.1 graphic not reproduced. Panel A: timeline of the conditioned stimulus (CS; tone or light) and the unconditioned stimulus (US; foot shock). Panel B: threatening stimuli (a natural threat or a fear CS) provide inputs to effector systems that produce fear responses: defensive behavior, autonomic arousal, hypoalgesia, reflex potentiation, and stress hormones.]
Figure 39.1 Pavlovian fear conditioning. Note: Fear conditioning involves the presentation of an initially innocuous stimulus, such as a tone (conditioned stimulus; CS), that is paired or associated with a noxious stimulus, such as a brief electric shock to the feet (unconditioned stimulus; US). Before conditioning, the CS elicits little response from the animal. After conditioning, the CS elicits a wide range of behavioral and physiological responses, including freezing, that are characteristically elicited by naturally aversive or threatening stimuli.
THE NEURAL CIRCUITRY OF FEAR CONDITIONING

The fear conditioning paradigm has enabled researchers to be systematic about studying the way that emotional stimuli are processed in the brain, and this research has implicated the amygdala as a region that is crucial for assessing fear CS inputs, and then coordinating outputs to brain regions that mediate fear responses. In the sections that follow we review how auditory CS information reaches the amygdala, how this information then flows through different amygdala subregions, and how amygdala outputs coordinate fear responses.

Input Pathways to the Amygdala

The neural circuitry underlying Pavlovian fear conditioning, particularly auditory fear conditioning, has been well characterized (Figure 39.2). Cells in the lateral amygdala (LA) receive excitatory, glutamatergic projections (Farb, Aoki, Milner, Kaneko, & LeDoux, 1992; LeDoux & Farb, 1991) from areas of the auditory thalamus, including the medial division of the thalamic medial geniculate body (MGm) and the posterior intralaminar nucleus (PIN), and also from the auditory cortex (area TE3; Bordi & LeDoux, 1992; Doron & LeDoux, 1999; LeDoux, Farb, & Romanski, 1991; LeDoux, Ruggerio, & Reis, 1985; McDonald, 1998; Romanski & LeDoux, 1993). Thus, there are two general routes: a direct subcortical route from the auditory thalamus (MGm/PIN) and a more indirect route that continues from the thalamus to the cortex before descending to the amygdala from cortical regions (e.g., TE3). Thalamic and cortical inputs to the
LA, both capable of mediating fear learning (Romanski & LeDoux, 1992b), are believed to carry different types of information to the LA. The thalamic route (often called the “low road”) is believed to be critical for rapidly transmitting crude aspects of the CS to the LA, whereas the cortical route (known as the “high road”) is believed to carry highly refined information to the amygdala (LeDoux, 2000). Interestingly, whereas pretraining lesions of MGm/PIN impair auditory fear conditioning (LeDoux, Iwata, Pearl, & Reis, 1986; LeDoux, Sakaguchi, & Reis, 1984), similar lesions of the auditory cortex do not (LeDoux et al., 1984; Romanski & LeDoux, 1992a). Thus, the thalamic pathway between the MGm/PIN and the LA appears to be particularly important for auditory fear conditioning. This is not to say that the cortical input to the LA is not involved. Electrophysiological responses of cells in the auditory cortex are modified during fear conditioning (Edeline, Pham, & Weinberger, 1993), and posttraining lesions of the insular cortex attenuate auditory fear conditioning (Brunzell & Kim, 2001), suggesting that cortical inputs to the LA contribute to fear memory in the intact brain. Indeed, when conditioning depends on the ability of the animal to make fine discriminations between different auditory CSs, or when the CS is a complex auditory cue, such as an ultrasonic vocalization, then cortical regions appear to be crucial (Jarrell, Gentile, Romanski, McCabe, & Schneiderman, 1987; Lindquist & Brown, 2004). Recent evidence suggests that thalamic and cortical inputs terminate on different dendritic sites and have different
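[Figure 39.2 graphic not reproduced. Panel A: the CS (tone) and US (shock) reach the LA; labeled structures include the auditory thalamus, the auditory cortex (TE1, TE3, PRh), and CPu. Panel B: within the amygdala, the LA, B, and CE are labeled, with outputs to the HPA axis, the ANS, and defensive behavior (PAG).]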
Figure 39.2 Anatomy of the fear system. Note: A: Auditory fear conditioning involves the transmission of CS sensory information from areas of the auditory thalamus and cortex to the lateral amygdala (LA), where it can converge with incoming somatosensory information from the foot shock US. It is in the LA that alterations in synaptic transmission are thought to encode key aspects of the learning. B: During fear expression the LA engages the central nucleus of the amygdala (CE), which projects widely to many areas of the forebrain and brain stem that control the expression of fear CRs, including freezing, hypothalamic-pituitary-adrenal axis activation, and alterations in cardiovascular activity.
cellular properties (Humeau & Lüthi, 2007). Nevertheless, the inputs from MGm/PIN and TE3 converge onto single cells in the LA (Li, Stutzmann, & LeDoux, 1996), and these same cells are also responsive to the footshock US (Romanski, Clugnet, Bordi, & LeDoux, 1993). Thus, individual cells in the LA are well suited to integrate information about the tone and shock during fear conditioning, which highlights the LA as a likely locus of the cellular events underlying fear acquisition. Consistent with this, behavioral studies have demonstrated that acquisition of auditory fear conditioning is disrupted both by conventional electrolytic or neurotoxic lesions of the LA and by reversible inactivation of LA cellular activity (Campeau & Davis, 1995; Helmstetter & Bellgowan, 1994; Kim, Rison, & Fanselow, 1993; LeDoux, Cicchetti, Xagoraris, & Romanski, 1990; Muller, Corodimas, Fridel, & LeDoux, 1997; Wilensky, Schafe, & LeDoux, 2000).

Output Pathways from the Amygdala

Although the LA is important for fear acquisition, its connections with other amygdaloid nuclei (Paré, Smith, & Paré, 1995; Pitkänen, Savander, & LeDoux, 1997), including the basal nucleus (B) and the central nucleus (CE), are essential for fear expression. During retrieval or expression of a fear memory, activation of the LA is thought to control CE activity through activation of B and/or the GABAergic intercalated cell masses situated along the lateral CE border (Paré, Quirk, & LeDoux, 2004). Auditory fear conditioning is disrupted by damage confined only to the LA and CE (Amorapanth, LeDoux, & Nader, 2000; Nader, Majidishad, Amorapanth, & LeDoux, 2001), suggesting that communication between the LA and the CE is sufficient to mediate fear conditioning. The connectivity of the CE with downstream brain regions is consistent with the traditional view that it serves as a principal output nucleus of the fear learning system.
The CE projects to areas of the forebrain, the hypothalamus, and the brain stem, regions that control behavioral, endocrine, and autonomic CRs associated with fear learning (Davis, 1997; Davis et al., 1997; Kapp, Frysinger, Gallagher, & Haselton, 1979; LeDoux et al., 1988; Roozendaal et al., 1991). Projections from the CE to the midbrain periaqueductal gray, for example, have been shown to be particularly important for mediating behavioral and endocrine responses such as freezing and hypoalgesia (De Oca, DeCola, Maren, & Fanselow, 1998; Helmstetter & Landeira-Fernandez, 1990; Helmstetter & Tershner, 1994; LeDoux et al., 1988), and projections to the lateral hypothalamus have been implicated in the control of conditioned cardiovascular responses (Iwata, LeDoux, & Reis, 1986; LeDoux et al., 1988). Importantly, whereas lesions of these individual areas can selectively impair expression of individual
CRs, damage to the CE interferes with the expression of all fear CRs (LeDoux, 2000). Thus, the CE is typically thought of as the principal output nucleus of the fear system that acts to orchestrate the collection of hardwired, and typically species-specific, responses that underlie defensive behavior. Pretraining electrolytic lesions of the B, unlike lesions of the LA and the CE, do not disrupt fear conditioning, which suggests that the B is not essential for fear conditioning (Amorapanth et al., 2000). However, one study observed deficits in auditory and contextual fear conditioning when pretraining lesions were localized to the anterior, but not posterior, divisions of the B (Goosens & Maren, 2001). Thus, although not essential, projections from the LA to the B may be important under some circumstances. In support of this, if B lesions occur after fear conditioning rather than before, fear memory expression is impaired (Anglada-Figueroa & Quirk, 2005), which indicates that the B participates in fear memory when it is intact at the time of conditioning. Interestingly, the B is also important for mediating more complex responses to fear stimuli, such as the performance of instrumental responses that actively avoid or escape a threatening stimulus (Amorapanth et al., 2000). We will return to this topic in a later section.

AMYGDALA SYNAPTIC PLASTICITY AND FEAR CONDITIONING

Synaptic plasticity is believed to be a neural mechanism that underlies learning, as was discussed in detail in Chapter 1. Considerable evidence shows that synapses in the LA are plastic, which could enable the LA to store memories for fear conditioning by altering connections between converging CS and US inputs. Unlike most other brain regions, synaptic plasticity in the amygdala has been directly related to learning. Consequently, this research has facilitated research not only on fear and emotion, but also on learning and memory.
Synaptic Plasticity in the Lateral Amygdala Induced by Fear Conditioning

Individual cells in the LA alter their response properties after a CS and US are paired during fear conditioning. The LA cells that are only weakly responsive to auditory input prior to conditioning will respond vigorously to the same input after fear conditioning (Goosens, Hobin, & Maren, 2003; Goosens & Maren, 2004; Maren, 2000; Quirk, Armony, & LeDoux, 1997; Quirk, Repa, & LeDoux, 1995). Thus, as a consequence of the training, a change occurs in the response of LA cells to the auditory CS, which is consistent with the view that neural plasticity in the LA encodes key aspects of fear learning and memory storage (Blair, Schafe, Bauer, Rodrigues, & LeDoux, 2001; Fanselow & LeDoux, 1999;
Maren, 1999; Quirk, Armony, Repa, Li, & LeDoux, 1997; for review, see Maren & Quirk, 2004). Interestingly, single-unit studies have suggested that there are at least two populations of LA cells that undergo plastic changes during fear conditioning in unique ways (Repa et al., 2001). The first is a more dorsal population (near the border of the caudate/putamen) that shows enhanced firing to the CS in the initial stages of training and testing and is sensitive to fear extinction (see Figure 39.3; Repa et al., 2001). These so-called transiently plastic cells exhibit short-latency changes (within 10 to 15 ms after tone onset). These short latencies are consistent with a rapid, monosynaptic thalamic input. The second population of LA cells occupies a more ventral position. In contrast to the transiently plastic cells, the more ventral cells exhibit enhanced firing to the CS throughout training and testing and do not appear to be sensitive to
extinction. Further, these long-term plastic cells exhibit longer latencies (within 30 to 40 ms after tone onset), which indicates a polysynaptic pathway. Thus, it has been hypothesized that a dorsal-to-ventral network of neurons within the LA is responsible for triggering and storing fear memories, respectively (Medina, Repa, & LeDoux, 2002; Radwanska, Nikolaev, Knapska, & Kaczmarek, 2002; Repa et al., 2001).

Long-Term Potentiation as a Mechanism for Lateral Amygdala Synaptic Plasticity Underlying Fear Conditioning

The change in the responsiveness of LA cells during fear conditioning suggests that alterations in excitatory transmission at LA synapses might be critical for fear conditioning. Many of the recent studies that have examined
[Figure 39.3 graphic not reproduced. Panel A: percent freezing for paired and unpaired groups across habituation, conditioning, and extinction. Panels B and C: responses of “transiently plastic” and “long-term plastic” cells, with axes labeled Z score and % Change, shown for habituation, conditioning, and extinction. Panel D: locations of transiently plastic “trigger cells” (dorsal LAd) and long-lasting plastic “storage cells” (ventral LAd); LAvm, LAvl, and B are also labeled.]
Figure 39.3 Plasticity in the LA during fear conditioning. Note: Pairing of CS and US during fear conditioning leads to changes in fear behavior (A) and also to changes in the responsiveness of single LA cells to auditory stimuli. During fear conditioning there are two populations of cells that undergo plastic change. B: Transiently plastic cells are generally short latency and show enhanced firing shortly after training and during the initial phases of extinction, but not at other times. C: Long-term plastic cells are generally longer latency and show enhanced firing throughout training and extinction. D: Transiently plastic cells are generally found in the dorsal tip of the lateral amygdala (LAd), where they may serve to trigger the initial stages of memory formation. Long-term plastic cells, on the other hand, are found in the ventral regions of the LAd and may be important for long-term, extinction-resistant memory storage. From “Two Different Lateral Amygdala Cell Populations Contribute to the Initiation and Storage of Memory,” by Repa et al., 2001, Nature Neuroscience, 4, pp. 724–731. Adapted with permission.
the biochemical basis of fear conditioning have drawn on a larger literature that has focused on the biochemical events that underlie long-term potentiation (LTP), an activity-dependent form of synaptic plasticity that was initially discovered in the hippocampus (Bliss & Lømo, 1973). Importantly, LTP has also been demonstrated, both in vivo and in vitro, in each of the major auditory input pathways to the LA, including the thalamic and cortical auditory pathways (Chapman, Kairiss, Keenan, & Brown, 1990; Clugnet & LeDoux, 1989; Huang & Kandel, 1998; Rogan & LeDoux, 1995; Weisskopf, Bauer, & LeDoux, 1999; Weisskopf & LeDoux, 1999). This includes tetanus-induced LTP, which appears to
depend on activation of the glutamatergic NMDA receptor (Bauer, Schafe, & LeDoux, 2002; Huang & Kandel, 1998), and also associative LTP, which is induced following pairing of subthreshold presynaptic auditory inputs with postsynaptic depolarizations of LA cells (Bauer et al., 2002; Huang & Kandel, 1998; Weisskopf et al., 1999). Unlike LTP induced by a tetanus, associative LTP in the LA is dependent on L-type voltage-gated calcium channels (VGCCs; Bauer et al., 2002; Humeau & Lüthi, 2007; Weisskopf et al., 1999). A number of findings have converged to support the hypothesis that fear conditioning is mediated by an associative LTP-like process in the LA (see Figure 39.4). First,
[Figure 39.4 graphic not reproduced. Panel A: electrically evoked and auditory evoked potentials before and after LTP induction. Panel B: percent change from baseline in the evoked response, and freezing (sec), for paired and unpaired groups before, during, and after training. Panel C: slice configuration showing “cortical” and “auditory thalamic” stimulation sites and a recording site; labels include EC, IC, CE, LA, B, and OT. Panel D: EPSP slope (%) over time (min) for the thalamic/paired and cortical/unpaired pathways, with the period of repeated pairing marked.]
Figure 39.4 LTP in the LA. Note: A: (top) LTP is induced in the LA following high-frequency electrical stimulation of the MGm/PIN. The trace represents a stimulation-evoked field potential in the LA before and after LTP induction. (bottom) Following artificial LTP induction, processing of naturalistic auditory stimuli is also enhanced in the LA. The trace represents an auditory-evoked field potential in the LA before and after LTP induction. B: (top) Fear conditioning leads to electrophysiological changes in the LA in a manner similar to LTP. The figure represents a percentage change in the slope of the auditory-evoked field potential in the LA before, during, and after conditioning in both paired and unpaired rats. (bottom) Freezing behavior across training and testing periods. Note that both paired and unpaired groups show equivalent freezing behavior during training, but only the paired group shows an enhanced neural response. C: Associative LTP is induced in the amygdala slice by pairing trains of presynaptic stimulation of fibers coming from the auditory thalamus with depolarization of LA cells. Stimulation of fibers coming from cortical areas serves as a control for input specificity. D: LTP induced by pairing as measured by the change in the slope of the excitatory postsynaptic potential (EPSP) over time. In this case, the thalamic pathway received paired stimulation, whereas the cortical pathway received unpaired stimulation (i.e., trains and depolarizations, but in a noncontingent manner). The black bar represents the duration of the pairing.
LTP induction at thalamic inputs to the LA has been shown to enhance auditory processing, and thus natural information flow within the LA (Rogan & LeDoux, 1995). Second, fear conditioning has been shown to lead to electrophysiological changes in the LA in a manner that is very similar to those observed following artificial LTP induction, and these changes persist over days (McKernan & Shinnick-Gallagher, 1997; Rogan, Staubli, & LeDoux, 1997). Third, associative LTP in the LA has been shown to be sensitive to the same contingencies as fear conditioning. That is, LTP is strong when presynaptic trains precede the onset of postsynaptic depolarizations 100% of the time. However, LTP is much weaker if noncontingent depolarizations of the postsynaptic LA cell are interleaved within the same number of contiguous pairings (Bauer, LeDoux, & Nader, 2001). Thus, the LTP-induced change in synaptic efficacy within the LA depends on the contingency between pre- and postsynaptic activity rather than simply on temporal contiguity. Importantly, it is contingency, rather than temporal pairing, that is known to be critical for associative learning, including fear conditioning (Rescorla, 1968). Fourth, fear conditioning and LTP induction have been characterized by a common pharmacological and biochemical substrate. Fear conditioning, for example, has been shown to be impaired by pharmacological blockade of both NMDA receptors (Kim, DeCola, Landeira-Fernandez, & Fanselow, 1991; Miserendino, Sananes, Melia, & Davis, 1990; Rodrigues, Schafe, & LeDoux, 2001) and L-type VGCCs (Bauer et al., 2002) in the amygdala. Training-induced elevations in Ca2+ through both NMDA and L-type VGCCs in the LA appear to set in motion a process that is essential for both synaptic plasticity and fear memory formation, and this process appears to share essential features with that underlying LTP in the hippocampus and in other systems.
Recent studies, for example, have demonstrated the involvement of Ca2+-regulated intracellular signaling cascades, including protein kinase A (PKA) and the mitogen-activated protein kinase (MAPK), in synaptic plasticity and fear memory consolidation. Each of these signaling cascades is thought to promote long-term synaptic plasticity and memory formation, in part, by activating transcription factors in the nucleus, including the cyclic adenosine monophosphate (cAMP)-response element binding (CREB) protein. In turn, CREB and cAMP-response element (CRE)-mediated transcription is thought to promote the long-term structural and functional changes underlying memory formation. Many of these recent studies have used molecular genetic methods in which the molecules of interest have been manipulated in knockout or transgenic mouse lines (Abel et al., 1997; Bourtchuladze et al., 1994; Brambilla et al., 1997). Other recent studies have used pharmacological or viral transfection methods to
examine the involvement of these molecules specifically in the amygdala. For example, recent studies have shown that infusions of drugs into the LA that specifically block RNA or protein synthesis or PKA activity impair the formation of fear memories (Bailey, Kim, Sun, Thompson, & Helmstetter, 1999; Schafe & LeDoux, 2000). Further, extracellular signal-regulated kinase (ERK)/MAPK is activated in the LA following fear conditioning, and pharmacological blockade of this activation via localized infusions of ERK/MAPK inhibitors impairs fear conditioning (Schafe et al., 2000). Another recent study has shown that overexpression of the transcription factor CREB in the LA facilitates formation of fear memories (Josselyn et al., 2001). Thus, as shown in Figure 39.5 (Rodrigues, Schafe, & LeDoux, 2004; Huang, Martin, & Kandel, 2000), similar biochemical signaling pathways and molecular events that are involved in amygdala LTP are also necessary for fear conditioning. Most studies have emphasized the role of postsynaptic processes in fear conditioning. However, recent studies suggest that LTP at LA synapses may involve pre- as well as postsynaptic mechanisms (Apergis-Schoute, Debiec, Doyère, LeDoux, & Schafe, 2005; Humeau, Shaban, Bissière, & Lüthi, 2003; Schafe et al., 2005).
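The distinction between contingency and contiguity drawn in this section can be made concrete with a small calculation. The sketch below uses the delta-P index, P(US | CS) - P(US | no CS), which is one common way to formalize contingency; the chapter itself does not specify a formula. The function name `delta_p`, the time-bin representation, and the event counts are illustrative assumptions, not data from the cited studies. In the LTP analogy, the presynaptic train plays the role of the CS and the postsynaptic depolarization the role of the US.

```python
def delta_p(bins):
    """Return P(US | CS) - P(US | no CS) for a list of time bins.

    Each bin is a (cs_present, us_present) pair of booleans.
    This index is an assumed formalization of contingency, used
    here only for illustration.
    """
    cs_bins = [us for cs, us in bins if cs]
    no_cs_bins = [us for cs, us in bins if not cs]
    if not cs_bins or not no_cs_bins:
        raise ValueError("need at least one bin with and one without the CS")
    p_us_given_cs = sum(cs_bins) / len(cs_bins)
    p_us_given_no_cs = sum(no_cs_bins) / len(no_cs_bins)
    return p_us_given_cs - p_us_given_no_cs


# Hypothetical schedule 1: ten CS-US pairings, ten empty bins.
contingent = [(True, True)] * 10 + [(False, False)] * 10

# Hypothetical schedule 2: the same ten pairings, but the US also
# occurs alone in five of the ten bins without the CS.
degraded = (
    [(True, True)] * 10 + [(False, True)] * 5 + [(False, False)] * 5
)

print(delta_p(contingent))  # 1.0
print(delta_p(degraded))    # 0.5
```

Both hypothetical schedules contain the same ten contiguous pairings; only the unpaired US events differ. That is the kind of manipulation that, in the account above, weakens both fear conditioning and associative LTP even though temporal contiguity is unchanged.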
BEYOND THE SIMPLE FEAR CONDITIONING CIRCUIT
Fear conditioning to a discrete cue is probably the simplest form of emotional learning, but most emotions are far more complex. Stimuli that trigger fear responses can involve much more than a pure tone. Also, regardless of the trigger stimulus, emotions can lead to a wide range of responses beyond the initial fear reaction, including active responses that help the animal cope with the stimulus, as well as other mechanisms that help to reduce the intensity of fear reactions. In the sections that follow we first examine neural circuitry that is thought to underlie the processing of more complex stimuli, and then consider how established fear memories can be modified and diminished.
Contextual Fear Conditioning
In a typical auditory fear conditioning experiment, the animal learns to fear not only the footshock-paired tone, but also the context in which conditioning occurs. Contextual fear is also learned when footshock stimuli are presented in the absence of a discrete CS. With contextual fear conditioning, fear to the context is later measured by returning the rat to the conditioning chamber on the test day and measuring the CR, including freezing behavior (Blanchard, Dielman, & Blanchard, 1968; Fanselow, 1980).
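The freezing measure described above is typically reported as a percentage, as in the "Percent Freezing" axes of the figures in this chapter. The following is a minimal sketch of how such a score could be tabulated, assuming a time-sampling procedure in which an observer records at fixed intervals whether the animal is freezing. The function name, the sampling interval, and the sample data are hypothetical illustrations and are not taken from the cited studies.

```python
from typing import Sequence


def percent_freezing(observations: Sequence[bool]) -> float:
    """Return the percentage of sampled observations scored as freezing.

    Each element is True if the animal was judged to be freezing at that
    sampling instant and False otherwise (hypothetical scoring scheme).
    """
    if not observations:
        raise ValueError("at least one observation is required")
    # True counts as 1 and False as 0, so sum() gives the number of freezing samples.
    return 100.0 * sum(observations) / len(observations)


# Hypothetical context test: one observation every 8 s for 64 s (8 samples).
context_test = [True, True, False, True, True, True, False, True]
print(percent_freezing(context_test))  # 75.0
```

Under these assumptions, a group-level comparison (for example, lesioned versus sham animals) would simply average this score across the animals in each group.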
[Figure 39.5 appears here. Panel A: schematic of a postsynaptic LA neuron showing acquisition (initial induction mechanisms: receptors, channels, and activation of molecules at the synapse) followed by consolidation (downstream cascades: second messenger systems and signal transduction pathways; macromolecular synthesis: RNA and protein synthesis and structural remodeling). Panel B: percent freezing at STM and LTM tests after a protein synthesis inhibitor, a PKA inhibitor, or a MAPK inhibitor. Panel C: EPSP slope (%) plotted against time (min).]
Figure 39.5 Molecular pathways underlying fear conditioning. Note: A: Illustration of the molecular pathways within cells of the LA that are needed for the acquisition and consolidation of fear conditioning and also for LA LTP. Both fear conditioning and LTP involve the release of glutamate and Ca2+ influx through either NMDA receptors or L-type VGCCs. The increase in intracellular Ca2+ leads to the activation of protein kinases, such as PKA and ERK/MAPK. Once activated, these kinases can translocate to the nucleus, where they activate transcription factors such as CREB. The activation of CREB by PKA and ERK/MAPK promotes CRE-mediated gene transcription and the synthesis of new proteins. From Figure 2, page 85 in “Molecular mechanisms underlying emotional learning and memory in the lateral amygdala,” by Rodrigues et al., 2004, Neuron, 44, pp. 75–91. Adapted with permission. B: Disruption of these molecular pathways in the LA interferes with fear memory formation. In these studies, rats received intra-amygdala infusions of anisomycin (a protein synthesis inhibitor; B-Top), Rp-cAMPS (a PKA inhibitor; B-Middle), or U0126 (a MEK inhibitor, which is an upstream regulator of ERK/MAPK activation; B-Bottom) at or around the time of training and were assayed for both short-term memory (1 to 4 hours later) and long-term memory (24 hours later) of auditory fear conditioning. In each figure vehicle-treated rats are represented by the gray bars, and drug-treated animals are represented by the black bars. *p < .05 relative to vehicle controls. C: Amygdala LTP has been shown to require the same biochemical processes. In these studies amygdala slices were treated with either anisomycin (C-Top), KT5720 (a PKA inhibitor; C-Middle), or PD098059 (a MEK inhibitor; C-Bottom) prior to and during tetanus of the thalamic pathway. In each experiment field recordings were obtained from the LA and expressed across time as a percentage of baseline. From “Both Protein Kinase A and Mitogen-Activated Protein Kinase Are Required in the Amygdala for the Macromolecular Synthesis-Dependent Late Phase of Long-Term Potentiation,” by Huang et al., 2000, Journal of Neuroscience, 20, pp. 6317–6325. Copyright 2000 by the Society for Neuroscience. Reprinted with permission.
In comparison to auditory fear conditioning, much less is known about the neural systems underlying contextual fear. Substrates of contextual fear have been identified primarily through the use of lesion methods, and, as in auditory fear conditioning, the amygdala appears to play an essential role. For example, lesions of the amygdala, including the LA and B, have been shown to disrupt both the acquisition and the expression of contextual fear conditioning (Kim et al., 1993; Maren, 1998; Phillips & LeDoux, 1992). Reversible inactivation, usually achieved by microinjecting muscimol or tetrodotoxin into the LA, has similar effects (Muller et al., 1997). Contextual fear conditioning is also impaired by infusion of NMDA receptor antagonists, RNA and protein synthesis inhibitors,
and inhibitors of PKA into the amygdala (Bailey et al., 1999; Goosens, Holt, & Maren, 2000; Huff & Rudy, 2004; Kim et al., 1991; Rodrigues et al., 2001; but see Walker, Paschall, & Davis, 2005). Collectively these findings suggest that essential aspects of the memory are encoded and stored in the amygdala. At this time, however, there is little evidence that allows us to distinguish between the involvement of different amygdala subnuclei in contextual fear, although recent lesion evidence suggests that the LA and anterior B, but not the posterior regions of the B, are critical (Goosens & Maren, 2001). The CE is, of course, also essential for the expression of contextual fear, as it is for auditory fear conditioning (Goosens & Maren, 2001). However, although lesions of the CE disrupt both cued and contextual fear, lesion studies suggest that other brain regions, including the bed nucleus of the stria terminalis, appear to be required only for the expression of contextual (not cued) fear (Sullivan et al., 2004). The hippocampus has also been implicated in contextual fear conditioning, although its exact role has been difficult to define. A number of studies have shown that electrolytic and neurotoxic lesions of the hippocampus disrupt contextual, but not auditory, fear conditioning (see Figure 39.6, Kim & Fanselow, 1992; Kim et al., 1993; Maren, Aharonov, & Fanselow, 1997; Phillips & LeDoux, 1992). However, only lesions given shortly after training disrupt contextual fear conditioning (Frankland, Cestari,
Filipkowski, McDonald, & Silva, 1998). If rats are given hippocampal lesions 28 days after training, there is no memory impairment (Kim & Fanselow, 1992). This “retrograde gradient” of recall suggests that hippocampal-dependent memories are gradually transferred over time to other regions of the brain for permanent storage, an idea that is consistent with the findings of hippocampal-dependent episodic memory research in humans (Milner, Squire, & Kandel, 1998). It is clear, however, that the hippocampus undergoes plastic changes during fear conditioning, some of which may be necessary for memory formation of contextual fear. For example, intrahippocampal infusion of the NMDA receptor antagonist APV impairs contextual fear conditioning (Stiedl, Birkenfeld, Palve, & Spiess, 2000; Young, Bohenek, & Fanselow, 1994). Also, fear conditioning to a context, but not to an auditory CS, is impaired in mice that lack the NR1 subunit of the NMDA receptor exclusively in area CA1 of the hippocampus (Rampon et al., 2000). Fear conditioning also leads to increases in the activation of Calmodulin-dependent Protein Kinase II (CaMKII), PKC, ERK/MAPK, and CRE-mediated gene expression in the hippocampus (Atkins, Selcher, Petraitis, Trzaskos, & Sweatt, 1998; Hall, Thomas, & Everitt, 2000; Impey et al., 1998). These findings add support to the notion that NMDA receptor-dependent plastic changes in the hippocampus, in addition to the amygdala, are required for
[Figure 39.6 appears here. Panel A: timeline from training to hippocampal lesions given 1, 7, 14, or 28 days later. Panels B and C: % freezing plotted against day of lesion (1, 7, 14, 28) for sham and lesioned rats, one plot for contextual memory and one for auditory memory. Panel D: circuit model showing (1) the hippocampus as temporary storage (< 14 d), (2) the amygdala, where the CS and US converge to drive fear responses, and (3) the cortex as permanent storage (> 14 d).]
Figure 39.6 Hippocampal-dependent contextual fear. Note: Contextual fear conditioning requires the dorsal hippocampus, but only for a limited time. A: Experimental protocol from Kim and Fanselow (1992), where rats were trained with tone-shock pairings and then given lesions of the dorsal hippocampus either 1, 7, 14, or 28 days later. B: Contextual memory was impaired when lesions were given 1 day after training, but not if given 28 days after training. C: Auditory fear conditioning was not affected by hippocampal lesions. In each panel, the lesioned rats are represented by the black circles. From “Modality-Specific Retrograde Amnesia of Fear,” by J. J. Kim and M. S. Fanselow, Science, 256, May 1, 1992, p. 676. Copyright 1992 by the American Association for the Advancement of Science (AAAS). Reprinted with permission. D: A model of the neural system underlying contextual fear conditioning. The hippocampus (1) is necessary for forming an initial representation of the context and for providing that information as a CS to the amygdala (2) during fear conditioning. In the amygdala, the contextual CS can converge with the footshock US, and it is here that the memory of contextual fear is thought to be formed. Over time, however, the contextual memory formed by the hippocampus is transferred to the cortex (3) for permanent storage. At this point, the hippocampus is not necessary to retrieve the memory.
contextual fear conditioning. However, it should be emphasized that the exact contribution of these plastic changes to contextual fear conditioning remains unclear. Most of these studies cannot distinguish between a role for NMDA receptor-mediated plasticity in fear memory formation and in the formation of contextual representations that then serve as the fear CS (Rudy, Huff, & Matus-Amat, 2004). Further, regulation of intracellular signaling cascades in the hippocampus by fear conditioning, though potentially indicative of some type of memory storage, does not necessarily indicate that these changes are related to the acquisition of fear memories. They may be related to episodic memories of the training experience that are acquired at the same time as fearful memories (LeDoux, 2000; and see later discussion). Indeed, a number of studies have shown that hippocampal cells undergo plastic changes during and after fear conditioning (Doyère et al., 1995; Moita, Rosis, Zhou, LeDoux, & Blair, 2003), including auditory fear conditioning, which is spared following hippocampal lesions (Kim & Fanselow, 1992). Although auditory fear conditioning can be learned independently of the hippocampus, it has recently been shown that hippocampal involvement is recruited with weaker fear conditioning protocols that involve low footshock intensity and few training trials (Quinn, Wied, Ma, Tinsley, & Fanselow, 2008). Thus, the amygdala and hippocampus normally cooperate in the intact brain to store different components of the fear learning experience. The amygdala independently stores the direct association between the cue and the footshock, and the hippocampus stores more general features of the conditioning episode that can contribute to the fear memory.
Altering Established Fear Memories
Thus far we have focused on how a fear memory is formed. But what happens after a fear memory has been retrieved?
Two paradigms have been used to examine how fear memories change with retrieval: reconsolidation and extinction.
Reconsolidation Blockade
The traditional way of thinking about memory formation is that memories are laid down by a time-dependent process, called consolidation, that stabilizes the neuronal representation of the memory trace. Newly acquired memories, for example, are thought to be inherently unstable, acquiring stability only over time as RNA and protein synthesis-dependent processes kick in. According to this view, after the memory has been consolidated, retrieval simply involves going back and reactivating the original trace. However, over the years a number of studies have challenged this linear notion of memory formation and retrieval. In these studies, manipulations that are known to disrupt memory consolidation when given around the time of initial learning have also been found to disrupt the integrity of an established memory when given around the time of memory retrieval (see Sara, 2000). These findings suggest that the retrieval process renders a memory susceptible to disruption, similar to the susceptibility that exists prior to the consolidation of a newly formed memory. Recent studies using the fear conditioning paradigm have rekindled interest in this phenomenon. For example, infusion of the protein synthesis inhibitor anisomycin into the amygdala immediately after retrieval of an auditory fear memory was shown to impair memory retrieval on subsequent tests (Nader, Schafe, & LeDoux, 2000). This effect was clearly dependent on retrieval of the memory because no subsequent memory deficit was observed if the CS exposure was omitted. Further, the anisomycin-induced memory deficit was observed not only when the initial CS exposure was given shortly after training (i.e., 1 day), but also when the CS exposure was given 14 days after initial training, suggesting that the effect could not be attributable to disruption of late phases of protein synthesis necessary for consolidation. Thus, following active retrieval of a previously consolidated fear memory, that memory appears to undergo a second stabilization process (so-called reconsolidation) that requires protein synthesis in the amygdala. There is much that remains unknown about reconsolidation, but recent work has extended these findings by showing that amygdala CREB activation is also required for reconsolidation, because transient overexpression of a dominant negative isoform of CREB at the time of memory retrieval disrupts memory for both auditory and contextual fear conditioning (Kida et al., 2002), suggesting that a nuclear event is involved. Recent studies have demonstrated a similar role for activation of the ERK/MAPK pathway in the LA.
Interestingly, postretrieval blockade of ERK/MAPK signaling in the LA not only disrupts fear memory reconsolidation, but also reverses the fear retrieval-induced potentiation of LA field potentials that accompanies the fear response (Doyère, Debiec, Monfils, Schafe, & LeDoux, 2007). In essence, reconsolidation blockade is associated not only with disruption of the fear memory, but also with disruption of the potentiated electrophysiological response that is associated with the retrieved memory. The therapeutic implications of this ability to disrupt fear memory reconsolidation still need to be explored. Of course, protein synthesis inhibition, or even disruption of ERK/MAPK and CREB signaling, is not likely to be useful for reconsolidation-based fear reduction manipulations in the clinic. Therefore, based on reconsolidation experiments in rats with other kinds of memories (Sara, 2000), it has recently been shown that beta-adrenergic receptor antagonists (e.g., propranolol), which are already
used in humans for other purposes, can also disrupt fear memory reconsolidation (Debiec & LeDoux, 2004). This suggests that beta blockers, given in conjunction with the retrieval of traumatic memories, might be able to reduce the potency of fear-related pathologies, such as Posttraumatic Stress Disorder (Debiec & LeDoux, 2006).
Extinction
Extinction is a more traditional way of decreasing the potency of established fear memories. Extinction is a process whereby repeated presentations of the CS in the absence of the US lead to a weakening of the expression of conditioned responding. Unlike reconsolidation blockade, which is thought to disrupt the original memory trace, extinction involves the formation of a new inhibitory memory (i.e., a CS-No US trace) that competes with the original trace for control over behavior. Extinction of conditioned fear has been well documented in the behavioral literature, but we know comparatively little about its neurobiological substrate. However, research over the past 2 decades has implicated a circuit that involves complex interactions among the amygdala, the ventral medial prefrontal cortex (mPFC), and the hippocampus in fear extinction learning and retrieval.
Early studies showed that selective lesions of the ventral mPFC retard the extinction of fear to an auditory CS, while having no effect on initial fear acquisition (Morgan & LeDoux, 1995; Morgan, Romanski, & LeDoux, 1993; but see Gewirtz, Falls, & Davis, 1997). Further, neurons in the mPFC alter their response properties as a result of extinction (Garcia, Vouimba, Baudry, & Thompson, 1999; Herry, Vouimba, & Garcia, 1999). However, recent work suggests that the role of the ventral mPFC in extinction is complex. Briefly, the mPFC may not be necessary for the initial acquisition of fear extinction, but rather for the posttraining storage of information needed later to rapidly retrieve extinction learning under appropriate circumstances (see Figure 39.7; Quirk, Russo, Barron, & Lebron, 2000; Milad & Quirk, 2002). For example, rats with mPFC lesions are able to extinguish within a session but show impaired extinction retrieval when tested in a later session (Quirk et al., 2000). Further, neurons in the mPFC fire strongly to a tone CS after behavioral extinction has occurred, and artificial stimulation of the mPFC that resembles responding in an extinguished rat is sufficient to inhibit behavioral expression of fear in nonextinguished rats (Milad & Quirk, 2002). Consistent with this, blockade of mPFC NMDA receptors shortly after extinction
[Figure 39.7 appears here. Panel A: % freezing across habituation, conditioning, and extinction trials on Day 1 and Day 2 for sham and vmPFC-lesioned rats. Panel B: spikes plotted against time (s) around tone onset during habituation plus conditioning, extinction on Day 1, and extinction on Day 2. Panel C: % freezing across conditioning (Day 1), extinction trial blocks (Day 2), and test (Day 3), with and without IL stimulation.]
Figure 39.7 The role of the medial prefrontal cortex (mPFC) in long-term retention of fear extinction. Note: A: Rats with lesions of the mPFC can acquire and extinguish auditory fear conditioning normally (Day 1). However, they cannot retain their memory for extinction (Day 2; 24 hours later). In each panel, the lesioned animals are represented by the black circles. From “The Role of Ventromedial Prefrontal Cortex in the Recovery of Extinguished Fear,” by G. J. Quirk, G. K. Russo, J. L. Barron, and K. Lebron, 2000, Journal of Neuroscience, 20, p. 6227. Copyright 2000 by the Society for Neuroscience. Adapted with permission. B: Single cells in the mPFC are generally unresponsive to tones during training and extinction (Day 1), but signal vigorously during long-term recall of extinction (Day 2; 24 hours later). C: Direct stimulation of the mPFC during the early phases of extinction (Day 2) results in a dramatic reduction in fear, which is long-lasting (Day 3; 24 hours later). In each figure the stimulated animals are represented by the black squares. From “Neurons in Medial Prefrontal Cortex Signal Memory for Fear Extinction,” by M. R. Milad and G. J. Quirk, November 7, 2002, Nature, 420, pp. 70–74. Copyright 2002 by Macmillan Publishers Ltd. Adapted with permission.
training disrupts the subsequent retrieval of the extinction memory (Burgos-Robles, Vidal-Gonzalez, Santini, & Quirk, 2007), suggesting that cells in the mPFC store features of the fear extinction experience after training is complete. Importantly, extinction is known to be context-specific. That is, if a fear-conditioned rat is given fear extinction training in one context (Context A), the ability to inhibit fear is apparently linked to that context because renewal of the fear response occurs if the rat is presented with the CS outside of the extinction context (Bouton & Ricker, 1994). This fact, together with the finding that fully extinguished memories are capable of reinstating upon presentation of the US (Rescorla & Heth, 1975), has led to the long-held view that extinction does not result in the erasure of the original memory trace but is instead a new kind of learning that serves to inhibit expression of the old memory (Pavlov, 1927). Not surprisingly, recent studies have indicated that the hippocampus plays an important role in the contextual modulation of fear extinction. Maren and colleagues (Hobin, Goosens, & Maren, 2003), for example, have shown that training-induced neurophysiological responses in the LA readily extinguish within a fear extinction session, but that this neural representation of extinction, like the behavior itself, is specific to the context in which extinction has taken place. Further, functional inactivation of the hippocampus using the GABA-A agonist muscimol can impair the context-specific expression of fear extinction (Corcoran & Maren, 2001). The requirements for both hippocampal and mPFC activity in extinction suggest that connections from the hippocampus to the mPFC are important for encoding contextual constraints on fear extinction learning.
Beyond this, these findings have led researchers to propose a broad circuit model for fear extinction that involves projections from the hippocampus to the mPFC, and from the mPFC to the amygdala. The hippocampal-mPFC connection is needed to appropriately contextualize extinction, and the mPFC-amygdala connection is needed to express extinction by inhibiting fear outputs from the fear circuitry of the amygdala (LA/B–intercalated cell masses–CE) that was discussed earlier in this chapter (Corcoran & Quirk, 2007; Hobin et al., 2003; Maren & Quirk, 2004; Paré et al., 2004; Quirk & Mueller, 2008; Sotres-Bayon, Bush, & LeDoux, 2004).
Figure 39.8 The amygdala and fear extinction. Note: Extinction of fear-potentiated startle (FPS) can be impaired or facilitated by pharmacological manipulations of the amygdala. A: Extinction of FPS is impaired in a dose-dependent manner following infusion of AP5, an NMDA receptor antagonist, into the amygdala. White bars represent preextinction startle baselines; black bars represent the amount of startle potentiation after an extinction session in each group. Note that with increasing doses there is less extinction. From “Extinction of Fear-Potentiated Startle: Blockade by Infusion of an NMDA Antagonist into the Amygdala,” by W. A. Falls, M. J. Miserendino, and M. Davis, 1992, Journal of Neuroscience, 12, pp. 854–863. Copyright 1992 by the Society for Neuroscience. Adapted with permission. B: Extinction of FPS can be facilitated by infusion of a partial agonist of the NMDA receptor in the amygdala. Rats that were given intra-amygdala infusions of D-cycloserine (DCS; DCS/saline), a partial agonist of the glycine recognition site of the NMDA receptor, had facilitated extinction relative to controls (saline/saline). This effect could be reversed by HA966 (DCS/HA966), an antagonist of the glycine recognition site that has no effect on extinction itself (saline/HA966). In each group white bars represent preextinction startle baselines, and black bars represent the amount of startle potentiation after drug treatment and an extinction session. From “Facilitation of Conditioned Fear Extinction by Systemic Administration or Intra-Amygdala Infusions of D-Cycloserine as Assessed with Fear-Potentiated Startle in Rats,” by D. L. Walker, K. J. Ressler, K.-T. Lu, and M. Davis, 2002, Journal of Neuroscience, 22, pp. 2343–2351. Copyright 2002 by the Society for Neuroscience. Adapted with permission. C: Intra-amygdala infusion of a MAP kinase inhibitor (PD98059) blocks extinction of FPS. Rats were infused with PD98059 before the 1st extinction session. The 2nd extinction session was given drug-free. Note the absence of extinction on the 1st session. In each group, black bars represent preextinction startle baselines, and white bars represent the amount of startle potentiation after an extinction session. From “Mitogen-Activated Protein Kinase Cascade in the Basolateral Nucleus of Amygdala Is Involved in Extinction of Fear-Potentiated Startle,” by K. T. Lu, D. L. Walker, and M. Davis, 2001, Journal of Neuroscience, 21, RC162. Copyright 2001 by the Society for Neuroscience. Adapted with permission.
These advances in our understanding of hippocampal and mPFC control over extinction retrieval have been important steps for the field. But this model does not answer questions about the neural mechanisms that underlie the actual formation of the extinction memory. Insights into this problem come from a number of studies that have implicated the amygdala as an essential site of plasticity for the acquisition of fear extinction (Figure 39.8; Falls, Miserendino, & Davis, 1992; Lu, Walker, & Davis, 2001; Walker, Ressler, Lu, & Davis, 2002). Infusions of NMDA receptor antagonists or ERK/MAPK inhibitors into the amygdala have been shown to impair fear extinction (Davis, 2002; Falls et al., 1992; Lu et al., 2001). Conversely, both systemic and intra-amygdala infusions of partial agonists of the NMDA receptor facilitate fear extinction (Walker et al., 2002). These experiments suggest that some type of activity-dependent synaptic plasticity must take place in the amygdala during extinction learning, as it does during initial learning. In fact, unlike the mPFC, the amygdala appears to be necessary for the acquisition of fear extinction because blockade of NR2B-containing NMDA receptors in the LA prevents the fear extinction learning that normally occurs across trials within a single extinction training session (Sotres-Bayon, Bush, & LeDoux, 2007). In contrast, disruption of BDNF-TrkB signaling with viral vector-mediated amygdala expression of dominant-negative TrkB was found to disrupt the consolidation, but not the acquisition, of fear extinction, suggesting that BDNF participates in the consolidation of the extinction memory within the intrinsic circuitry of the amygdala (Chhatwal, Stanek-Rattiner, Davis, & Ressler, 2006).
When considered together with the systems-level circuit discussed earlier, these findings suggest that fear extinction is first encoded in the amygdala during extinction training, and subsequently the amygdala trains the hippocampal-mPFC circuit so that the extinction memory can be later retrieved under contextually appropriate circumstances. The mechanisms for this systems-level consolidation process are not yet understood but may involve an amygdala-driven rehearsal of the extinction training experience via reciprocal connections from the amygdala to the hippocampus and mPFC.
Fear-Motivated Instrumental Learning
Reconsolidation blockade and extinction both represent mechanisms for diminishing the intensity of fear memories. Active coping is yet another mechanism for reducing the behavioral and emotional impact of fear. Pavlovian fear conditioning is useful for learning to detect a dangerous object or situation, but animals must also be able to use this information to guide ongoing behavior that is instrumental in avoiding that danger. Successful avoidance, made possible by Pavlovian associations that provide advance warning of danger, is therefore a potentially positive (and behaviorally reinforcing) outcome following CS exposure. In experimental situations this type of learning can be modeled by requiring the animal to make a response (e.g., move away, press a bar, turn a wheel) that will allow it to avoid presentation of a shock or danger signal, a form of learning known as “active avoidance.” In other experimental situations, the animal can be required to learn not to respond, known as “passive avoidance.” Both of these are examples of instrumental conditioning, and the amygdala, cooperating with other brain regions, plays a vital role in each. Previously, we mentioned that only the LA and CE were critical for Pavlovian fear conditioning. However, we have recently begun to appreciate the significance of projections from the LA to the basal nucleus of the amygdala (B, as defined earlier). Studies that employ fear learning tasks that require rats to learn both classical and instrumental components have begun to develop our knowledge of how emotional information can be used to motivate goal-directed responses (Amorapanth et al., 2000; Killcross, Robbins, & Everitt, 1997). Amorapanth et al., for example, first trained rats to associate a tone with footshock (the Pavlovian component). Next, rats learned to move from one side of a 2-compartment box to the other to avoid presentation of the tone (the instrumental component), a so-called escape-from-fear task. Findings showed that whereas lesions of the LA impaired both types of learning, lesions of the CE impaired only the Pavlovian component (i.e., the tone-shock association). Conversely, lesions of the B impaired only the instrumental component (learning to move to the second compartment). Thus, different outputs of the LA appear to mediate Pavlovian and instrumental behaviors elicited by a fear-arousing stimulus (Amorapanth et al., 2000).
It is important to note, however, that these findings do not indicate that the B is a site of motor control or a locus of memory storage for instrumental learning. Rather, the B likely guides fear-related behavior and reinforcement learning via its projections to nearby striatal regions that are known to be necessary for instrumental learning and reward processes (Everitt, Cador, & Robbins, 1989; Everitt et al., 1999; Robbins, Cador, Taylor, & Everitt, 1989). Our knowledge of how the amygdala transfers emotional information to brain regions involved in motivation and instrumental learning is still in its infancy. Research that addresses this issue is needed to unite these two related but sparsely integrated disciplines within behavioral neuroscience.
Modulation of Explicit Memory by Fear Arousal
Pavlovian fear conditioning is an implicit form of learning and memory. However, during most emotional experiences, including fear conditioning, explicit or conscious
memories are also formed (LeDoux, 1996). These occur through the operation of the medial temporal lobe memory system involving the hippocampus and related cortical areas (Eichenbaum, 2000; Milner et al., 1998). The role of the hippocampus in the explicit memory of an emotional experience is much the same as its role in other kinds of experiences, with one important exception. During fearful or emotionally arousing experiences, the amygdala activates neuromodulatory systems in the brain and hormonal systems in the body via its projections to the hypothalamus, which can drive the hypothalamic-pituitary-adrenal (HPA) axis. Neurohormones released by these systems can, in turn, feed back to modulate the function of forebrain structures such as the hippocampus and serve to enhance the storage of the memory in these regions (McGaugh, 2000). The primary support for this model comes from studies of inhibitory avoidance learning, a type of passive avoidance learning, briefly introduced in the preceding section, whereby the animal must learn to not enter a chamber in which it previously received a shock. In this paradigm, various pharmacological manipulations of the amygdala that affect neurotransmitter or neurohormonal systems modulate the strength of the memory. For example, immediate posttraining blockade of intra-amygdala noradrenergic or glucocorticoid receptors impairs retention of inhibitory avoidance, whereas facilitation of these systems in the amygdala enhances acquisition and memory storage (McGaugh, 2000; McGaugh et al., 1993). The exact subnuclei in the amygdala that are critical for memory modulation remain unknown, as do the areas of the brain where these amygdala projections influence memory storage. Candidate areas include the hippocampus and entorhinal and parietal cortices (Izquierdo et al., 1997). 
Indeed, it would be interesting to know whether the changes in unit activity, or the activation of intracellular signaling cascades, in the hippocampus during and after fear conditioning, as discussed earlier, might be related to formation of such explicit memories, and how regulation of these signals depends on the integrity of the amygdala and its neuromodulators. Interestingly, it has been shown that stimulation of the B can modulate the persistence of LTP in the hippocampus (Frey, Bergado-Rosado, Seidenbecher, Pape, & Frey, 2001), which provides a potential mechanism whereby the amygdala can modulate hippocampal-dependent memories (Roozendaal, Barsegyan, & Lee, 2008; Roozendaal, Okuda, de Quervain, & McGaugh, 2006).
FEAR CONDITIONING IN HUMANS
Studies of fear conditioning in humans have corroborated the findings from fear conditioning research in rodents. In general, advances in our understanding of the brain’s contributions to human emotions have come from two broad categories of neuropsychology research: studies in patients with damage to localized brain regions and studies that involve brain imaging in healthy subjects. The former provide evidence for a causal link between loss of function and region-specific damage, but the extent of brain damage cannot be easily controlled. The latter provide greater spatial and temporal precision.
Neuropsychology of Fear in Brain-Lesioned Patients
One of the most important insights gained from studies in brain-lesioned patients is that emotional learning and conscious awareness are dissociable phenomena. Patients with damage to the medial temporal lobe region, including the amygdala, show deficits in the ability to acquire conditioned fear responses, even when conscious awareness of the fear conditioning experience is intact. Conversely, patients with selective damage to the hippocampus and related areas of the medial temporal lobe show the opposite pattern: impaired declarative memory for the fear conditioning experience but intact implicit emotional responses to the CS (Bechara et al., 1995; Hamann, Monarch, & Goldstein, 2002; LaBar, LeDoux, Spencer, & Phelps, 1995). Hippocampal damage also appears to remove contextual constraints on fear extinction (LaBar & Phelps, 2005). In addition to the hippocampus, damage to the ventral mPFC also produces fear extinction deficits (Bechara, Damasio, Damasio, & Anderson, 1994; Davidson, Putnam, & Larson, 2000; Rolls, Hornak, Wade, & McGrath, 1994), which corresponds to deficits observed in rats with ventral mPFC lesions (Lebron, Milad, & Quirk, 2004; Morgan & LeDoux, 1995; Morgan et al., 1993; Morgan, Schulkin, & LeDoux, 2003; Quirk et al., 2000; Sierra-Mercado, Corcoran, Lebron-Milad, & Quirk, 2006).
Functional Brain Imaging of Fear in Healthy Subjects
Functional imaging during fear conditioning of healthy volunteers consistently reveals increased amygdala activation during fear conditioning and early phases of extinction (Buchel, Dolan, Armony, & Friston, 1999; Buchel, Morris, Dolan, & Friston, 1998; Cheng, Knight, Smith, Stein, & Helmstetter, 2003; Knight, Cheng, Smith, Stein, & Helmstetter, 2004; LaBar, Gatenby, Gore, LeDoux, & Phelps, 1998; Phelps, Delgado, Nearing, & LeDoux, 2004). In fact, individual differences in fear have been found to correlate with the degree of amygdala activity (Cheng et al., 2003; Furmark, Fischer, Wik, Larsson, & Fredrikson, 1997; LaBar et al., 1998). Interestingly, the strongest amygdala activation is observed during the early phase of conditioning (Buchel et al., 1998; LaBar et al., 1998), which is reminiscent of the transiently plastic cells observed in the dorsal regions of the LA in the rat (Repa et al., 2001). In addition to learning about danger from direct contact with an unconditioned stimulus, humans also learn in indirect ways. This is illustrated by a paradigm called “instructed fear,” whereby the subjects are told that one stimulus may be paired with a shock, but the subjects never receive a shock. Nevertheless, the CS leads to amygdala activation (Phelps et al., 2001), and damage to the amygdala disrupts the expression of the CS-elicited autonomic responses (Funayama, Grillon, Davis, & Phelps, 2001). Another way that fear is learned indirectly is by observation; that is, subjects who observe others being conditioned also develop conditioned responses to the CS. Such a CS then leads to amygdala activation (Olsson & Phelps, 2007). Fear conditioning leads to CS-induced amygdala activation even when subjects are unaware of the CS due to subliminal presentation techniques (Morris, Ohman, & Dolan, 1999). Similarly, a subliminal CS elicits amygdala activation after observational learning but not after instructed fear conditioning (Olsson & Phelps, 2004). Examples of fMRI imaging of amygdala activity after fear conditioning, instructed fear, and observational fear are shown in Figure 39.9 (courtesy of Elizabeth A. Phelps).
Clinical Implications
The close correspondence between the brain regions involved in rodent and human fear conditioning suggests that insights gained from animal studies can be applied to the clinical setting. Exposure therapy is procedurally similar to fear extinction training and is currently the most effective method for treating anxiety disorders, especially phobias. However, similar to the postextinction renewal of fear seen in rat studies, when the CS is presented outside the extinction training context, patients often experience relapse of fear symptoms when they leave the therapeutic setting (Rodriguez, Craske, Mineka, & Hladek, 1999). Because animal studies conducted by Davis and colleagues have shown that partial NMDA agonists can facilitate fear extinction (Walker et al., 2002), the same group was able to use a similar pharmacological treatment and successfully enhance the clinical efficacy of exposure therapy in human patients (Ressler et al., 2002). This kind of translational research is becoming an increasingly important emphasis in emotion research.
Figure 39.9 Fear-induced amygdala activation in humans. Note: CS presentations to humans cause similar increases in amygdala activation after A: fear conditioning, in which subjects are given paired presentations of the CS and US, B: instructed fear, in which subjects are instructed about the CS-US association but do not directly experience the association, and C: observational fear learning, in which subjects observe someone else undergoing fear conditioning. Figure shows structural MRI of the human brain. Figure courtesy of Elizabeth A. Phelps.
SUMMARY
In just over 2 decades we have seen a remarkable resurgence of interest in emotion research. Advances in brain research have been systematically combined with the fear conditioning paradigm, which has enabled us to trace how stimuli are attributed with fear-eliciting properties through their temporal association with innately aversive events. The amygdala has emerged as a crucial site of convergence for CS and US input pathways, and we now have knowledge of the cellular and molecular events that are needed to encode and store fear memories. This has led to important discoveries about the different ways and mechanisms through which an established fear memory can be modified, including reconsolidation, extinction, and the learning of active coping responses. These discoveries, in turn, have provided empirical data that indicate a separation of the brain mechanisms that mediate emotion and conscious awareness, as well as improved understanding of how these dissociable processes interact. Finally, these insights have begun to suggest new methods that can be introduced into the clinical setting.
REFERENCES
Abel, T., Nguyen, P. V., Barad, M., Deuel, T. A., Kandel, E. R., & Bourtchouladze, R. (1997). Genetic demonstration of a role for PKA in the late phase of LTP, and in hippocampus-based long-term memory. Cell, 88, 615–626.
Amorapanth, P., LeDoux, J. E., & Nader, K. (2000). Different lateral amygdala outputs mediate reactions and actions elicited by a fear-arousing stimulus. Nature Neuroscience, 3, 74–79.
Anglada-Figueroa, D., & Quirk, G. J. (2005). Lesions of the basal amygdala block expression of conditioned fear but not extinction. Journal of Neuroscience, 25, 9680–9685.
Apergis-Schoute, A. M., Debiec, J., Doyère, V., LeDoux, J. E., & Schafe, G. E. (2005). Auditory fear conditioning and long-term potentiation in the lateral amygdala require ERK/MAP kinase signaling in the auditory thalamus: A role for presynaptic plasticity in the fear system. Journal of Neuroscience, 25, 5730–5739.
Atkins, C. M., Selcher, J. C., Petraitis, J. J., Trzaskos, J. M., & Sweatt, J. D. (1998). The MAPK cascade is required for mammalian associative learning. Nature Neuroscience, 1, 602–609.
Bailey, D. J., Kim, J. J., Sun, W., Thompson, R. F., & Helmstetter, F. J. (1999). Acquisition of fear conditioning in rats requires the synthesis of mRNA in the amygdala. Behavioral Neuroscience, 113, 276–282.
Balleine, B. W., & Dickinson, A. (1998). Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates. Neuropharmacology, 37(4–5), 407–419.
Bauer, E. P., LeDoux, J. E., & Nader, K. (2001). Fear conditioning and LTP in the lateral amygdala are sensitive to the same stimulus contingencies. Nature Neuroscience, 4, 687–688.
Bauer, E. P., Schafe, G. E., & LeDoux, J. E. (2002). NMDA receptors and L-type voltage-gated calcium channels contribute to long-term potentiation and different components of fear memory formation in the lateral amygdala. Journal of Neuroscience, 22, 5239–5249.
Bechara, A., Damasio, A. R., Damasio, H., & Anderson, S. W. (1994). Insensitivity to future consequences following damage to human prefrontal cortex. Cognition, 50(1/3), 7–15.
Bechara, A., Tranel, D., Damasio, H., Adolphs, R., Rockland, C., & Damasio, A. R. (1995, August 25). Double dissociation of conditioning and declarative knowledge relative to the amygdala and hippocampus in humans. Science, 269, 1115–1118.
Berntson, G. G., Bechara, A., Damasio, H., Tranel, D., & Cacioppo, J. T. (2007). Amygdala contribution to selective dimensions of emotion. Social Cognitive and Affective Neuroscience, 2, 123–129.
Blair, H. T., Schafe, G. E., Bauer, E. P., Rodrigues, S. M., & LeDoux, J. E. (2001). Synaptic plasticity in the lateral amygdala: A cellular hypothesis of fear conditioning. Learning and Memory, 8, 229–242.
Blanchard, R. J., & Blanchard, D. C. (1969). Crouching as an index of fear. Journal of Comparative Physiological Psychology, 67, 370–375.
Blanchard, R. J., Dielman, T. E., & Blanchard, D. C. (1968). Postshock crouching: Familiarity with the shock situation. Psychonomic Science, 10, 371–372.
Bliss, T. V. P., & Lømo, T. (1973). Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path. Journal of Physiology, 232, 331–356.
Bordi, F., & LeDoux, J. E. (1992). Sensory tuning beyond the sensory system: An initial analysis of auditory properties of neurons in the lateral amygdaloid nucleus and overlying areas of the striatum. Journal of Neuroscience, 12, 2493–2503.
Bourtchuladze, R., Frenguelli, B., Blendy, J., Cioffi, D., Schutz, G., & Silva, A. J. (1994). Deficient long-term memory in mice with a targeted mutation of the cAMP-responsive element-binding protein. Cell, 79, 59–68.
Bouton, M. E., & Ricker, S. T. (1994). Renewal of extinguished responding in a second context. Animal Learning and Behavior, 22, 317–324.
Brambilla, R., Gnesutta, N., Minichiello, L., White, G., Roylance, A. J., Herron, C. E., et al. (1997, November 20). A role for the ras signaling pathway in synaptic transmission and long-term memory. Nature, 390, 281–286.
Brunzell, D. H., & Kim, J. J. (2001). Fear conditioning to tone, but not to context, is attenuated by lesions of the insular cortex and posterior extension of the intralaminar complex in rats. Behavioral Neuroscience, 115, 365–375.
Buchel, C., Dolan, R. J., Armony, J. L., & Friston, K. J. (1999). Amygdala-hippocampal involvement in human aversive trace conditioning revealed through event-related functional magnetic resonance imaging. Journal of Neuroscience, 19, 10869–10876.
Buchel, C., Morris, J., Dolan, R. J., & Friston, K. J. (1998). Brain systems mediating aversive conditioning: An event-related fMRI study. Neuron, 20, 947–957.
Burgos-Robles, A., Vidal-Gonzalez, I., Santini, E., & Quirk, G. J. (2007). Consolidation of fear extinction requires NMDA receptor-dependent bursting in the ventromedial prefrontal cortex. Neuron, 53, 871–880.
Campeau, S., & Davis, M. (1995). Involvement of the central nucleus and basolateral complex of the amygdala in fear conditioning measured with fear-potentiated startle in rats trained concurrently with auditory and visual conditioned stimuli. Journal of Neuroscience, 15, 2301–2311.
Cardinal, R. N., Parkinson, J. A., Hall, J., & Everitt, B. J. (2002). Emotion and motivation: The role of the amygdala, ventral striatum, and prefrontal cortex. Neuroscience and Biobehavioral Reviews, 26, 321–352.
Chapman, P. F., Kairiss, E. W., Keenan, C. L., & Brown, T. H. (1990). Long-term synaptic potentiation in the amygdala. Synapse, 6, 271–278.
Cheng, D. T., Knight, D. C., Smith, C. N., Stein, E. A., & Helmstetter, F. J. (2003). Functional MRI of human amygdala activity during Pavlovian fear conditioning: Stimulus processing versus response expression. Behavioral Neuroscience, 117, 3–10.
Chhatwal, J. P., Stanek-Rattiner, L., Davis, M., & Ressler, K. J. (2006). Amygdala BDNF signaling is required for consolidation but not encoding of extinction. Nature Neuroscience, 9, 870–872.
Clugnet, M. C., & LeDoux, J. E. (1989). Synaptic plasticity in fear conditioning circuits: Induction of LTP in the lateral nucleus of the amygdala by stimulation of the medial geniculate body. Journal of Neuroscience, 10, 2818–2824.
Corcoran, K. A., & Maren, S. (2001). Hippocampal inactivation disrupts contextual retrieval of fear memory after extinction. Journal of Neuroscience, 21, 1720–1726.
Corcoran, K. A., & Quirk, G. J. (2007). Recalling safety: Cooperative functions of the ventromedial prefrontal cortex and the hippocampus in extinction. CNS Spectrums, 12(3), 200–206.
Davidson, R. J., Putnam, K. M., & Larson, C. L. (2000). Dysfunction in the neural circuitry of emotion regulation: A possible prelude to violence. Science, 289, 591–594.
Davis, M. (1997). Neurobiology of fear responses: The role of the amygdala. Journal of Neuropsychiatry and Clinical Neurosciences, 9, 382–402.
Davis, M. (2002). Role of NMDA receptors and MAP kinase in the amygdala in extinction of fear: Clinical implications for exposure therapy. European Journal of Neuroscience, 16, 395–398.
Davis, M., Walker, D. L., & Lee, Y. (1997). Roles of the amygdala and bed nucleus of the stria terminalis in fear and anxiety measured with the acoustic startle reflex: Possible relevance to PTSD. Annals of the New York Academy of Sciences, 821, 305–331.
Debiec, J., & LeDoux, J. E. (2004). Disruption of reconsolidation but not consolidation of auditory fear conditioning by noradrenergic blockade in the amygdala. Neuroscience, 129, 267–272.
Debiec, J., & LeDoux, J. E. (2006). Noradrenergic signaling in the amygdala contributes to the reconsolidation of fear memory: Treatment implications for PTSD. Annals of the New York Academy of Sciences, 1071, 521–524.
De Oca, B. M., DeCola, J. P., Maren, S., & Fanselow, M. S. (1998). Distinct regions of the periaqueductal gray are involved in the acquisition and expression of defensive responses. Journal of Neuroscience, 18, 3426–3432.
Doron, N. N., & LeDoux, J. E. (1999). Organization of projections to the lateral amygdala from auditory and visual areas of the thalamus in the rat. Journal of Comparative Neurology, 412, 383–409.
Doyère, V., Debiec, J., Monfils, M. H., Schafe, G. E., & LeDoux, J. E. (2007). Synapse-specific reconsolidation of distinct fear memories in the lateral amygdala. Nature Neuroscience, 10, 414–416.
Doyère, V., Redini-Del Negro, C., Dutrieux, G., Le Floch, G., Davis, S., & Laroche, S. (1995). Potentiation or depression of synaptic efficacy in the dentate gyrus is determined by the relationship between the conditioned and unconditioned stimulus in a classical conditioning paradigm in rats. Behavioural Brain Research, 70, 15–29.
Edeline, J.-M., Pham, P., & Weinberger, N. M. (1993). Rapid development of learning-induced receptive field plasticity in the auditory cortex. Behavioral Neuroscience, 107, 539–551.
Eichenbaum, H. (2000). A cortical-hippocampal system for declarative memory. Nature Reviews: Neuroscience, 1, 41–50.
Everitt, B. J., Cador, M., & Robbins, T. W. (1989). Interactions between the amygdala and ventral striatum in stimulus-reward associations: Studies using a second-order schedule of sexual reinforcement. Neuroscience, 30, 63–75.
Everitt, B. J., Parkinson, J. A., Olmstead, M. C., Arroyo, M., Robledo, P., & Robbins, T. W. (1999). Associative processes in addiction and reward: The role of amygdala-ventral striatal subsystems. Annals of the New York Academy of Sciences, 877, 412–438.
Falls, W. A., Miserendino, M. J., & Davis, M. (1992). Extinction of fear-potentiated startle: Blockade by infusion of an NMDA antagonist into the amygdala. Journal of Neuroscience, 12, 854–863.
Fanselow, M. S. (1980). Conditional and unconditional components of postshock freezing. Pavlovian Journal of Biological Science, 15, 177–182.
Fanselow, M. S., & LeDoux, J. E. (1999). Why we think plasticity underlying Pavlovian fear conditioning occurs in the basolateral amygdala. Neuron, 23, 229–232.
Farb, C. R., Aoki, C., Milner, T., Kaneko, T., & LeDoux, J. E. (1992). Glutamate immunoreactive terminals in the lateral amygdaloid nucleus: A possible substrate for emotional memory. Behavioural Brain Research, 593, 145–158.
Frankland, P. W., Cestari, V., Filipkowski, R. K., McDonald, R. J., & Silva, A. J. (1998). The dorsal hippocampus is essential for context discrimination but not for contextual conditioning. Behavioral Neuroscience, 112, 863–874.
Frey, S., Bergado-Rosado, J., Seidenbecher, T., Pape, H. C., & Frey, J. U. (2001). Reinforcement of early long-term potentiation (early-LTP) in dentate gyrus by stimulation of the basolateral amygdala: Heterosynaptic induction mechanisms of late-LTP. Journal of Neuroscience, 21, 3697–3703.
Funayama, E. S., Grillon, C., Davis, M., & Phelps, E. A. (2001). A double dissociation in the affective modulation of startle in humans: Effects of unilateral temporal lobectomy. Journal of Cognitive Neuroscience, 13, 721–729.
Furmark, T., Fischer, H., Wik, G., Larsson, M., & Fredrikson, M. (1997). The amygdala and individual differences in human fear conditioning. NeuroReport, 8, 3957–3960.
Garcia, R., Vouimba, R. M., Baudry, M., & Thompson, R. F. (1999, November 18). The amygdala modulates prefrontal cortex activity relative to conditioned fear. Nature, 402, 294–296.
Gewirtz, J. C., Falls, W. A., & Davis, M. (1997). Normal conditioned inhibition and extinction of freezing and fear-potentiated startle following electrolytic lesions of medial prefrontal cortex in rats. Behavioral Neuroscience, 111, 712–726.
Goosens, K. A., Hobin, J. A., & Maren, S. (2003). Auditory-evoked spike firing in the lateral amygdala and Pavlovian fear conditioning: Mnemonic code or fear bias? Neuron, 40, 1013–1022.
Goosens, K. A., Holt, W., & Maren, S. (2000). A role for amygdaloid, PKA and PKC in the acquisition of long-term conditional fear memories in rats. Behavioural Brain Research, 114(1–2), 145–152.
Goosens, K. A., & Maren, S. (2001). Contextual and auditory fear conditioning are mediated by the lateral, basal, and central amygdaloid nuclei in rats. Learning and Memory, 8, 148–155.
Goosens, K. A., & Maren, S. (2004). NMDA receptors are essential for the acquisition, but not expression, of conditional fear and associative spike firing in the lateral amygdala. European Journal of Neuroscience, 20, 537–548.
Hall, J., Thomas, K. L., & Everitt, B. J. (2000). Rapid and selective induction of BDNF expression in the hippocampus during contextual learning. Nature Neuroscience, 3, 533–535.
Hamann, S., Monarch, E. S., & Goldstein, F. C. (2002). Impaired fear conditioning in Alzheimer’s disease. Neuropsychologia, 40, 1187–1195.
Helmstetter, F. J., & Bellgowan, P. S. (1994). Effects of muscimol applied to the basolateral amygdala on acquisition and expression of contextual fear conditioning in rats. Behavioral Neuroscience, 108, 1005–1009.
Helmstetter, F. J., & Landeira-Fernandez, J. (1990). Conditional hypoalgesia is attenuated by naltrexone applied to the periaqueductal gray. Brain Research, 537, 88–92.
Helmstetter, F. J., & Tershner, S. A. (1994). Lesions of the periaqueductal gray and rostral ventromedial medulla disrupt antinociceptive but not cardiovascular aversive conditional responses. Journal of Neuroscience, 14, 7099–7108.
Herry, C., Vouimba, R. M., & Garcia, R. (1999). Plasticity in the mediodorsal thalamo-prefrontal cortical transmission in behaving mice. Journal of Neurophysiology, 82, 2827–2832.
Hobin, J. A., Goosens, K. A., & Maren, S. (2003). Context-dependent neuronal activity in the lateral amygdala represents fear memories after extinction. Journal of Neuroscience, 23, 8410–8416.
Holland, P. C., & Gallagher, M. (2004). Amygdala-frontal interactions and reward expectancy. Current Opinion in Neurobiology, 4, 148–155.
Huang, Y. Y., & Kandel, E. R. (1998). Postsynaptic induction and PKA-dependent expression of LTP in the lateral amygdala. Neuron, 21, 169–178.
Huang, Y. Y., Martin, K. C., & Kandel, E. R. (2000). Both protein kinase A and mitogen-activated protein kinase are required in the amygdala for the macromolecular synthesis-dependent late phase of long-term potentiation. Journal of Neuroscience, 20, 6317–6325.
Huff, N. C., & Rudy, J. W. (2004). The amygdala modulates hippocampus-dependent context memory formation and stores cue-shock associations. Neuroscience, 118, 53–62.
Humeau, Y., & Lüthi, A. (2007). Dendritic calcium spikes induce bi-directional synaptic plasticity in the lateral amygdala. Neuropharmacology, 52, 234–243.
Humeau, Y., Shaban, H., Bissière, S., & Lüthi, A. (2003, December 18). Presynaptic induction of heterosynaptic associative plasticity in the mammalian brain. Nature, 426, 841–845.
Impey, S., Smith, D. M., Obrietan, K., Donahue, R., Wade, C., & Storm, D. R. (1998). Stimulation of cAMP response element (CRE)-mediated transcription during contextual learning. Nature Neuroscience, 1, 595–601.
Iwata, J., LeDoux, J. E., & Reis, D. J. (1986). Destruction of intrinsic neurons in the lateral hypothalamus disrupts the classical conditioning of autonomic but not behavioral emotional responses in the rat. Behavioural Brain Research, 368, 161–166.
Izquierdo, I., Quillfeldt, J. A., Zanatta, M. S., Quevedo, J., Schaeffer, E., Schmitz, P. K., et al. (1997). Sequential role of hippocampus and amygdala, entorhinal cortex and parietal cortex in formation and retrieval of memory for inhibitory avoidance in rats. European Journal of Neuroscience, 9, 786–793.
Jarrell, T. W., Gentile, C. G., Romanski, L. M., McCabe, P. M., & Schneiderman, N. (1987). Involvement of cortical and thalamic auditory regions in retention of differential bradycardia conditioning to acoustic conditioned stimuli in rabbits. Brain Research, 412, 285–294.
Josselyn, S. A., Shi, C., Carlezon, W. A., Jr., Neve, R. L., Nestler, E. J., & Davis, M. (2001). Long-term memory is facilitated by cAMP response element-binding protein overexpression in the amygdala. Journal of Neuroscience, 21, 2404–2412.
Kapp, B. S., Frysinger, R. C., Gallagher, M., & Haselton, J. R. (1979). Amygdala central nucleus lesions: Effect on heart rate conditioning in the rabbit. Physiology and Behavior, 23, 1109–1117.
Kida, S., Josselyn, S. A., de Oritz, S. P., Kogan, J. H., Chevere, I., Masushige, S., et al. (2002). CREB required for the stability of new and reactivated fear memories. Nature Neuroscience, 5, 348–355.
Killcross, S., Robbins, T. W., & Everitt, B. J. (1997, July 24). Different types of fear-conditioned behaviour mediated by separate nuclei within amygdala. Nature, 388, 377–380.
Kim, J. J., DeCola, J. P., Landeira-Fernandez, J., & Fanselow, M. S. (1991). N-Methyl-D-Aspartate receptor antagonist APV blocks acquisition but not expression of fear conditioning. Behavioral Neuroscience, 105, 126–133.
Kim, J. J., & Fanselow, M. S. (1992, May 1). Modality-specific retrograde amnesia of fear. Science, 256, 675–677.
Kim, J. J., Rison, R. A., & Fanselow, M. S. (1993). Effects of amygdala, hippocampus, and periaqueductal gray lesions on short- and long-term contextual fear. Behavioral Neuroscience, 107, 1–6.
Knight, D. C., Cheng, D. T., Smith, C. N., Stein, E. A., & Helmstetter, F. J. (2004). Neural substrates mediating human delay and trace fear conditioning. Journal of Neuroscience, 24, 218–228.
LaBar, K. S., Gatenby, J. C., Gore, J. C., LeDoux, J. E., & Phelps, E. A. (1998). Human amygdala activation during conditioned fear acquisition and extinction: A mixed-trial fMRI study. Neuron, 20, 937–945.
LaBar, K. S., LeDoux, J. E., Spencer, D. D., & Phelps, E. A. (1995). Impaired fear conditioning following unilateral temporal lobectomy in humans. Journal of Neuroscience, 15, 6846–6855.
LaBar, K. S., & Phelps, E. A. (2005). Reinstatement of conditioned fear in humans is context dependent and impaired in amnesia. Behavioral Neuroscience, 119, 677–686.
Lebron, K., Milad, M. R., & Quirk, G. J. (2004). Delayed recall of fear extinction in rats with lesions of ventral medial prefrontal cortex. Learning and Memory, 11, 544–548.
LeDoux, J. E. (1996). The emotional brain. New York: Simon & Schuster.
LeDoux, J. E. (2000). Emotion circuits in the brain. Annual Review of Neuroscience, 23, 155–184.
LeDoux, J. E., Cicchetti, P., Xagoraris, A., & Romanski, L. M. (1990). The lateral amygdaloid nucleus: Sensory interface of the amygdala in fear conditioning. Journal of Neuroscience, 10, 1062–1069.
LeDoux, J. E., & Farb, C. R. (1991). Neurons of the acoustic thalamus that project to the amygdala contain glutamate. Neuroscience Letters, 134, 145–149.
LeDoux, J. E., Farb, C. R., & Romanski, L. M. (1991). Overlapping projections to the amygdala and striatum from auditory processing areas of the thalamus and cortex. Neuroscience Letters, 134, 139–144.
LeDoux, J. E., Iwata, J., Cicchetti, P., & Reis, D. J. (1988). Different projections of the central amygdaloid nucleus mediate autonomic and behavioral correlates of conditioned fear. Journal of Neuroscience, 8, 2517–2529.
LeDoux, J. E., Iwata, J., Pearl, D., & Reis, D. J. (1986). Disruption of auditory but not visual learning by destruction of intrinsic neurons in the rat medial geniculate body. Behavioural Brain Research, 371, 395–399.
LeDoux, J. E., Ruggerio, D. A., & Reis, D. J. (1985). Projections to the subcortical forebrain from anatomically defined regions of the medial geniculate body in the rat. Journal of Comparative Neurology, 242, 182–213.
LeDoux, J. E., Sakaguchi, A., & Reis, D. J. (1984). Subcortical efferent projections of the medial geniculate nucleus mediate emotional responses conditioned by acoustic stimuli. Journal of Neuroscience, 4, 683–698.
Li, X. F., Stutzmann, G. E., & LeDoux, J. E. (1996). Convergent but temporally separated inputs to lateral amygdala neurons from the auditory thalamus and auditory cortex use different postsynaptic receptors: In vivo intracellular and extracellular recordings in fear conditioning pathways. Learning and Memory, 3(2/3), 229–242.
Lindquist, D. H., & Brown, T. H. (2004). Temporal encoding in fear conditioning revealed through associative reflex facilitation. Behavioral Neuroscience, 118, 395–402.
Lu, K. T., Walker, D. L., & Davis, M. (2001). Mitogen-activated protein kinase cascade in the basolateral nucleus of amygdala is involved in extinction of fear-potentiated startle. Journal of Neuroscience, 21, RC162.
Maren, S. (1998). Overtraining does not mitigate contextual fear conditioning deficits produced by neurotoxic lesions of the basolateral amygdala. Journal of Neuroscience, 18, 3088–3097.
Maren, S. (1999). Long-term potentiation in the amygdala: A mechanism for emotional learning and memory. Trends in Neuroscience, 22, 561–567.
Maren, S. (2000). Auditory fear conditioning increases CS-elicited spike firing in lateral amygdala neurons even after extensive overtraining. European Journal of Neuroscience, 12, 4047–4054.
Maren, S., Aharonov, G., & Fanselow, M. S. (1997). Neurotoxic lesions of the dorsal hippocampus and Pavlovian fear conditioning in rats. Behavioural Brain Research, 88, 261–274.
Maren, S., & Quirk, G. J. (2004). Neuronal signaling of fear memory. Nature Reviews: Neuroscience, 5, 844–852.
McDonald, A. J. (1998). Cortical pathways to the mammalian amygdala. Progress in Neurobiology, 55, 257–332.
McGaugh, J. L. (2000, January 14). Memory: A century of consolidation. Science, 287, 248–251.
McGaugh, J. L., Introini-Collison, I. B., Cahill, L. F., Castellano, C., Dalmaz, C., Parent, M. B., et al. (1993). Neuromodulatory systems and memory storage: Role of the amygdala. Behavioural Brain Research, 58, 81–90.
McKernan, M. G., & Shinnick-Gallagher, P. (1997, December 11). Fear conditioning induces a lasting potentiation of synaptic currents in vitro. Nature, 390, 607–611.
Medina, J. F., Repa, J. C., & LeDoux, J. E. (2002). Parallels between cerebellum- and amygdala-dependent conditioning. Nature Reviews: Neuroscience, 3, 122–131.
Milad, M. R., & Quirk, G. J. (2002, November 7). Neurons in medial prefrontal cortex signal memory for fear extinction. Nature, 420, 70–74.
Milner, B., Squire, L. R., & Kandel, E. R. (1998). Cognitive neuroscience and the study of memory. Neuron, 20, 445–468.
Miserendino, M. J. D., Sananes, C. B., Melia, K. R., & Davis, M. (1990, June 21). Blocking of acquisition but not expression of conditioned fear-potentiated startle by NMDA antagonists in the amygdala. Nature, 345, 716–718.
Moita, M. A., Rosis, S., Zhou, Y., LeDoux, J. E., & Blair, H. T. (2003). Hippocampal place cells acquire location-specific responses to the conditioned stimulus during auditory fear conditioning. Neuron, 37, 485–497.
Morgan, M. A., & LeDoux, J. E. (1995). Differential contribution of dorsal and ventral medial prefrontal cortex to the acquisition and extinction of conditioned fear in rats. Behavioral Neuroscience, 109, 681–688.
Morgan, M. A., Romanski, L. M., & LeDoux, J. E. (1993). Extinction of emotional learning: Contribution of medial prefrontal cortex. Neuroscience Letters, 163, 109–113.
Morgan, M. A., Schulkin, J., & LeDoux, J. E. (2003). Ventral medial prefrontal cortex and emotional perseveration: The memory for prior extinction training. Behavioural Brain Research, 146(1/2), 121–130.
Morris, J. S., Ohman, A., & Dolan, R. J. (1999). A subcortical pathway to the right amygdala mediating “unseen” fear. Proceedings of the National Academy of Sciences, USA, 96, 1680–1685.
Muller, J., Corodimas, K. P., Fridel, Z., & LeDoux, J. E. (1997). Functional inactivation of the lateral and basal nuclei of the amygdala by muscimol infusion prevents fear conditioning to an explicit conditioned stimulus and to contextual stimuli. Behavioral Neuroscience, 111, 683–691.
Nader, K., Majidishad, P., Amorapanth, P., & LeDoux, J. E. (2001). Damage to the lateral and central, but not other, amygdaloid nuclei prevents the acquisition of auditory fear conditioning. Learning and Memory, 8, 156–163.
Nader, K., Schafe, G. E., & LeDoux, J. E. (2000, August 17). Fear memories require protein synthesis in the amygdala for reconsolidation after retrieval. Nature, 406, 722–726.
Olsson, A., & Phelps, E. A. (2004). Learned fear of “unseen” faces after Pavlovian, observational, and instructed fear. Psychological Sciences, 15, 822–828.
Olsson, A., & Phelps, E. A. (2007). Social learning of fear. Nature Neuroscience, 10, 1095–1102.
Paré, D., Quirk, G. J., & LeDoux, J. E. (2004). New vistas on amygdala networks in conditioned fear. Journal of Neurophysiology, 92, 1–9.
Paré, D., Smith, Y., & Paré, J. F. (1995). Intra-amygdaloid projections of the basolateral and basomedial nuclei in the cat: Phaseolus vulgaris-leucoagglutinin anterograde tracing at the light and electron microscopic level. Neuroscience, 69, 567–583.
Quirk, G. J., Armony, J. L., Repa, J. C., Li, X.-F., & LeDoux, J. E. (1997). Emotional memory: A search for sites of plasticity. Cold Spring Harbor Symposia on Biology, 61, 247–257.
Quirk, G. J., & Mueller, D. (2008). Neural mechanisms of extinction learning and retrieval. Neuropsychopharmacology, 33, 56–72.
Quirk, G. J., Repa, C., & LeDoux, J. E. (1995). Fear conditioning enhances short-latency auditory responses of lateral amygdala neurons: Parallel recordings in the freely behaving rat. Neuron, 15, 1029–1039.
Quirk, G. J., Russo, G. K., Barron, J. L., & Lebron, K. (2000). The role of ventromedial prefrontal cortex in the recovery of extinguished fear. Journal of Neuroscience, 20, 6225–6231.
Radwanska, K., Nikolaev, E., Knapska, E., & Kaczmarek, L. (2002). Differential response of two subdivisions of lateral amygdala to aversive conditioning as revealed by c-Fos and P-ERK mapping. NeuroReport, 13, 2241–2246.
Rampon, C., Tang, Y. P., Goodhouse, J., Shimizu, E., Kyin, M., & Tsien, J. Z. (2000). Enrichment induces structural changes and recovery from nonspatial memory deficits in CA1 NMDAR1-knockout mice. Nature Neuroscience, 3, 238–244.
Repa, J. C., Muller, J., Apergis, J., Desrochers, T. M., Zhou, Y., & LeDoux, J. E. (2001). Two different lateral amygdala cell populations contribute to the initiation and storage of memory. Nature Neuroscience, 4, 724–731.
Rescorla, R. A. (1968). Probability of shock in the presence and absence of CS in fear conditioning. Journal of Comparative Physiological Psychology, 66, 1–5.
Rescorla, R. A., & Heth, C. D. (1975). Reinstatement of fear to an extinguished conditioned stimulus. Journal of Experimental Psychology: Animal Behavior Processes, 1(1), 88–96.
Ressler, K. J., Rothbaum, B. O., Tannenbaum, L., Anderson, P., Graap, K., Zimand, E., et al. (2002). Cognitive enhancers as adjuncts to psychotherapy: Use of D-cycloserine in phobic individuals to facilitate extinction of fear. Archives of General Psychiatry, 61, 1136–1144.
Robbins, T. W., Cador, M., Taylor, J. R., & Everitt, B. J. (1989). Limbicstriatal interactions in reward-related processes. Neuroscience Biobehavioral Reviews, 13, 155–162. Rodrigues, S. M., Schafe, G. E., & LeDoux, J. E. (2001). Intraamygdala blockade of the NR2B subunit of the NMDA receptor disrupts the acquisition but not the expression of fear conditioning. Journal of Neuroscience, 21, 6889–6896.
Pavlov, I. P. (1927). Conditioned reflexes. London: Oxford University Press.
Rodrigues, S. M., Schafe, G. E., & LeDoux, J. E. (2004). Molecular mechanisms underlying emotional learning and memory in the amygdala. Neuron, 44, 75–91.
Phelps, E. A., Delgado, M. R., Nearing, K. I., & LeDoux, J. E. (2004). Extinction learning in humans: Role of the amygdala and vmPFC. Neuron, 43, 897–905.
Rodriguez, B. I., Craske, M. G., Mineka, S., & Hladek, D. (1999). Contextspecificity of relapse: Effects of therapist and environmental context on return of fear. Behaviour Research and Therapy, 37, 845–862.
Phelps, E. A., O’Connor, K. J., Gatenby, J. C., Gore, J. C., Grillon, C., & Davis, M. (2001). Activation of the left amygdala to a cognitive representation of fear. Nature Neuroscience, 4, 437–441.
Rogan, M., & LeDoux, J. E. (1995). LTP is accompanied by commensurate enhancement of auditory-evoked responses in a fear conditioning circuit. Neuron, 15, 127–136.
Phillips, R. G., & LeDoux, J. E. (1992). Differential contribution of amygdala and hippocampus to cued and contextual fear conditioning. Behavioral Neuroscience, 106, 274–285.
Rogan, M., Staubli, U., & LeDoux, J. E. (1997, December 11). Fear conditioning induces associative long-term potentiation in the amygdala. Nature, 390, 604–607.
Pitkänen, A., Savander, V., & LeDoux, J. E. (1997). Organization of intraamygdaloid circuitries in the rat: An emerging framework for understanding functions of the amygdala. Trends in Neuroscience, 20, 517–523.
Rolls, E. T., Hornak, J., Wade, D., & McGrath, J. (1994). Emotion-related learning in patients with social and emotional changes associated with frontal lobe damage. Journal of Neurology, Neurosurgery, and Psychiatry, 57, 1518–1524.
Quinn, J. J., Wied, H. M., Ma, Q. D., Tinsley, M. R., & Fanselow, M. S. (2008). Dorsal hippocampus involvement in delay fear conditioning depends upon the strength of the tone-footshock association. Hippocampus, 18, 640–654.
c39.indd Sec5:779
Quirk, G. J., Armony, J. L., & LeDoux, J. E. (1997). Fear conditioning enhances different temporal components of tone-evoked spike trains in auditory cortex and lateral amygdala. Neuron, 19, 613–624.
Romanski, L. M., Clugnet, M. C., Bordi, F., & LeDoux, J. E. (1993). Somatosensory and auditory convergence in the lateral nucleus of the amygdala. Behavioral Neuroscience, 107, 444–450.
8/17/09 3:08:01 PM
780
Neural Basis of Fear Conditioning
Romanski, L. M., & LeDoux, J. E. (1992a). Bilateral destruction of neocortical and perirhinal projection targets of the acoustic thalamus does not disrupt auditory fear conditioning. Neuroscience Letters, 142, 228–232.
Smith, O. A., Astley, C. A., Devito, J. L., Stein, J. M., & Walsh, R. E. (1980). Functional analysis of hypothalamic control of the cardiovascular responses accompanying emotional behavior. Federation Proceedings, 39, 2487–2494.
Romanski, L. M., & LeDoux, J. E. (1992b). Equipotentiality of thalamoamygdala and thalamo-cortico-amygdala circuits in auditory fear conditioning. Journal of Neuroscience, 12, 4501–4509.
Sotres-Bayon, F., Bush, D. E. A., & LeDoux, J. E. (2004). Emotional perseveration: An update on prefrontal-amygdala interactions in fear extinction. Learning and Memory, 11, 525–535.
Romanski, L. M., & LeDoux, J. E. (1993). Information cascade from primary auditory cortex to the amygdala: Corticocortical and corticoamygdaloid projections of temporal cortex in the rat. Cerebral Cortex, 3, 515–532.
Sotres-Bayon, F., Bush, D. E. A., & LeDoux, J. E. (2007). Acquisition of fear extinction requires activation of NR2B-containing NMDA receptors in the lateral amygdala. Neuropsychopharmacology, 32, 1929–1940.
Roozendaal, B., Barsegyan, A., & Lee, S. (2008). Adrenal stress hormones, amygdala activation, and memory for emotionally arousing experiences. Progressive Behavioural Brain Research, 167, 79–97. Roozendaal, B., Koolhaas, J. M., & Bohus, B. (1991). Attenuated cardiovascular, neuroendocrine, and behavioral responses after a single footshock in central amygdaloid lesioned male rats. Physiology and Behavior, 50, 771–775. Roozendaal, B., Okuda, S., de Quervain, D. J., & McGaugh, J. L. (2006). Glucocorticoids interact with emotion-induced noradrenergic activation in influencing different memory functions. Neuroscience, 138, 901–910. Rudy, J. W., Huff, N. C., & Matus-Amat, P. (2004). Understanding contextual fear conditioning: Insights from a two-process model. Neuroscience and Biobehavioral Reviews, 28, 675–685. Sara, S. J. (2000). Retrieval and reconsolidation: Toward a neurobiology of remembering. Learning and Memory, 7, 73–84.
Stiedl, O., Birkenfeld, K., Palve, M., & Spiess, J. (2000). Impairment of conditioned contextual fear of C57BL/6J mice by intracerebral injections of the NMDA receptor antagonist APV. Behavioural Brain Research, 116, 157–168. Sullivan, G. M., Apergis, J., Bush, D. E. A., Johnson, L. R., Hou, M., & LeDoux, J. E. (2004). Lesions in the bed nucleus of the stria terminalis disrupt corticosterone and freezing responses elicited by a contextual but not by a specific cue-conditioned fear stimulus. Neuroscience, 128, 7–14. Walker, D. L., Paschall, G. Y., & Davis, M. (2005). Glutamate receptor antagonist infusions into the basolateral and medial amygdala reveal differential contributions to olfactory vs. context fear conditioning and expression. Learning and Memory, 12, 120–129. Walker, D. L., Ressler, K. J., Lu, K.-T., & Davis, M. (2002). Facilitation of conditioned fear extinction by systemic administration or intra-amygdala infusions of D-cycloserine as assessed with fear-potentiated startle in rats. Journal of Neuroscience, 22, 2343–2351.
Schafe, G. E., Atkins, C. M., Swank, M. W., Bauer, E. P., Sweatt, J. D., & LeDoux, J. E. (2000). Activation of ERK/MAP kinase in the amygdala is required for memory consolidation of Pavlovian fear conditioning. Journal of Neuroscience, 20, 8177–8187.
Weisskopf, M. G., Bauer, E. P., & LeDoux, J. E. (1999). L-type voltagegated calcium channels mediate NMDA-independent associative long-term potentiation at thalamic input synapses to the amygdala. Journal of Neuroscience, 19, 10512–10519.
Schafe, G. E., Bauer, E. P., Rosis, S., Farb, C. R., Rodrigues, S. M., & LeDoux, J. E. (2005). Memory consolidation of Pavlovian fear conditioning requires nitric oxide signaling in the lateral amygdala. European Journal of Neuroscience, 22, 201–211.
Weisskopf, M. G., & LeDoux, J. E. (1999). Distinct populations of NMDA receptors at subcortical and cortical inputs to principal cells of the lateral amygdala. Journal of Neurophysiology, 81, 930–934.
Schafe, G. E., & LeDoux, J. E. (2000). Memory consolidation of auditory Pavlovian fear conditioning requires protein synthesis and protein kinase A in the amygdala. Journal of Neuroscience, 20, RC96.
Wilensky, A. E., Schafe, G. E., & LeDoux, J. E. (2000). The amygdala modulates memory consolidation of fear-motivated inhibitory avoidance learning but not classical fear conditioning. Journal of Neuroscience, 20, 7059–7066.
Sierra-Mercado, D., Jr., Corcoran, K. A., Lebron-Milad, K., & Quirk, G. J. (2006). Inactivation of the ventromedial prefrontal cortex reduces expression of conditioned fear and impairs subsequent recall of extinction. European Journal of Neuroscience, 24, 1751–1758.
Young, S. L., Bohenek, D. L., & Fanselow, M. S. (1994). NMDA processes mediate anterograde amnesia of contextual fear conditioning induced by hippocampal damage: Immunization against amnesia by context preexposure. Behavioral Neuroscience, 108, 19–29.
c39.indd Sec5:780
8/17/09 3:08:01 PM
Chapter 40
Neural Basis of Pleasure and Reward
CLIFFORD M. KNAPP AND CONAN KORNETSKY
Rewarding stimuli all share the properties of engendering approach behaviors and of being able to act as unconditioned stimuli. Natural rewards include food, water, and copulation, all of which are closely linked to the survival of a species. Food and water will motivate animals to expend energy in tasks such as lever pressing. Similar behaviors are produced by certain artificial rewarding stimuli, including pharmacological agents such as cocaine and heroin and the electrical stimulation of select brain areas. Human experience indicates that interaction with rewarding stimuli is frequently associated with both short-lived pleasurable sensations and longer lasting periods of elevated mood. We can infer from the behavior of animals that they may also experience the hedonic effects of rewarding stimuli. This inference is strengthened by evidence that many of the systems that have been implicated in the experience of pleasure in humans are similar to their counterparts in other mammals. One example of this arises from the finding that the electrical stimulation of what were characterized as “septal” areas in humans in early experiments produces reports of pleasurable responses (Bishop, Elder, & Heath, 1963), while the delivery of brain stimulation to comparable regions in the rat is rewarding enough to maintain sustained responding for this stimulation (J. Olds & Milner, 1954). What had been called septal areas have now been identified as regions linked to the functioning of mesocorticolimbic systems. These systems have been implicated in the regulation of reward-related behavior. Within these mesocorticolimbic systems the ventral tegmental area (VTA) sends neuronal projections to other mesocorticolimbic structures, most notably the nucleus accumbens and the prefrontal cortex (see Figure 40.1; Emson & Koob, 1978; Hasue & Shammah-Lagnado, 2002). These projections are from cells that contain the neurotransmitter dopamine. When released from these cells, dopamine may interact with five types of receptors, which are divided into two basic classes. One class includes the dopamine D1-like receptors (the D1 and D5 receptors), and the second consists of D2-like receptors (the D2, D3, and D4 receptors).
[Figure 40.1 diagram labels: Prefrontal Cortex, Nucleus Accumbens, Ventral Pallidum, Amygdala, Ventral Tegmental Area]
Figure 40.1 Simplified schematic of projections of dopaminergic neurons (arrows) from the ventral tegmental area to target structures within the mesocorticolimbic system.
Several approaches are used to measure the hedonic effects of rewarding stimuli in human subjects. Such measures have been extensively developed in the field of drug abuse studies. Subjects may be asked to place a mark on visual analogue scales (i.e., Likert scales) to indicate to what degree they like a drug or experience a high after receiving a drug (Fischman & Foltin, 1991). Questionnaires have been developed based on the responses of drug users to series of questions that allow for the measure of different aspects of the subjective effects of drugs. The Addiction Research Center Inventory is one such questionnaire that allows rating of euphoric responses to drugs using the Morphine Benzedrine Group scale (ARCI-MBG) of the inventory (Haertzen, Hill, & Belleville, 1963).
Handbook of Neuroscience for the Behavioral Sciences, edited by Gary G. Berntson and John T. Cacioppo. Copyright © 2009 John Wiley & Sons, Inc.
The likelihood that a drug will be abused is related to the ability of the drug to produce elevations in subjective scales of drug liking and euphoric effects as assessed using scales such as the ARCI-MBG. All of the commonly abused drugs, including amphetamine, heroin, and morphine, produce such elevations in recovering addict populations (Preston & Jasinski, 1991). Although these findings indicate that the hedonic effects of abused drugs play a role in the development of drug dependence, the determination of the precise role of these effects in this process remains a challenge. The study of the hedonic effects of stimuli in animals requires less direct approaches than are used for human subjects because hedonic value of any stimulus in an animal is based on inference. Measuring response rates for rewards delivered under schedules of reinforcement has been one approach to assessing the extent of reward produced by these stimuli. The progressive ratio schedule, which allows the determination of a break point, that is, the point of maximal response to obtain a reward (Roberts, Loh, & Vickers, 1989), is an example of such a schedule. Rates of response remain an indirect measure of reward value because responding may be driven in some cases by negative reinforcement, such as withdrawal symptoms, or may be depressed by conditions that involve impairment of motor function. In the conditioned place-preference approach, the amount of time that an animal will spend in an area that has been paired with a rewarding stimulus is regarded as another measure of the reward produced by a stimulus. Several factors, however, other than the degree of reward produced by a stimulus can influence conditioned place preference. These include factors that regulate learning, memory, and responsiveness to conditioned cues. 
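The break point on a progressive ratio schedule can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the escalation rule (an exponential series often used in self-administration work, not one stated in this chapter), the function names, and the simplification that a session is summarized only by the total number of responses emitted before the animal stops responding.

```python
import math


def ratio_requirement(n):
    """Responses required to earn the n-th reward (n starts at 1).

    Assumed escalation rule for illustration only; the chapter does not
    specify one. It yields the series 1, 2, 4, 6, 9, 12, 15, 20, ...
    """
    return round(5 * math.exp(0.2 * n)) - 5


def break_point(total_responses):
    """Return (rewards_earned, last_completed_ratio) for one session.

    total_responses is the number of responses the animal emitted before
    it stopped responding (hypothetical session summary).
    """
    earned = 0
    last_ratio = 0
    remaining = total_responses
    while True:
        need = ratio_requirement(earned + 1)
        if remaining < need:
            # The animal quit before completing the next requirement.
            return earned, last_ratio
        remaining -= need
        earned += 1
        last_ratio = need


if __name__ == "__main__":
    # 13 responses complete the ratios 1, 2, 4, and 6.
    print(break_point(13))
```

Under these assumptions, a stimulus that sustains responding through higher ratios yields a higher break point. As the text notes, this remains an indirect index of reward value, because motor impairment or negative reinforcement can also change how long responding persists.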
Measures of level of currents that will maintain responding for brain stimulation offer a more direct means of examining the effects of rewarding stimuli on brain reward systems than do other behavioral measures. This is because responding for brain stimulation reward (BSR) involves responses to direct activation of the brain’s reward systems. Sensitivity to brain stimulation reward, as reflected in the lowering of current thresholds for brain stimulation reward responding, is increased by a variety of commonly abused drugs, such as alcohol, cocaine, and morphine (Kornetsky, 2004), that have hedonic effects in humans. An example of the effects of three of these drugs—heroin, methamphetamine, and nicotine—on brain stimulation reward thresholds is shown in Figure 40.2. Sensitivity to rewarding brain stimulation is measured by determining thresholds for some level of responding for this stimulation. One method of determining thresholds involves determining the current intensity or frequency at which animals will exhibit a half-maximal response for
rewarding stimulation. An alternative method of determining brain stimulation thresholds involves the use of classic psychophysics to generate a rate-independent threshold (Kornetsky, 2004). This approach involves the presentation of different current intensities during discrete trials; the threshold is taken to be the current intensity at which responding is maintained 50% of the time for a certain level of stimulation. The rate-independent method for assessing BSR thresholds is less influenced by the effects of drugs on motor behaviors and thus in some circumstances may more accurately reflect the effects of drugs on reward systems than do rate-dependent methods (Markou & Koob, 1992).
[Figure 40.2 plot: reward threshold (z-score, axis from -4 to 2) versus dose (axis from 0.00 to 1.00) for MAMP, heroin, and nicotine]
Figure 40.2 The effects of heroin, methamphetamine (MAMP), and nicotine on brain stimulation reward thresholds as a function of dose. Note: Current thresholds, determined by the rate-independent method, are expressed as standardized z-scores using performance on saline as the baseline condition. Consequently, a value of zero represents the threshold obtained when saline was administered. A z-score of ±2 or greater represents a significant change from the saline condition, with p < .05. Note that at the highest dose tested heroin raised the reward threshold, reflecting the commonly observed U-shaped dose-response relationship.
Fundamental questions remain unanswered with respect to the neural basis of the hedonic effects of rewarding stimuli. It is not clear to what extent there is overlap in the neuronal networks that produce the hedonic effects associated with the wide variety of types of rewarding stimuli. For example, how similar are the neuronal networks that mediate the hedonic effects of natural rewards such as food compared to those of drugs, or those involved in the hedonic actions of psychomotor stimulants such as cocaine compared to other classes of drugs such as the opioids? Studies using multiple electrodes to detect the firing of individual neurons in the nucleus accumbens, a central structure in the mesolimbic system, indicate that distinct neuron populations encode information about cocaine-related reward compared to natural rewards such as food and water (Carelli & Wondolowski, 2003; Deadwyler,
Hayashizaki, Cheer, & Hampson, 2004). Distinct patterns of discharge have been observed in the nucleus accumbens for the time preceding response for a reward and the period after the reward delivery (Deadwyler et al., 2004). The actual response of the brain to rewarding stimuli most likely involves extensively distributed networks of neurons that consist of, at least, thousands of cells. It is not clear, then, that studies in which the activity of only a few cells is monitored can provide a comprehensive picture of the changes in neuronal activity that occur following exposure to rewarding stimuli. One problem that makes it difficult to identify the neuronal networks involved in the production of the hedonic effects of rewarding stimuli is that many rewarding stimuli have a diverse range of actions other than activation of reward processes. Drugs such as cocaine, for example, in addition to their rewarding effects, can produce anxiety, enhance locomotor activity, and increase arousal levels. One approach to dealing with this problem is to examine the effects of brain stimulation on changes in regional brain activity. The advantages of this approach are, first, that brain stimulation reward can be delivered to discrete brain regions and, second, that the stimulation is presumed to activate regions directly involved in the production of rewarding effects. Changes in neuronal activity result in alterations of glucose metabolism that can be monitored using 2-[14C]deoxyglucose autoradiography. This technique offers the advantage of allowing identification of changes of neuronal activity throughout the brain. Rewarding brain stimulation delivered to the medial forebrain bundle at the level of the lateral hypothalamus resulted in increases in metabolic activity in several discrete brain regions (Porrino, Huston-Lyons, Bain, Sokoloff, & Kornetsky, 1990). These included the nucleus accumbens, olfactory tubercle, lateral septum, medial prefrontal cortex, and VTA. 
The olfactory tubercle appears to be functionally linked to the nucleus accumbens (Ikemoto, 2007). Overall, then, these findings implicate the mesocorticolimbic systems as being activated by rewarding brain stimulation. One limitation of the 2-[14C]deoxyglucose method is that it does not provide spatial resolution down to the cellular level. This limitation is not associated with techniques in which the product of the immediate early gene c-Fos is measured using immunohistochemical techniques. Increased Fos levels are indicative of enhanced neuronal activity. The number of Fos-positive cells has been found to be greater in several brain regions in animals receiving brain stimulation reward delivered to the medial forebrain bundle, the ventral pallidum, and the medial prefrontal cortex. Fos-like immunoreactivity is increased by brain stimulation delivered to the medial forebrain bundle in many of the structures that were also found to be increased
in regional glucose metabolism studies. These structures included the nucleus accumbens shell, medial prefrontal cortex, VTA, and lateral septum (Hunt & McGregor, 1998). Other structures in which Fos-like immunoreactivity increased were the locus coeruleus, bed nucleus of the stria terminalis, and central nucleus of the amygdala. The ventral pallidum, which receives projections from the nucleus accumbens, will also support responding for brain stimulation reward (Panagis et al., 1997). Self-stimulation of this structure has been found to produce elevations in Fos-like immunoreactivity in the medial prefrontal cortex, nucleus accumbens, and posterior lateral hypothalamus (Panagis et al., 1997). A slightly different pattern of increases in Fos-like immunoreactivity has been seen as a consequence of self-stimulation of the medial prefrontal cortex. These increases were found to be located in the prelimbic cortex, cingulate cortex, nucleus accumbens, lateral hypothalamus, amygdala, and anterior portion of the VTA (Arvanitogiannis, Tzschentke, Riscaldino, Wise, & Shizgal, 2000). An alternative approach to establishing which brain regions are involved in the production of rewarding effects is to identify discrete regions of the brain into which animals will microinject pharmacologically active agents. In an early study, for example, it was shown that rats will self-administer heroin directly into the nucleus accumbens, establishing this structure as important in the production of the rewarding effects of μ-opioid receptor agonists (M. E. Olds, 1982). Another example of the infusion mapping approach includes findings that rats also will self-administer amphetamine into the nucleus accumbens (McBride, Murphy, & Ikemoto, 1999), particularly the medial shell of this structure (Ikemoto, Qin, & Liu, 2005).
Amphetamine self-administration by rats into the nucleus accumbens is antagonized by the concurrent infusion of either dopamine D1 or D2 receptor antagonists, suggesting that dopamine receptors mediate the rewarding effects of amphetamine. Rats will also self-administer high concentrations of cocaine into the shell of the nucleus accumbens (Ikemoto et al., 2005). The ventral olfactory tubercle will also support cocaine self-administration (Ikemoto, 2003). Both cocaine and morphine increase the rate of glucose utilization in the olfactory tubercle (Kornetsky, Huston-Lyons, & Porrino, 1991) in rats responding for brain stimulation reward. This suggests that this structure may play an important role in reward processes, at least in rodents. Functional magnetic resonance imaging (fMRI) is another approach used to identify areas in human subjects that may play a role in mediating the hedonic effects of rewarding stimuli. Functional MRI can provide information concerning changes in regional blood flow in the brain that may reflect changes in neuronal activity. For example, the presentation of pleasant images of erotic and romantic
interactions between couples was found to produce increases in activity in the nucleus accumbens and medial prefrontal cortex of healthy subjects (Sabatinelli, Bradley, Lang, Costa, & Versace, 2007). In contrast, unpleasant but arousing and neutral pictures failed to produce this effect. When a single dose (0.6 mg/kg) of cocaine was administered to cocaine-dependent subjects, fMRI signals increased in the nucleus accumbens, putamen, ventral tegmentum, cingulate, prefrontal, and temporal cortices, and several other regions (Breiter et al., 1997). Subjects in this study first experienced sensations of “rush” and “high” that were then followed by feelings of “craving” and “low.” During the repeated self-administration of cocaine by subjects dependent on this drug, self-ratings of the intensity of the drug-induced high were found to correlate with decreased activity in several brain regions, including the nucleus accumbens, frontal cortical areas, and the anterior cingulate (Risinger et al., 2005). These subjects received doses of cocaine of 20 mg/70 kg up to six times. Functional MRI signals were found to be increased in the nucleus accumbens, amygdala, cingulate, and frontal lobes of smokers following the intravenous injection of nicotine (Stein et al., 1998). These changes occurred in association with feelings of “rush” and “high” and a sustained feeling of pleasantness. When administered to healthy volunteers, morphine produced a different pattern of changes in regional brain activity (Becerra, Harter, Gonzalez, & Borsook, 2006). A low dose of morphine that produced mild euphoria in subjects increased activity in the nucleus accumbens, the hippocampus, the hypothalamus, the orbitofrontal cortex, and the putamen. Morphine administration also resulted in the decreased activation of several cortical structures, including the dorsolateral prefrontal cortex, the temporal lobe, and the anterior cingulate.
Overall, the results of these mapping approaches demonstrate that mesocorticolimbic structures play a key role in the processing of rewarding stimuli. While dopamine in the mesocorticolimbic systems has been regarded as a key neurotransmitter in the processing of rewarding stimuli, clearly many other systems are involved because activity within brain neuronal networks is always the product of the interaction of a wide variety of neurotransmitters. In addition to dopaminergic receptors, cholinergic, GABAergic, glutamatergic, opioid, and serotonergic systems have been implicated in regulating the actions of rewarding stimuli. In this chapter we consider the putative roles of these receptor systems in regulating reward processes. Much of our emphasis is on the actions of brain stimulation reward and drugs of abuse on brain receptor systems. Other reviews are available that have a greater focus on the neural basis of the rewarding effects of food (see, e.g., Kelley, Baldo, Pratt, & Will, 2005; Rolls, 2006).
MESOLIMBIC DOPAMINE AND REWARD
The nucleus accumbens may play a key role in the processing of rewarding stimuli. This structure receives information concerning rewarding stimuli from both cortical and limbic structures and appears to integrate this information to regulate reward-related behaviors. The nucleus accumbens receives dense dopaminergic innervation from VTA afferents (see Figure 40.1). It also receives glutamatergic afferents that originate from the prefrontal cortex, amygdala, hippocampus, and thalamus (see Figure 40.3; Groenewegen, Wright, Beijer, & Voorn, 1999). Dopaminergic and glutamatergic systems interact to regulate activity within the nucleus accumbens by influencing a network of medium spiny neurons located in this structure (West, Floresco, Charara, Rosenkranz, & Grace, 2003). This network processes incoming information and sends out efferent projections that release the inhibitory neurotransmitter gamma-aminobutyric acid (GABA) to the ventral pallidum and the VTA. The ventral pallidum provides feedback from the nucleus accumbens to cortical areas via the mediodorsal thalamus (O’Donnell, Lavín, Enquist, Grace, & Card, 1997). Dopaminergic neurons in the mesocorticolimbic systems may serve a number of functions. One may involve the processing of reward-predictive signals. Midbrain dopaminergic neurons exhibit phasic (i.e., short-duration) activation following the presentation of stimuli that predict the availability of a reward (Schultz, 2007). A second possible and related function of dopaminergic neurons is to elicit drug-seeking behavior. Animals trained to self-administer drugs will stop responding if saline is substituted for the drug. Exposure to a priming dose of the drug will reinstate responding. Cocaine acts, in part, by binding to monoamine transporter proteins and thereby blocking the reuptake of dopamine, norepinephrine, and serotonin. Evidence of the involvement of dopamine in reinstatement of responding for cocaine
[Figure 40.3 diagram labels: Thalamus, Prefrontal Cortex, Nucleus Accumbens, Ventral Pallidum, Hippocampus, Amygdala]
Figure 40.3 Simplified schematic of glutamatergic neuron projections to the nucleus accumbens from cortical and limbic structures (arrows). Note: Dashed lines indicate nonglutamatergic connections between structures.
includes the finding that responding on a lever previously associated with cocaine administration is reinstated by the administration of dopamine transporter (DAT) but not norepinephrine transporter (NET) or serotonin transporter (SERT) protein inhibitors (Schmidt & Pierce, 2006). Similar effects are produced by the infusion of selective dopamine D1 and D2 receptor agonists into the shell of the nucleus accumbens (Schmidt, Anderson, & Pierce, 2006). A third function of mesocorticolimbic dopaminergic neurons may be the modulation and possibly the direct activation of reward systems resulting in the production of hedonic effects. An increase in the extracellular concentrations of dopamine in the nucleus accumbens has been observed following exposure to a wide variety of rewarding stimuli (see Table 40.1). These elevations occur over time spans of minutes to tens of minutes that may produce changes in the tonic activity of neurons in the accumbens. The administration of amphetamine, cocaine, and morphine has been found to produce more pronounced increases in dopamine extracellular concentrations in the shell as compared to the core of the nucleus accumbens (Pontieri, Tanda, & Di Chiara, 1995). The chronic self-administration of cocaine (Lecca, Cacciapaglia, Valentini, Acquas, & Di Chiara, 2007), heroin (Lecca, Valentini, Cacciapaglia, Acquas, & Di Chiara, 2007), or nicotine (Lecca et al., 2006) preferentially increased extracellular dopamine levels in the shell as compared to the core of the nucleus accumbens. Clinical studies implicate drug-induced elevation in ventral striatal dopamine levels in the production of drug-related hedonic effects. Dopamine release in the human brain can be assessed by measuring the displacement of highly selective dopamine ligands produced by the administration of indirect dopamine agonists such as amphetamine
Table 40.1 Examples of rewards that produce sustained elevations in nucleus accumbens extracellular dopamine concentrations during consummatory or self-administration periods
Reward               Class                                Reference
Food                 Natural                              Martel & Fantino (1996)
Sucrose              Natural                              Hajnal et al. (2004)
Copulation           Natural                              Fiorino et al. (1997)
Brain stimulation    Artificial stimulant                 Hernandez et al. (2006)
Amphetamine          Psychomotor stimulant                Ranaldi et al. (1999)
Cocaine              Psychomotor stimulant                Bradberry et al. (2000)
Nicotine             Nicotinic agonist                    Lecca et al. (2006)
Heroin               Opioid agonist                       Lecca et al. (2007)
Ethanol              Positive GABAA receptor modulator    Doyon et al. (2003)
or cocaine that increase extracellular dopamine levels. Such displacement can be detected by labeling drugs with high affinities for a particular dopamine receptor subtype with a radioactive isotope such as carbon 11 (11C). The concentration of this radio-labeled drug in the brain can be measured using positron emission tomography (PET). Displacement of the dopamine D2 receptor selective agent raclopride in the striatum is produced by the intravenous administration of the psychomotor stimulant methylphenidate to healthy subjects (Volkow et al., 1999). The magnitude of this displacement was found to correlate with increased ratings of sensations of “high” and “rush.” Similarly, amphetamine-induced displacement of raclopride in the ventral striatum was found to correlate with increases in ratings of euphoric feelings in healthy subjects (Drevets et al., 2001; Martinez et al., 2003; Oswald et al., 2005). Reductions in the binding of raclopride in the ventral striatum have also been found to correlate with hedonic responses to nicotine delivered in cigarette smoke (Barrett, Boileau, Okker, Pihl, & Dagher, 2004). The administration of a variety of abused drugs and selective dopamine and opioid agonists has been found to lower thresholds for rewarding brain stimulation delivered to the medial forebrain bundle (see Table 40.2). This suggests that agents that either promote dopamine release into the nucleus accumbens or directly stimulate dopamine receptors act to enhance the sensitivity of the brain to rewarding stimuli. This link is supported by the observations that the infusion of amphetamine directly into the nucleus accumbens results in a lowering of brain stimulation reward thresholds (Knapp, Lee, Foye, Ciraulo, & Kornetsky, 2001) and in enhanced rates of responding for brain stimulation reward (Broekkamp, Pijnenburg, Cools, & Van Rossum, 1975). 
Perhaps the clearest evidence that dopaminergic receptor stimulation results in rewarding effects is that administration of either the direct dopamine D2 receptor agonist bromocriptine (Knapp & Kornetsky, 1994; Steiner, Katz, & Carroll, 1980) or the selective DAT inhibitor GBR 12909 (Baldo, Jain, Veraldi, Koob, & Markou, 1999; Maldonado-Irizarry, Stellar, & Kelley, 1994) enhances the effects of rewarding brain stimulation. Also, the selective DAT inhibitor GBR 12783 will produce conditioned place preference (Le Pen, Duterte-Boucher, & Costentin, 1996). Finally, rats and monkeys (Wise, Murray, & Bozarth, 1990; Woolverton, Goldberg, & Ginos, 1984) will self-administer bromocriptine, and rhesus monkeys will respond for GBR 12909 (Stafford, LeSage, Rice, & Glowa, 2001; Wojnicki & Glowa, 1996), which is also consistent with these agents having rewarding actions. The administration of dopamine receptor antagonists would be expected to attenuate the effects of rewarding
Table 40.2 Drugs that produce lowering of brain stimulation reward thresholds
Drug             Class                            Reference
Amphetamine      Psychomotor stimulant            Kornetsky & Esposito (1979)
Cocaine          Psychomotor stimulant            Gill et al. (2004)
MDMA             Psychomotor stimulant            Hubner et al. (1988)
Bromocriptine    D2 receptor agonist              Knapp & Kornetsky (1994)
GBR 12909        Selective DAT inhibitor          Baldo et al. (1999)
Buprenorphine    Opioid agonist                   Hubner & Kornetsky (1988)
Heroin           Opioid agonist                   Hubner & Kornetsky (1992)
Morphine         Opioid agonist                   Esposito & Kornetsky (1977)
DAMGO            Mu-receptor opioid agonist       Duvauchelle, Fleming, & Kornetsky (1997)
DPDPE            Delta-receptor opioid agonist    Duvauchelle et al. (1997)
Nicotine         Nicotinic agonist                Huston-Lyons, Sarkar, & Kornetsky (1993)
Ethanol          Positive GABAA modulator         Kornetsky, Moolten, & Bain (1991)
brain stimulation if dopamine receptor systems acted to positively modulate these effects. Experimental results concerning the actions of dopamine receptor antagonist administration on responding for brain stimulation reward may be influenced by the nonspecific effects of these agents on cognitive and motor performance. Treatment with the dopamine D2 selective antagonist pimozide, however, was found to increase reward threshold levels at doses that did not influence attentional processes (Bird & Kornetsky, 1990). The administration of the dopamine receptor antagonist haloperidol elevated reward thresholds for BSR at doses that did not influence a measure of motor performance, that is, latency of response (Esposito, Faulkner, & Kornetsky, 1979). Similar effects were produced by the administration of the selective dopamine D1 receptor antagonist SCH 23390. It has not, however, always been possible to separate blockade of rewarding stimulation from impairment of motor effects. Administration of the D2 selective antagonist raclopride, for example, was shown to increase response latencies as it elevated brain stimulation reward thresholds (Baldo, Jain, Veraldi, Koob, & Markou, 1999). The infusion of dopamine antagonists into discrete brain areas would be expected to produce less nonselective disruption of responding for rewarding brain stimulation than does systemic administration of these drugs. Microinjection
of the dopamine antagonist cis-flupenthixol into the nucleus accumbens attenuated the effects of brain stimulation reward (Stellar & Corbett, 1989). Infusion of the dopamine D1 receptor antagonist SCH 23390 into the nucleus accumbens decreased rates of responding for rewarding brain stimulation, but administration of the D2 receptor selective antagonist raclopride did not produce a significant effect (Cheer et al., 2007). Interestingly, microinjection of SCH 23390 into the nucleus accumbens elevated brain stimulation thresholds; in contrast, thresholds were lowered when this drug was administered into the prefrontal cortex (Duvauchelle, Fleming, & Kornetsky, 1998). Pharmacological agents that block dopamine receptors can be considered to have only relative selectivity for the different dopamine receptor subtypes. Attempts have been made to address this problem by using animals in which the genes that express the proteins needed to form a particular type of receptor have been deleted or rendered inactive. This typically involves the use of transgenic mice. A transgenic mouse is one that carries a foreign gene that has been inserted into its genome. The foreign gene is constructed using recombinant DNA. In knockout mice the replacement gene (or null gene) is nonfunctional. In homozygous mice, which receive the null gene from both parents, the expression of a specific protein may be completely absent. In heterozygous animals, in which the null gene is received from only one parent, the level of expression of the protein is greatly reduced. Wild-type animals have parents that are both nontransgenic. Selective deletion of specific dopamine receptor proteins in knockout mice allows for the assessment of brain stimulation reward in animals in which the expression of specific dopamine receptor subtypes has been blocked. 
One group of researchers found that thresholds for brain stimulation were elevated in dopamine D1 receptor knockout mice compared to thresholds obtained for wild-type mice (Tran et al., 2005). This finding implicates dopamine D1 receptors in the regulation of responses to brain stimulation reward. In contrast, this same group of investigators reported that responding for brain stimulation reward was unaltered in dopamine D2 receptor knockout mice compared to wild-type animals (Tran et al., 2002). This suggests that dopamine D2 receptors do not play an essential role in the maintenance of baseline levels of responding for brain stimulation. Other researchers, however, have found that significantly higher levels of current intensity are needed to maintain responding for brain stimulation in D2 receptor knockout mice than are required for wild-type mice (Elmer et al., 2005). Knockout mice have been used to examine the role played by dopamine receptor systems in mediating the
rewarding effects of drugs of abuse. Dopamine D2 receptor knockout mice will self-administer cocaine (Caine et al., 2002). At high doses of cocaine, intake by these animals is higher than that of wild-type mice. These results indicate that dopamine D2 receptors may not be essential for mediating the rewarding effects of cocaine. Cocaine-induced conditioned place preference was not found to differ significantly in either dopamine D1 receptor or dopamine D3 receptor knockout mice compared to that seen in wild-type animals (Karasinska, George, Cheng, & O’Dowd, 2005). Dopamine D1 receptor knockout mice, however, failed to reliably self-administer cocaine (Caine et al., 2007). Overall these studies tentatively suggest that the dopamine D2 receptor may not be essential for the induction of cocaine’s rewarding actions, whereas D1 receptors might play a critical role in these actions. If the rewarding effects of cocaine are related to increases in extracellular levels of dopamine in the brain, then the DAT protein would be expected to be the site of action at which this drug would bind to produce these increases. However, DAT knockout mice will self-administer cocaine (Rocha et al., 1998). Cocaine administration also has been found to induce conditioned place preference in DAT knockout mice (Sora et al., 1998). These results are not consistent with a role for DAT in mediating the rewarding effects of cocaine. There is evidence, however, that the regulation by neurotransmitters of reward system function is altered in DAT knockout mice. Most notably, nisoxetine, a NET inhibitor, and fluoxetine, a SERT inhibitor, both produce conditioned place preference in DAT knockout mice, although they do not have similar actions in wild-type animals (Hall et al., 2002). 
In an attempt to circumvent the problems associated with neuronal development in animals that never express DAT, one group of researchers has developed a strain of DAT knockin mice in which DAT that has a low affinity for cocaine is expressed (R. Chen et al., 2006). The DAT expressed in these animals remains functional with respect to the transport of dopamine, but it does not interact with cocaine. In the DAT knockin mice cocaine administration failed to produce conditioned place preference. This result supports the idea that DAT is involved in mediating the rewarding effects of cocaine in animals in which this transporter protein remains functional. In clinical studies the role of dopamine receptors in mediating the hedonic effects of commonly abused stimulants has been assessed by examining how these effects are modified by the administration of dopamine receptor antagonists. In one study, an injection of the dopamine antagonist haloperidol had no effect on the sensation of “rush” produced by intravenous cocaine, but ratings of “good” feelings and “high” were significantly diminished (Sherer,
Kumor, & Jaffe, 1989) by administration of this drug. In another study, haloperidol administration blocked euphoria induced by the stimulant methylphenidate in manic-depressive subjects (Wald, Ebstein, & Belmaker, 1978). Ratings of “high” and “good effects” produced by cocaine were reduced by concurrent treatment with the dopamine D1 antagonist ecopipam (Romach et al., 1999). Administration of the atypical neuroleptic clozapine to cocaine-dependent individuals reduced cocaine-induced increases in feelings of “high” and “rush” (Farren et al., 2000). Haloperidol administration also was found to decrease the euphoric and stimulant effects of ethanol in social drinkers, suggesting that dopamine antagonists can block the hedonic effects of abused drugs that are not stimulants (Enggasser & de Wit, 2001). Not all findings are consistent with the notion that the administration of dopamine antagonists will block the hedonic effects of drugs of abuse. Amphetamine-induced euphoria was not blocked by administration of the dopamine D2 antagonist pimozide (Brauer & de Wit, 1997). Administration of either haloperidol or the atypical neuroleptic risperidone did not alter the subjective response of healthy volunteers to methamphetamine (Wachtel, Ortengren, & de Wit, 2002). Findings from a few studies suggest that dopamine receptor antagonists may not block the hedonic effects of nicotine. The positive subjective effects produced by nicotine have not been reduced by the administration of either ecopipam (Chausmer, Smith, Kelly, & Griffiths, 2003) or haloperidol (Walker, Mahoney, Ilivitsky, & Knott, 2001). Whether differences in factors such as drug doses used can explain the discrepancies among studies concerning the effects of dopamine antagonists on drug-induced hedonic effects remains to be determined.
GABA AND REWARD
GABA may act on both GABAA and GABAB receptors within the brain to regulate the effects of rewarding stimuli. 
The ventral pallidum is a subcortical structure that receives GABAergic input from the nucleus accumbens. This structure sends inhibitory GABAergic projections into the VTA (Wu, Hrycyshyn, & Brudzynski, 1996). GABA released within the VTA may act on GABAA receptors located on interneurons in the VTA. This may result in an inhibition of release of GABA from these interneurons. GABA released from interneurons in the VTA interacts with GABAB receptors located on dopamine neurons to inhibit the activity of dopaminergic neurons (Kalivas, Duffy, & Eberhardt, 1990). Thus, activation of GABAA receptors within the VTA may lead to a disinhibition of dopaminergic activity by indirectly preventing the stimulation of GABAB receptors located on dopamine neurons.
There also appear to be extensive interactions between dopaminergic and GABAergic systems within the structures that regulate reward in the nucleus accumbens (Geldwert et al., 2006). Coinfusion of either bicuculline, a GABAA receptor antagonist, or the GABAB antagonist phaclofen with the selective DAT inhibitor GBR 12909 leads to increases in dopamine concentrations within the nucleus accumbens significantly above those seen with GBR 12909 alone (Rahman & McBride, 2002). These results suggest that the GABA receptors regulate the release of dopamine within the nucleus accumbens. Rats will self-administer the GABAA receptor agonist muscimol into the VTA, suggesting that stimulation of GABAA receptors in this structure produces rewarding effects (Ikemoto, Murphy, & McBride, 1998). The finding that muscimol injection into the VTA can result in conditioned place preference is consistent with this idea (Laviolette & van der Kooy, 2001). Place preference resulting from intra-VTA muscimol administration was blocked by treatment with the dopamine antagonist α-flupenthixol (Laviolette & van der Kooy, 2001), indicating that muscimol’s rewarding actions may be linked to dopaminergic-related reward processes. The infusion of muscimol into the rostral portion of the VTA has been found to increase break points for cocaine delivered under a progressive ratio schedule (D. Y. Lee et al., 2007). This is consistent with muscimol activating the same reward pathways as does cocaine. This activation may result from stimulation of GABAA receptors located on interneurons in the VTA that, in turn, leads to inhibitory effects that block the release of GABA from these neurons (Kalivas et al., 1990). GABA is metabolized in the brain by the enzyme GABA transaminase. Inhibition of this enzyme can be produced by administration of the GABA transaminase inhibitor vigabatrin, resulting in marked elevations in brain GABA levels. 
Treatment with vigabatrin results in the blockade of increases in nucleus accumbens dopamine levels that are produced by the administration of methamphetamine, heroin, or ethanol (Gerasimov et al., 1999). Vigabatrin, then, would be expected to block the rewarding effects of many commonly abused drugs by suppressing the increased release of dopamine that would otherwise occur when they are administered. This drug has been found to decrease the self-administration of cocaine, ethanol, and morphine (Buckett, 1981; Stromberg, Mackler, Volpicelli, O’Brien, & Dewey, 2001) and block heroin-induced place preference (Paul, Dewey, Gardner, Brodie, & Ashby, 2001). Break points for cocaine self-administration are decreased by concurrent treatment with vigabatrin, indicating a reduction in cocaine’s rewarding effects (Kushner, Dewey, & Kornetsky, 1999). In addition, the administration of vigabatrin
blocks cocaine-induced lowering of brain stimulation reward thresholds (Kushner, Dewey, & Kornetsky, 1997). Vigabatrin administration also decreases responding for food, which has led some investigators to question the specificity of its effects (Barrett, Negus, Mello, & Caine, 2005). Both the GABAB receptor agonist baclofen and the positive modulator of GABAB activity GS39783 will attenuate cocaine-induced enhancement of animals’ sensitivity to brain stimulation reward (Slattery, Markou, Froestl, & Cryan, 2005). When administered alone at a higher dose baclofen will significantly elevate brain stimulation reward thresholds. These results are in accord with the view that GABAB receptors may act to inhibit the rewarding effects of both brain stimulation and cocaine. This may be related to inhibition of mesolimbic dopamine release produced by the activation of GABAB receptors located on dopaminergic neurons. Cocaine self-administration is inhibited by the systemic administration of the GABAA agonist muscimol or the GABAB receptor agonist baclofen (Barrett et al., 2005). These findings suggest that both the GABAA and GABAB receptors may have inhibitory actions on the rewarding effects of cocaine. This finding also indicates that systemically administered muscimol may have actions distinct from those caused by the infusion of this drug into the VTA. It should also be noted that baclofen and muscimol decreased cocaine self-administration only at doses that also decreased food-maintained responding, which raises the question of whether these GABA agonists when administered systemically selectively act on cocaine-induced reward (Barrett et al., 2005). These agents also either may have nonselective effects on lever pressing or may alter the rewarding effects of food (Barrett et al., 2005). Both barbiturates and benzodiazepine sedative-hypnotics act at the GABAA receptor complex to enhance the effects of GABA on this complex. 
Clinical studies indicate that the administration of barbiturates, including pentobarbital (Carter, Richards, Mintzer, & Griffiths, 2006) and butabarbital (Zawertailo, Busto, Kaplan, Greenblatt, & Sellers, 2003), produces elevations in ratings of drug liking and other measures of pleasant drug effects in subjects with a history of recreational sedative use. Similar effects are seen following the administration of benzodiazepines to sedative users (Carter et al., 2006; Zawertailo et al., 2003), abstinent alcoholics (Ciraulo et al., 1997), and children of alcoholics (Ciraulo, Barnhill, Ciraulo, Greenblatt, & Shader, 1989; Ciraulo et al., 1996). Thus, sedative-hypnotics that enhance the activity of GABAA receptors produce rewarding actions in human subjects. Treatment with pentobarbital, however, did not result in alterations in brain stimulation reward thresholds (Kornetsky, 2004). When administered
systemically the benzodiazepines diazepam (Invernizzi, Pozzi, & Samanin, 1991) and midazolam (Finlay, Damsma, & Fibiger, 1992) both decreased dopamine concentrations in the nucleus accumbens. These results suggest that the hedonic effects of barbiturates and benzodiazepines may result from dopamine-independent processes.
OPIOID SYSTEMS AND REWARD
There are three types of opioid receptors: the μ, δ, and κ receptors (De Vries & Shippenberg, 2002). Most commonly used opioid analgesics act at the μ-receptor, producing elevation of mood and euphoria in many individuals. A few of the available opioid analgesics act by stimulating κ-receptors, but these drugs may sometimes produce unpleasant (dysphoric) feelings. Endogenous opioids known as the enkephalins activate δ-receptors. The administration of opioid drugs with μ-opioid receptor agonist effects can result in lowering brain stimulation reward thresholds (see Table 40.2). In contrast, the administration of the selective opioid κ-receptor agonist U-69593 may elevate brain stimulation reward thresholds (Todtenkopf, Marcus, Portoghese, & Carlezon, 2004). The κ-receptor agonist ethylketocyclazocine, on the other hand, has been found to have no effect on these thresholds (Unterwald, Sasson, & Kornetsky, 1987). Several findings indicate that changes in dopaminergic activity can influence the effects of μ-opioid receptor agonists on brain stimulation thresholds, suggesting that this neurotransmitter may be a mediator of the effects of these agents on reward systems. Injection of morphine into the VTA enhances the effects of brain stimulation reward (Broekkamp & Phillips, 1979). This action may result from morphine-induced increases in the firing of dopaminergic cells within the VTA (Gysling & Wang, 1983; Kiyatkin & Rebec, 1997). Opioid-induced increases in VTA dopamine cell firing may account for the increase in dopamine efflux in the nucleus accumbens that occurs during the administration of either morphine (Pontieri et al., 1995) or heroin (Lecca et al., 2007). The increase in firing produced by opioid administration may result from the suppression of GABA release from interneurons that act to inhibit dopaminergic neuronal activity in the VTA (Bergevin, Girardot, Bourque, & Trudeau, 2002; Kalivas et al., 1990; Klitenick, DeWitte, & Kalivas, 1992). 
Evidence of this inhibitory effect includes the finding that the firing rate of GABA neurons in the VTA is reduced following the self-administration of heroin (Steffensen et al., 2006). The infusion of vigabatrin into either the VTA or ventral pallidum resulted in the suppression of heroin self-administration (Xi & Stein, 2000).
Systemic administration of the GABAB receptor antagonist 2-OH-saclofen antagonized the inhibitory effects of vigabatrin administered into the VTA on heroin self-administration (Xi & Stein, 2000). This finding suggests that the activation of GABAB receptors in the VTA may antagonize the rewarding effects of opioid agonists. Evidence of the involvement of the dopaminergic systems in the interaction between opioids and brain stimulation reward includes the finding that low doses of apomorphine, which may act presynaptically to block dopamine release, block morphine-induced lowering of brain stimulation reward thresholds (Knapp & Kornetsky, 1996). Also, the systemic administration of the nonselective dopamine antagonist cis-flupenthixol blocked the stimulation reward threshold lowering effects produced by the microinjection into the accumbens of either the μ-opioid receptor agonist DAMGO or the δ-opioid agonist DPDPE (Duvauchelle, Fleming, & Kornetsky, 1997). Finally, in dopamine D2 receptor knockout mice, morphine administration produces an elevation as opposed to a decrease in frequency thresholds for rewarding brain stimulation, suggesting that the rewarding effects of morphine do not occur in these animals (Elmer et al., 2005). In a complementary self-administration study, dopamine D2 receptor knockout mice did not show greater responses for morphine than they did for saline (Elmer et al., 2002). This result is consistent with the notion that dopamine D2 receptor systems are needed for the production of the rewarding actions of opioids. There are some findings that do not support the idea that dopamine is needed for the production of the rewarding effects of opioids. Heroin self-administration has been found to persist in animals in which dopamine terminals in the nucleus accumbens have been destroyed using 6-hydroxydopamine (Pettit, Ettenberg, Bloom, & Koob, 1984). 
Injection of the dopamine antagonist α-flupenthixol into the nucleus accumbens failed to block morphine-induced conditioned place preference in opioid-naive rats (Laviolette, Nader, & van der Kooy, 2002). Morphine-induced conditioned place preference also is produced in dopamine-deficient mutant mice (Hnasko, Sotak, & Palmiter, 2005). At present it is hard to reconcile the inconsistent evidence concerning the role of dopamine in the production of the rewarding effects of opioids. One likely possibility is that opioid-induced reward may be mediated by both dopamine-dependent and dopamine-independent mechanisms. Several studies implicate CB1 cannabinoid receptor systems in the regulation of the rewarding effects of opioids. Administration of the CB1 receptor antagonist SR141716A produces a decrease in the break points for heroin delivered under a progressive ratio schedule
(Caillé & Parsons, 2003; De Vries, Homberg, Binnekade, Raasø, & Schoffelmeer, 2003). The administration of this antagonist also blocked morphine-induced place preference (Mas-Nieto et al., 2001; Navarro et al., 2001). Other evidence of involvement of CB1 receptors in modulating the rewarding effects of opioids includes findings that morphine self-administration is not seen in CB1 receptor knockout mice (Cossu et al., 2001). Acquisition of morphine-induced place preference may be blocked in these animals (Martin, Ledent, Parmentier, Maldonado, & Valverde, 2000), although this has not been a consistent finding (Rice, Gordon, & Gifford, 2002). The lack of rewarding effects of morphine in CB1 receptor knockout mice may be related to a reduction of morphine-induced accumbens dopamine release in these animals (Mascia et al., 1999). The threshold-lowering effects of several psychomotor stimulants, including cocaine (Bain & Kornetsky, 1987), d-amphetamine (Esposito, Perry, & Kornetsky, 1980), amfonelic acid (Knapp & Kornetsky, 1989), and 3,4-methylenedioxymethamphetamine (Hubner, Bird, Rassnick, & Kornetsky, 1988), are attenuated by the concurrent administration of high doses of the opioid antagonist naloxone. Treatment with naloxone will significantly decrease response rates for the self-administration of cocaine (Kiyatkin & Brown, 2003). When microinjected into the ventral pallidum, naloxone also blocks cocaine-induced place preference (Skoubis & Maidment, 2003). These results suggest that endogenous opioid peptides may play a role in the modulation of the rewarding actions of psychomotor stimulants. The μ-opioid receptor has been implicated as the opioid receptor subtype that regulates drug-induced reward. Evidence of this includes the finding that intracerebroventricular infusion of the selective μ-receptor antagonist CTAP prevents the development of cocaine-induced conditioned place preference (Schroeder et al., 2007). 
Selective deletion of the OPRM1 (i.e., the μ-opioid receptor) gene from mice results in the failure of cocaine to produce conditioned place preference (Becker et al., 2002; Hall, Goeb, Li, Sora, & Uhl, 2004). The finding that ethanol reward and ethanol-induced place preference are attenuated in OPRM1 knockout mice suggests that this opioid receptor is also involved in mediating the rewarding effects of alcohol (Hall, Sora, & Uhl, 2001). Several studies have examined the interaction between the opioid antagonist naltrexone and commonly abused drugs on the subjective response of human subjects to these agents. Treatment with naltrexone decreased ratings of “high” but not ratings of “like the drug” in healthy volunteers challenged with a dose of amphetamine (Jayaram-Lindström, Wennberg, Hurd, & Franck, 2004). Ratings
of cocaine’s “good effects” were decreased by naltrexone administration in crack cocaine users (Sofuoglu et al., 2003). In contrast, in one study with subjects with a history of cocaine and heroin use, naltrexone treatment had no significant effect on ratings of either cocaine-associated “high,” “good effects,” or “liking” (Walsh, Sullivan, Preston, Garner, & Bigelow, 1996). The administration of opioid antagonists may attenuate the hedonic effects of rewarding substances other than cocaine, including those of food. Administration of naltrexone reduced ratings of the pleasantness of food (M. R. Yeomans & Gray, 1996, 1997). The euphoria-inducing effects of nicotine gum in smokers were blocked by pretreatment with naltrexone (Knott & Fisher, 2007). Treatment with naltrexone reduced ratings by alcoholic subjects of ethanol-induced “high” (Volpicelli, Watson, King, Sherman, & O’Brien, 1995) and ratings by heavy drinkers of alcohol “liking” (McCaul, Wand, Eissenberg, Rohde, & Cheskin, 2000). In alcoholic individuals the initial stimulatory effects of ethanol are decreased by the administration of either naltrexone or the opioid antagonist nalmefene (Drobes, Anton, Thomas, & Voronin, 2004).
SEROTONIN AND NOREP