ADVANCES IN PROTEIN CHEMISTRY Volume 51
Linkage Thermodynamics of Macromolecular Interactions
JEFFRIES WYMAN (1901-1995)
ADVANCES IN PROTEIN CHEMISTRY
EDITED BY
FREDERIC M. RICHARDS, Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut
DAVID S. EISENBERG, Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, California
PETER S. KIM, Department of Biology, Massachusetts Institute of Technology, Whitehead Institute for Biomedical Research, Howard Hughes Medical Institute Research Laboratories, Cambridge, Massachusetts
VOLUME 51
Linkage Thermodynamics of Macromolecular Interactions
EDITED BY
ENRICO DI CERA, Department of Biochemistry and Molecular Biophysics, Washington University Medical School, St. Louis, Missouri
ACADEMIC PRESS San Diego London Boston New York Sydney Tokyo Toronto
Frontispiece reprinted from A History of Biochemistry 36 (1985), pages 99-190, with kind permission of Elsevier Science - NL, Sara Burgerhartstraat 25, 1055KV Amsterdam, The Netherlands.
This book is printed on acid-free paper. Copyright © 1998 by ACADEMIC PRESS. All Rights Reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the Publisher. The appearance of the code at the bottom of the first page of a chapter in this book indicates the Publisher’s consent that copies of the chapter may be made for personal or internal use of specific clients. This consent is given on the condition, however, that the copier pay the stated per copy fee through the Copyright Clearance Center, Inc. (222 Rosewood Drive, Danvers, Massachusetts 01923), for copying beyond that permitted by Sections 107 or 108 of the U.S. Copyright Law. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Copy fees for pre-1997 chapters are as shown on the title pages. If no fee code appears on the title page, the copy fee is the same as for current chapters. 0065-3233/98 $25.00
Academic Press, a division of Harcourt Brace & Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.academicpress.com
Academic Press Limited
24-28 Oval Road, London NW1 7DX, UK
http://www.hbuk.co.uk/ap/
International Standard Book Number: 0-12-034251-0
PRINTED IN THE UNITED STATES OF AMERICA
98 99 00 01 02 03 MM 9 8 7 6 5 4 3 2 1
CONTENTS

INTRODUCTION

Electrostatic Contributions to Molecular Free Energies in Solution
MICHAEL SCHAEFER, HERMAN W. T. VAN VLIJMEN, AND MARTIN KARPLUS
I. Introduction
II. Theory and Calculational Methods
III. Applications
IV. Outlook
References

Site-Specific Analysis of Mutational Effects in Proteins
ENRICO DI CERA
I. Introduction
II. The Reference Cycle
III. Structural Mapping of Energetics
IV. Site-Specific Analysis of Mutational Effects in Proteins
V. Site-Specific Dissection of Thrombin Specificity
VI. Concluding Remarks
References

Allosteric Transitions of the Acetylcholine Receptor
STUART J. EDELSTEIN AND JEAN-PIERRE CHANGEUX
I. Introduction
II. Mechanistic Models
III. Recovery from Desensitization
IV. Kinetic Basis of Dose-Response Curves
V. Multiple Phenotypes
VI. Deductions from Single-Channel Measurements
VII. Allosteric Effectors and Coincidence Detection
VIII. General Considerations
References

Deciphering the Molecular Code of Hemoglobin Allostery
GARY K. ACKERS
I. Introduction
II. Overview
III. Binding Curves and Stoichiometric Information
IV. Site-Specific Aspects of Oxygen Binding
V. Experimental Determination of Site-Specific Cooperativity Terms
VI. How the Molecular Code Was Deciphered
VII. Concluding Remarks
References

Statistical Thermodynamic Linkage between Conformational and Binding Equilibria
ERNESTO FREIRE
I. Introduction
II. The Most Probable Distribution
III. Coupling of Statistical Weights to Ligands
IV. Modulation of Distribution of States by Specific Ligands
V. Modulation of Distribution of States by Denaturants
VI. Ligand-Induced Conformational Changes
VII. The Distribution of Conformational States According to Their Gibbs Energy
VIII. Is the Unfolded State the State with the Highest Gibbs Energy?
IX. The Gibbs Energy Scale of Conformational States
X. Statistical Descriptors of the Conformational Ensemble
XI. Conclusions
References

Analysis of Effects of Salts and Uncharged Solutes on Protein and Nucleic Acid Equilibria and Processes: A Practical Guide to Recognizing and Interpreting Polyelectrolyte Effects, Hofmeister Effects, and Osmotic Effects of Salts
M. THOMAS RECORD, JR., WENTAO ZHANG, AND CHARLES F. ANDERSON
I. Introduction
II. Overview of Concentration-Dependent Effects of Perturbing Solutes on Processes Involving Biopolymers
III. Preferential Interaction Coefficients as Fundamental Measures of Thermodynamic Effects due to Solute-Biopolymer Interactions
IV. Preferential Interactions of Nonelectrolyte Molecules with an Uncharged Biopolymer
V. Preferential Interactions of Electrolyte Ions with a Charged Biopolymer
VI. Use of Three-Component Preferential Interaction Coefficients to Analyze Effects of Solute Concentration on Equilibrium Constants, Transition Temperatures, or Free Energy Changes of Biopolymer Processes
VII. Two-Domain Predictions of Functional Forms of Effects of Nonelectrolyte Concentration on Equilibria (Kobs) and Transition Temperatures (Tm) of Uncharged Biopolymers in Aqueous Solution
VIII. Polyelectrolyte and Two-Domain Predictions of Functional Forms of Effects of Salt Concentration on Equilibria (Kobs) and Transition Temperatures (Tm) of Charged Biopolymers in Aqueous Solution
IX. Conclusions and Future Directions
References

Control of Protein Stability and Reactions by Weakly Interacting Cosolvents: The Simplicity of the Complicated
SERGE N. TIMASHEFF
I. Introduction
II. Preferential Interactions
III. Wyman Linkages in Preferential Interactions
IV. Linkage Control of Protein Stability
V. Linkage Control of Protein Reactions
VI. Sources of Exclusion
VII. Osmolytes
VIII. Conclusion
References

AUTHOR INDEX
SUBJECT INDEX
INTRODUCTION
In a paper published fifty years ago in Advances in Protein Chemistry [(1948) Adv. Protein Chem. 4, 407-531], Jeffries Wyman laid the foundations for the theory of linked functions and brought the rigor of Gibbs’ thermodynamics to biochemistry. Wyman was the first to realize and correctly formulate the reciprocity principle between effects elicited by two ligands on each other’s binding. Linkage is the inevitable consequence of the first two laws of thermodynamics and permeates every aspect of macromolecular function. Through the principle of linked functions, Wyman captured the essential property of proteins as macromolecular transducers of different effects: a simple property that lies at the very foundations of biological complexity. On page 198 of the third edition of “Molecular Biology of the Cell,” Alberts, Bray, Lewis, Raff, Roberts, and Watson write, “The [linkage] relationships . . . underlie all of cell biology. They seem so obvious in retrospect that we now take them for granted.” A fundamental consequence of the principle of linked functions has become more widely known than the principle itself. In an attempt to explain the mechanism of the Bohr effect of hemoglobin, i.e., how oxygen binding is influenced by proton binding, Wyman and Allen [(1951) J. Polymer Sci. 7, 499-518] proposed that hemoglobin would exist in different configurations showing different affinities for oxygen and the proton and that the linkage between these ligands would be mediated by a change in configuration involving the hemoglobin molecule as a whole. Fourteen years later, this radically new idea was articulated in the classical Monod-Wyman-Changeux model of allosteric transitions [(1965) J. Mol. Biol. 12, 88-118] and provided a monumental advance to our understanding of how proteins work. Impressive applications of allostery and linkage to many biologically important systems have since appeared in the literature, and a lucid treatment of these basic concepts and their applications was given by Wyman and Gill in the landmark book “Binding and Linkage” (1990, University Science Books, Mill Valley, CA).
The contributions to this volume of Advances in Protein Chemistry document the central role and impact that the theory of linked functions maintains in all disciplines concerned with a quantitative understanding of binding energetics and structure-function relations. Schaefer, van
Vlijmen, and Karplus discuss a new method for the calculation of protein pKa’s and pH-dependent electrostatic free energies in solution based on the theory of linked functions. The method is applied to the calculation of lysozyme pKa’s and the pH dependence of the stability of lysozyme and the capsid of the foot-and-mouth disease virus. Di Cera develops a site-specific analysis of mutational effects in proteins as an extension of the theory of linked functions. The analysis leads to the formulation of a new strategy for dissecting ligand recognition, and an application is given for the case of thrombin-substrate interactions. Important new applications of the theory of linked functions and allostery are given by Edelstein and Changeux for the acetylcholine receptor and by Ackers for hemoglobin. Freire extends the original linkage concept to model the statistical ensemble of conformers accessible to native proteins and discusses the implications of this new approach to the analysis of conformational and binding equilibria. Record, Zhang, and Anderson give a lucid analysis of linked effects arising from salts and uncharged solutes on proteins and nucleic acid equilibria. Their analysis extends the theory of linked functions to encompass the biologically relevant cases in which solutes affect the properties of macromolecules beyond simple mass action effects. Timasheff refines the concept of preferential interactions to bring new light to the effects of weakly interacting cosolvents in macromolecular systems. The exciting contributions to the field of ligand binding and linkage contained in this volume remind us of the timeliness and importance of Wyman’s theory of linked functions, which provides the logical tools to link structural and computational biology to macromolecular energetics. Work done in this field, pioneered by Wyman fifty years ago, will continue to produce important new developments and a deeper understanding of structure-function relations.
ENRICO DI CERA AND JOHN T. EDSALL
ELECTROSTATIC CONTRIBUTIONS TO MOLECULAR FREE ENERGIES IN SOLUTION
By MICHAEL SCHAEFER, HERMAN W. T. VAN VLIJMEN, and MARTIN KARPLUS
Laboratoire de Chimie Biophysique, Institut Le Bel, Université Louis Pasteur, 67000 Strasbourg, France, and Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138
I. Introduction
II. Theory and Calculational Methods
   A. Titration Calculation
   B. Electrostatic Free Energy Difference between Conformations
   C. The Independent Sites Model
   D. Standard, Intrinsic, and Effective pKa
   E. Electrostatic Free Energy of a Protonation State
   F. Electrostatic Free Energy Calculation
   G. Monte Carlo Titration
III. Applications
   A. Protein Dielectric Constant
   B. Variation of Stability of Lysozyme with pH
   C. Capsid Stability of Foot-and-Mouth Disease Virus
IV. Outlook
References
I. INTRODUCTION

Protein titration provides a notable example of the specific binding of multiple ligands (protons) to a macromolecule, the subject of the linked function theory introduced by Wyman (1948) and fully elaborated in the book by Wyman and Gill (1990). It is of particular interest because it is, at present, the only case for which theoretical evaluation of the binding constants is possible. Moreover, every protein has a large number of binding sites for protons, and in certain cases the individual site binding constants, as well as the overall binding (titration) behavior, have been measured (Tanford and Roxby, 1972; Bashford and Karplus, 1990). The protonation of titratable sites in proteins clearly falls into the category of specific binding, i.e., every successive binding of a proton can be expressed as a chemical reaction in which the chemical activity of the protein species and the proton chemical potential are related by the law of mass action (Wyman and Gill, 1990). The theory of linked functions thus provides the theoretical basis for a quantitative treatment
* Present address: Biogen Inc., 12 Cambridge Center, Cambridge, Massachusetts 02142.
ADVANCES IN PROTEIN CHEMISTRY, Vol. 51. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0065-3233/98 $25.00
of protein titration (Wyman, 1964; Szabo and Karplus, 1972; Schellman, 1975). Further, an accurate theoretical understanding of the pKa’s of titratable sites in proteins and of the pH dependence of electrostatic free energies is important because ionizable groups play an essential role in numerous processes of biological interest, e.g., enzyme catalysis (Knowles, 1976; Warshel et al., 1989), proton transport (Warshel, 1979; Zhou et al., 1993), and the stability of molecules and molecular assemblies (Yang and Honig, 1993; Antosiewicz et al., 1994; Schaefer et al., 1997). In this review of protein titration, methods for the calculation of protein pKa’s and pH-dependent electrostatic free energies in solution will be developed on the basis of linked function theory, and applied to the calculation of lysozyme pKa’s, the pH dependence of the stability of native lysozyme, and the pH dependence of the stability of the capsid of foot-and-mouth disease virus. In the latter application, the contributions from individual sites will be addressed, in particular those that are responsible for the acid lability of the capsid. Furthermore, a simplified model for the pH dependence of binding free energies will be derived based on the pKa’s of the system (“independent sites” model).

To formulate the problem, we consider only two protonation states for each site, unprotonated and protonated. This “two-state model” represents an implicit averaging over multiple protonated (unprotonated) states of the anionic (cationic) sites. For example, only one unprotonated state for histidine is considered, in which the proton charge is equally distributed between $H_{\delta 1}$ and $H_{\varepsilon 2}$ (see Table I in Section II,D). The protonation state of a protein with $N$ sites can then be described by a vector $\vec{s}$ with $N$ components, where the component $s_i$ is
$$
s_i = \begin{cases} 0, & \text{if site } i \text{ is unprotonated} \\ 1, & \text{if site } i \text{ is protonated} \end{cases} \tag{1}
$$
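The state vector of Eq. (1) can be made concrete with a short sketch. It is only an illustration of the bookkeeping, not part of the original method; the function names `protonation_states`, `protons_bound`, and `states_with_k_protons` are hypothetical. A system with N two-state sites has 2^N such vectors.

```python
from itertools import product
from math import comb


def protonation_states(n_sites):
    """Return all protonation state vectors for n_sites two-state sites.

    Each state is a tuple of 0 (unprotonated) and 1 (protonated), as in Eq. (1).
    """
    return list(product((0, 1), repeat=n_sites))


def protons_bound(state):
    """Number of protons bound in a state: the sum of its components."""
    return sum(state)


def states_with_k_protons(n_sites, k):
    """Count the distinct states that carry exactly k protons."""
    return sum(1 for s in protonation_states(n_sites) if protons_bound(s) == k)


if __name__ == "__main__":
    states = protonation_states(3)
    print(len(states), "states; reference state:", states[0])
    # The count of states with k protons equals the binomial coefficient.
    print(states_with_k_protons(4, 2), comb(4, 2))
```

Full enumeration grows as 2^N, so this direct approach is only practical for a small number of sites.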
When the fully unprotonated state of the protein is used as the reference, the overall reaction that leads to the protonation state $\vec{s}$ is
$$
\mathcal{P}(\vec{0}) + n(\vec{s})\,\mathrm{H}^+ \rightleftharpoons \mathcal{P}(\vec{s}) \tag{2}
$$
where we have written $\mathcal{P}(\vec{s})$ for the protein in state $\vec{s}$, the fully unprotonated state is denoted $\vec{s} = \vec{0}$, and $n(\vec{s})$ is the number of protons bound in state $\vec{s}$. The equilibrium constant, $a(\vec{s})$, for this reaction is defined as (Wyman and Gill, 1990)

$$
a(\vec{s}) = \frac{[\mathcal{P}(\vec{s})]}{[\mathcal{P}(\vec{0})]\, x^{n(\vec{s})}} \tag{3}
$$

where $x$ is the proton chemical activity, and where the activity coefficients of the protein species are included in the equilibrium constant so that molecular concentrations can be used. The equilibrium constants $a(\vec{s})$ are generally referred to as Adair constants (Wyman and Gill, 1990) after the analysis of Adair (1925) on hemoglobin. The binding polynomial, $\Xi$, which is a macroscopic analog of the partition function, is defined as the sum over the concentrations of all different species relative to that of the reference species (here, the unprotonated state). For the present case, it has the form (Wyman and Gill, 1990)

$$
\Xi = \sum_{\vec{s}} a(\vec{s})\, x^{n(\vec{s})} = 1 + \cdots + a(\vec{1})\, x^{N} \tag{4}
$$

where the sum is over all $2^N$ protonation states. We use the term partition function for $\Xi$ in what follows. On the right-hand side of Eq. (4), we have written out explicitly the contributions from the fully unprotonated state [set equal to 1, see Eq. (3)] and from the fully protonated state, $\vec{s} = \vec{1}$. In contrast to the nomenclature that is often used in the context of multiple ligand binding, we distinguish the $\binom{N}{k}$ protonation states with $k$ protons bound, where the expression in parentheses is the binomial coefficient. This distinction is required because the titratable sites in a protein are, in general, not equivalent, such that the electrostatic free energies of different protonation states with the same number of protons bound are distinct and lead to different statistical weights, $a(\vec{s})$, in the partition function. In principle, it would be possible to group all terms with the same number $k$ of protons in the partition function, introducing Adair constants that depend only on the total number of protons bound. However, for an analysis of the titration behavior of the individual sites and the calculation of their pKa’s it is necessary to keep the microscopic equilibrium constants $a(\vec{s})$ in the generating function.

II. THEORY AND CALCULATIONAL METHODS

Each term in the binding polynomial, Eq. (4), can be rewritten as a Boltzmann weight factor involving the pH-dependent electrostatic free energy, $\Delta G(\vec{s}, \mathrm{pH})$, of the protonation state $\vec{s}$ relative to the free energy of the unprotonated reference state:

$$
a(\vec{s})\, x^{n(\vec{s})} = \exp\left[-\frac{\Delta G(\vec{s}, \mathrm{pH})}{RT}\right] \tag{5}
$$
where $R$ is the gas constant and $T$ is the absolute temperature. The quantity $\Delta G(\vec{s}, \mathrm{pH})$ is the electrostatic free energy relative to that of the unprotonated state because the term in the binding polynomial, Eq. [...] 30) titrating sites, where it is computationally impractical to calculate the electrostatic free energy from the generating function, we use Eq. (17) and obtain for the free energy difference:
where $G^A_{el}(\vec{0})$ and $G^B_{el}(\vec{0})$ are the electrostatic free energies of the unprotonated system in conformation A and B, and where $\langle n \rangle_A$ and $\langle n \rangle_B$ are the titration curves of A and B, respectively. Equation (20) in differential form was first shown to be applicable to the pH stability of lysozyme by Tanford and Roxby (1972), where it was demonstrated that the derivative of $\log K_D$ with respect to pH ($K_D$ is the equilibrium constant for the denaturation) correlates well with the difference between the titration curves of the native and the unfolded protein. According to Eq. (19), the absolute free energy difference is given by the difference between the electrostatic energies of the unprotonated states, minus $(\ln 10)RT$ times the area between the titration curves of the two conformers in the pH range $(\mathrm{pH}, \infty)$ (see Fig. 1). Consequently, a stabilization of conformation A relative to conformation B with respect to the limiting value $\Delta G_{el}(\vec{0})$ results for a pH range where $\langle n \rangle_A$ is larger than $\langle n \rangle_B$. The quantity $\Delta\Delta G_{el}(\mathrm{pH})$ is the relative free energy difference between the two conformers. The relative stability as defined in Eq. (20) approaches 0 in the limit $\mathrm{pH} \to \infty$, irrespective of the system and conformations that are studied. The advantage of the absolute, as compared with the relative, electrostatic free energy difference is that it makes it possible to analyze electrostatic free energy changes at any given pH, e.g., due to conformational change (including binding) or due to mutations in the sequence of a protein.
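The area rule stated above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: both conformers are given hypothetical single-site titration curves of simple-acid form, the upper integration limit is truncated at a finite `ph_max`, and the names `hh_curve` and `relative_stability` are invented for the example. It evaluates $-(\ln 10)RT$ times the area between the two titration curves with a trapezoidal rule.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def hh_curve(pka):
    """Hypothetical titration curve of one simple acid: mean protons bound at a pH."""
    return lambda ph: 1.0 / (1.0 + 10.0 ** (ph - pka))


def relative_stability(n_a, n_b, ph, temp=300.0, ph_max=30.0, steps=20000):
    """-(ln 10) R T times the area between curves n_a and n_b over (ph, ph_max).

    ph_max stands in for the infinite upper limit; the curves are assumed to
    have decayed to zero well before it.
    """
    h = (ph_max - ph) / steps
    area = 0.0
    for k in range(steps + 1):
        x = ph + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoidal end weights
        area += weight * (n_a(x) - n_b(x)) * h
    return -math.log(10.0) * R_KCAL * temp * area


if __name__ == "__main__":
    # Conformer A binds protons more strongly (higher pKa) than conformer B,
    # so A is stabilized relative to B and the result is negative.
    print(relative_stability(hh_curve(5.0), hh_curve(4.0), ph=4.5))
```

For this single-site case the integral has a closed form, $-RT[\ln(1 + 10^{pK_a^A - \mathrm{pH}}) - \ln(1 + 10^{pK_a^B - \mathrm{pH}})]$, which gives a convenient check on the quadrature.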
C. The Independent Sites Model

If one makes the simplifying assumption that all sites of a system titrate as simple acids, i.e., according to the Henderson-Hasselbalch equation (Fersht, 1985), it is possible to calculate the generating function, the free energy, and the titration curve of the system from simple analytic formulas based on knowledge of the pKa’s alone (MacKerell et al., 1995). The assumption of simple titration behavior for the sites is equivalent to the requirement that the sites titrate independently, such that the electrostatic free energy of a protonation state relative to the fully unprotonated state is the sum of individual site contributions; i.e.,
[Figure 1: thermodynamic cycle connecting conformer A and conformer B in the "reference medium" ($\varepsilon_i$) and in "solution" ($\varepsilon_s$, ions).]

FIG. 1. Thermodynamic cycle describing the contributions to the absolute, pH-dependent electrostatic free energy difference $\Delta G_{el}(\mathrm{pH})$ between two conformations, A and B, of a titrating system. The unprotonated state is used as the reference state in the upper half of the cycle, $\vec{s}^{\,ref} = \vec{0}$. For the calculation of Coulomb energies ($\Delta E_{Coul} = E^A_{Coul} - E^B_{Coul}$) and solvation free energies ($\Delta G_{solv}$), the "reference medium" (on top of the horizontal dashed line) is assigned zero ionic strength and the dielectric constant of the protein interior, $\varepsilon_i$. The pH-dependent second term of the electrostatic free energy difference in Eq. (19) is concerned with the lower half of the cycle, while the pH-independent first term is derived from the upper half of the cycle.
$$
\Delta G^{IS}(\vec{s}, \mathrm{pH}) = \sum_{i=1}^{N} \Delta G_i^{IS}(s_i, \mathrm{pH}) \tag{22}
$$
The model is thus referred to as the “independent sites” (IS) model. From the Henderson-Hasselbalch equation (Fersht, 1985), it follows that the pH-dependent free energy difference between the protonated and unprotonated state of a simple acid with given pKa is

$$
\Delta G^{HH}(\mathrm{pH}) = (\ln 10)\,RT\,(\mathrm{pH} - pK_a) \tag{23}
$$
If one considers only two protonation states for each site, unprotonated ($s_i = 0$) and protonated ($s_i = 1$), Eq. (23) can be generalized to yield the free energy contribution from site $i$ in charge state $s_i$ (protonated or unprotonated), relative to the unprotonated state:

$$
\Delta G_i^{IS}(s_i, \mathrm{pH}) = (\ln 10)\,RT\,s_i\,(\mathrm{pH} - pK_{a,i}) \tag{24}
$$
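As a quick numerical reading of Eqs. (23) and (24), the sketch below evaluates the site free energy and the protonated fraction that follows from its Boltzmann weight. The function names and the choice of kcal/mol at 300 K are assumptions made for the illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def site_free_energy(s_i, ph, pka, temp=300.0):
    """Eq. (24): free energy of site i in state s_i relative to unprotonated."""
    return math.log(10.0) * R_KCAL * temp * s_i * (ph - pka)


def protonated_fraction(ph, pka, temp=300.0):
    """Two-state Boltzmann population of the protonated form of one site."""
    weight = math.exp(-site_free_energy(1, ph, pka, temp) / (R_KCAL * temp))
    return weight / (1.0 + weight)


if __name__ == "__main__":
    # Protonating an acid with pKa 4 at pH 7 costs about 3 * 1.37 kcal/mol.
    print(site_free_energy(1, 7.0, 4.0))
    # At pH = pKa the two states are equally populated.
    print(protonated_fraction(4.0, 4.0))
```

The temperature cancels in `protonated_fraction`, which reduces to the Henderson-Hasselbalch form $1/(1 + 10^{\mathrm{pH} - pK_a})$.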
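The right-hand side of Eq. (24) is multiplied by $s_i$ such that $\Delta G_i^{IS}(0, \mathrm{pH}) = 0$. The fact that the electrostatic free energy of the protonation state $\vec{s}$ in Eq. (22) is a sum of $N$ independent terms permits the factorization of the generating function (Wyman and Gill, 1990) as defined in Eq. (8):

$$
\Xi^{IS} = \sum_{\vec{s}} \prod_{i=1}^{N} \exp\left[-\beta\,\Delta G_i^{IS}(s_i, \mathrm{pH})\right] = \prod_{i=1}^{N} \left\{1 + \exp\left[-\beta\,\Delta G_i^{IS}(1, \mathrm{pH})\right]\right\} \tag{25}
$$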
where we have made use of the fact that in the two-state model there are only two protonation states for each site with relative free energies of 0 and $\Delta G_i^{IS}(1, \mathrm{pH})$, respectively. From the generating function, Eq. (25), it follows immediately that the electrostatic free energy relative to that of the unprotonated state and the titration curve of the system are the sums of the independent contributions from the sites:

$$
\Delta G^{IS}(\mathrm{pH}) = -RT \sum_{i=1}^{N} \ln\left\{1 + \exp\left[-\beta\,\Delta G_i^{IS}(1, \mathrm{pH})\right]\right\} \tag{26}
$$

$$
\langle n(\mathrm{pH}) \rangle^{IS} = \sum_{i=1}^{N} \frac{\exp\left[-\beta\,\Delta G_i^{IS}(1, \mathrm{pH})\right]}{1 + \exp\left[-\beta\,\Delta G_i^{IS}(1, \mathrm{pH})\right]} \tag{27}
$$
where we made use of Eqs. (9) and (10) defining the pH-dependent relative free energy and the titration curve. In the literature, Eqs. (25) to (27) have been used to describe a simplified model of the unfolded state of proteins, termed the “null” or “zero interaction” model (Yang and Honig, 1993; Antosiewicz et al., 1994; Schaefer et al., 1997), where the pKa’s of all sites are assumed to be equal to the standard pKa’s of the amino acids. The independent sites model is also of interest for a comparison between the actual site titration curves of a protein, Eq. (11), and the site titration curves (terms $i = 1$ to $N$) according to Eq. (27), where the effective pKa’s from the former titration calculation are used in the latter. This makes it possible to quantify the departure of the titration curves of the individual sites from the ideal titration curve of a simple acid. Finally, the independent sites model is useful for an interpretation of free energy differences between conformations on the basis of pKa shifts alone. In accordance with the definition of the absolute pH-dependent electrostatic free energy, we add the free energy of the unprotonated state on both sides of Eq. (26) and obtain

$$
G^{IS}(\mathrm{pH}) = G_{el}(\vec{0}) - RT \sum_{i=1}^{N} \ln\left\{1 + \exp\left[-\beta\,\Delta G_i^{IS}(1, \mathrm{pH})\right]\right\} \tag{28}
$$
We then rewrite the absolute electrostatic free energy as a function of the proton concentration, $[\mathrm{H}^+]$, and the acid dissociation constants of all sites, $K_{a,i}$:

$$
G^{IS}(\mathrm{pH}) = G_{el}(\vec{0}) - (\ln 10)\,RT \sum_{i=1}^{N} \log \frac{K_{a,i} + [\mathrm{H}^+]}{K_{a,i}} \tag{29}
$$
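The free energy difference between two conformations A and B [see Eq. (18)] with the acid dissociation constants $K^A_{a,i}$ and $K^B_{a,i}$, respectively, is then (Szabo and Karplus, 1972; Ascenzi et al., 1990; Casale et al., 1995)

$$
\Delta G^{IS}(\mathrm{pH}) = \Delta G_{el}(\vec{0}) - (\ln 10)\,RT \sum_{i=1}^{N} \left( \Delta pK_{a,i} + \log \frac{K^A_{a,i} + [\mathrm{H}^+]}{K^B_{a,i} + [\mathrm{H}^+]} \right) \tag{30}
$$

where $\Delta pK_{a,i} = pK^A_{a,i} - pK^B_{a,i}$ and $\Delta G_{el}(\vec{0}) = G^A_{el}(\vec{0}) - G^B_{el}(\vec{0})$. Equation (30) has been employed in a study by Ascenzi et al. (1990) on the binding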
of the bovine pancreatic trypsin inhibitor to human and bovine factor Xa, and by MacKerell et al. (1995) on the binding of guanosine monophosphate to ribonuclease T1. For these cases the two conformations correspond to free and bound inhibitor/substrate. In the limit $\mathrm{pH} \to -\infty$, the only protonation state that is energetically accessible is the fully protonated state, i.e., $\Delta G^{IS}(-\infty) = \Delta G_{el}(\vec{1})$ [compare Eq. (15) for the limit $\mathrm{pH} \to \infty$ of the electrostatic free energy]. Furthermore, the logarithmic term in Eq. (30) vanishes in the limit of infinite proton concentration, such that the remainder of the right-hand side is equal to the electrostatic free energy difference between the fully protonated conformers. Equation (30) can thus be rewritten in the form

$$
\Delta G^{IS}(\mathrm{pH}) = \Delta G_{el}(\vec{1}) - (\ln 10)\,RT \sum_{i=1}^{N} \log \frac{K^A_{a,i} + [\mathrm{H}^+]}{K^B_{a,i} + [\mathrm{H}^+]} \tag{31}
$$

where $\vec{s} = \vec{1}$ is the fully protonated charge state. From Eqs. (30) and (31), it follows that in the independent sites model, the difference between $\Delta G_{el}(\vec{0})$ and $\Delta G_{el}(\vec{1})$ is proportional to the sum of the pKa shifts of all titratable sites:

$$
\left(\Delta G_{el}(\vec{0}) - \Delta G_{el}(\vec{1})\right)^{IS} = (\ln 10)\,RT \sum_{i=1}^{N} \Delta pK_{a,i} \tag{32}
$$
This relation between the fully protonated and the fully unprotonated states of a system with independent sites has been given in Wyman and Gill (1990) for a single conformer (i.e., without the $\Delta$’s) and referred to as the free energy of converting the unligated macromolecule to the fully liganded macromolecule. The dissociation constant of a molecular complex, $K_I$, is related to the free energy of binding according to $(\ln 10)\,RT\,pK_I(\mathrm{pH}) = -\Delta G_{AB}(\mathrm{pH})$. Equation (30) can thus be rewritten in a form that yields the pH dependence of $pK_I$ as a function of the pKa’s of ionizable groups in the complex and in the dissociated molecules. The present treatment of pH-dependent electrostatic binding free energies provides, therefore, a basis for the analysis of pH-dependent ligand-binding studies. In particular, the IS model, although simplified, avoids the shortcomings of some models, in which it is assumed that complex formation occurs for only a single charge state of the protein and the inhibitor, thus neglecting the contributions from the multiple alternate protonation states, as pointed out by Knowles (1976) and by Brocklehurst (1994).
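The pH dependence implied by the independent sites model can be explored with a few lines of code. The sketch below assumes the form of Eq. (30) in which each site contributes its pKa shift plus the logarithm of $(K^A_{a,i} + [\mathrm{H}^+])/(K^B_{a,i} + [\mathrm{H}^+])$; the pKa values, the unprotonated-state difference `dg_unprot`, and the function name are hypothetical. The two pH limits reproduce the relation of Eq. (32).

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def delta_g_is(ph, pkas_a, pkas_b, dg_unprot, temp=300.0):
    """Free energy difference A minus B in the independent sites model.

    pkas_a, pkas_b: site pKa values in conformations A and B (same site order).
    dg_unprot: free energy difference of the fully unprotonated states, kcal/mol.
    """
    h = 10.0 ** (-ph)  # proton concentration
    c = math.log(10.0) * R_KCAL * temp
    total = 0.0
    for pka_a, pka_b in zip(pkas_a, pkas_b):
        k_a, k_b = 10.0 ** (-pka_a), 10.0 ** (-pka_b)
        total += (pka_a - pka_b) + math.log10((k_a + h) / (k_b + h))
    return dg_unprot - c * total


if __name__ == "__main__":
    pkas_a, pkas_b = [4.5, 7.0], [4.0, 6.0]  # hypothetical pKa values
    for ph in (-10.0, 2.0, 5.0, 7.0, 20.0):
        print(ph, delta_g_is(ph, pkas_a, pkas_b, dg_unprot=2.0))
```

At very high pH the result returns `dg_unprot`, and at very low pH it differs from that value by $(\ln 10)RT$ times the summed pKa shifts, as Eq. (32) requires. Dividing the negative of a binding free energy by $(\ln 10)RT$ would give the corresponding $pK_I$.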
D. Standard, Intrinsic, and Effective pKa

Three different pKa’s of a titrating site are distinguished: the standard, the intrinsic, and the effective pKa. The standard pKa of an amino acid side chain is the experimental pKa of the isolated amino acid with N-terminal and C-terminal blocking groups (Fersht, 1985; Nozaki and Tanford, 1967; Tanokura, 1983; Lehninger et al., 1993). Analogously, the standard pKa’s of the N-terminal ammonium group and the C-terminal carboxyl group are the experimental pKa’s of these groups in an otherwise electrically neutral amino acid (e.g., alanine) with a C-terminal or N-terminal blocking group, respectively. The effective pKa is the pKa that is observed for the site as part of the entire system (i.e., the protein). The effective pKa of a titrating site may differ from the standard pKa by several pK units. In Fersht (1985), highly perturbed pKa’s in enzymes are reported with pKa shifts of up to 5 pH units; this corresponds to a free energy shift of 6.9 kcal/mol (using 300 K for the conversion). The largest measured pK shift in the proteins lysozyme (Bashford and Karplus, 1990; Kuramitsu and Hamaguchi, 1980; Bartik et al., 1994), ribonuclease A (Rico et al., 1991), and myoglobin (Bashford et al., 1993) is about 2.5 pK units. Finally, the intrinsic pKa is the hypothetical pKa of a site in the system assuming that all other titrating sites are fixed in their electrically neutral state. The standard and the intrinsic pKa are denoted by $pK_a^{std}$ and $pK_a^{intr}$, while the effective pKa is denoted by $pK_a$ in the following.

For calculating the pH-dependent properties of a system, a choice of the titrating sites which are included in the analysis has to be made. In this choice, it is possible to exclude atom groups whose standard pKa’s are far from the pH range of interest; e.g., the Ser hydroxyl group with $pK_a^{std} = 13.6$ in considerations of protein stability around pH = 7.
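The conversion between a pK shift and a free energy shift quoted above is a one-line calculation, $\Delta G = (\ln 10)\,RT\,\Delta pK$. The sketch below simply reproduces that arithmetic; the function name and the gas constant in kcal/(mol K) are choices made for the illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def pk_shift_to_free_energy(delta_pk, temp=300.0):
    """Free energy shift in kcal/mol that corresponds to a pK shift."""
    return math.log(10.0) * R_KCAL * temp * delta_pk


if __name__ == "__main__":
    # A shift of 5 pK units at 300 K, and the 2.5-unit shifts seen in proteins.
    print(round(pk_shift_to_free_energy(5.0), 1))
    print(round(pk_shift_to_free_energy(2.5), 1))
```

One pK unit is worth about 1.37 kcal/mol at 300 K, which is a convenient rule of thumb when reading pKa shifts as energies.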
Each titrating site is composed of a set of atoms whose partial charges depend on the protonation state of the site, e.g., all side-chain atoms in the case of aspartate (see Table I). The atoms of the system that are not part of any site are termed the “background atoms” or “background charges” (Bashford and Karplus, 1990). By definition, the background atoms are assumed to have pH-independent partial charges, even though the theory that is presented in this work makes use of infinite pH limits; e.g., if peptide N-H groups are excluded from the set of titrating sites, it is assumed that they are not ionized in the limit $\mathrm{pH} \to \infty$. This does not cause any difficulties, and it is understood that the pH-dependent properties of a system are defined relative to a chosen set of titrating sites. By definition, the $pK_a^{intr}$ of site $i$ in a protein differs from the $pK_a^{std}$ of the site in the isolated amino acid by a pK shift that is due to the electrostatic interaction with the protein environment (protein backbone; nontitrating side chains; titrating side chains in their uncharged
TABLE I (a)

Site       pKa(std)   Atom    Charge (b), s = 0   Charge (b), s = 1
Arg        12.48      CD        0.20                0.20
                      HD1       0.09                0.09
                      HD2       0.09                0.09
                      NE       -0.70               -0.70
                      HE        0.44                0.44
                      CZ        0.44                0.64
                      NH1      -0.80               -0.80
                      HH11      0.26                0.46
                      HH12      0.26                0.46
                      NH2      -0.80               -0.80
                      HH21      0.26                0.46
                      HH22      0.26                0.46
Asp         4.00      CB       -0.28               -0.21
                      HB1       0.09                0.09
                      HB2       0.09                0.09
                      CG        0.62                0.75
                      OD1      -0.76               -0.36
                      OD2      -0.76               -0.36
Cys (d)    10.46      CB       -0.25               -0.11
                      HB1       0.05                0.09
                      HB2       0.05                0.09
                      SG       -0.85               -0.23
                      HG1       0.00                0.16
Glu         4.40      CG       -0.28               -0.21
                      HG1       0.09                0.09
                      HG2       0.09                0.09
                      CD        0.62                0.75
                      OE1      -0.76               -0.36
                      OE2      -0.76               -0.36
His (c)     6.42      CB       -0.08               -0.05
                      HB1       0.09                0.09
                      HB2       0.09                0.09
                      CG        0.08                0.19
                      ND1      -0.53               -0.51
                      HD1       0.16                0.44
                      CD2       0.09                0.19
                      HD2       0.09                0.13
                      CE1       0.25                0.32
                      HE1       0.13                0.18
                      NE2      -0.53               -0.51
                      HE2       0.16                0.44
Lys        10.79      CE        0.18                0.21
                      HE1       0.05                0.05
                      HE2       0.05                0.05
                      NZ       -0.96               -0.30
                      HZ1       0.34                0.33
                      HZ2       0.34                0.33
                      HZ3       0.00                0.33
N-Ter       7.50      N        -0.96               -0.30
                      HT1       0.34                0.33
                      HT2       0.34                0.33
                      HT3       0.00                0.33
                      CA        0.19                0.21
                      HA        0.09                0.10
C-Ter       3.80      C         0.34                0.76
                      OT1      -0.67               -0.38
                      OT2      -0.67               -0.38
Ser (d)    13.60      CB       -0.14                0.05
                      HB1       0.05                0.09
                      HB2       0.05                0.09
                      OG       -0.96               -0.66
                      HG1       0.00                0.43
Thr (d)    13.60      CB       -0.05                0.14
                      HB        0.09                0.09
                      OG1      -0.96               -0.66
                      HG1       0.00                0.43
                      CG2      -0.35               -0.27
                      HG21      0.09                0.09
                      HG22      0.09                0.09
                      HG23      0.09                0.09
Tyr        10.13      CZ       -0.04                0.11
                      OH       -0.96               -0.54
                      HH        0.00                0.43

(a) Atoms and charges compatible with the all-hydrogen parameter set of CHARMM; standard pKa’s taken from Nozaki and Tanford (1967), Fersht (1985), Tanokura (1983), and Lehninger et al. (1993).
(b) Partial charges in the unprotonated (s = 0) and protonated state (s = 1).
(c) Unprotonated His: average of Hsd and Hse in the CHARMM parameter set; “macroscopic” standard pKa of His as defined in Tanokura (1983).
(d) Site excluded from calculations reported in Section III,B.
16
MICHAEL SCHAEFER ET A L
state) and the change in the interaction with the aqueous environment, i.e., the desolvation effect upon transfer of the site from the isolated amino acid to the protein (see Fig. 2). For calculating the change in the electrostatic free energy of site i on the transfer, we use the isolated residue of site i as a model of the standard peptide. In the “model compound” for site i, the pK, is assumed to be equal to pKSCd. This permits the use of empirical data for the pPPd as a reference for the free energy change that is associated with protonating the titratable sites. In the following, we write A, for the model compound for site i. According to the thermodynamic cycle in Fig. 2, the pK shift PK’?, - p KsEdof site i is given by
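$$
\mathrm{p}K_{a,i}^{\mathrm{intr}} - \mathrm{p}K_{a,i}^{\mathrm{std}}
= \frac{1}{c}\,(\Delta G_1 - \Delta G_0)
= \frac{1}{c}\,\Bigl[ G_{\mathrm{el}}(s_i = 1, s_{j \neq i}^{0}) - G_{\mathrm{el}}(s_i = 0, s_{j \neq i}^{0}) - G_{\mathrm{el}}^{A_i}(s_i = 1) + G_{\mathrm{el}}^{A_i}(s_i = 0) \Bigr]
\tag{33}
$$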
where c = -(ln 10)RT. In Eq. (33), we have written $s_j^0$ for the neutral (i.e., uncharged) state of site j, and $G_{\mathrm{el}}(\vec{s})$ and $G_{\mathrm{el}}^{A_i}(s_i)$ are the electrostatic free energies of the protein in protonation state $\vec{s}$ and the model compound for site i in protonation state s_i, respectively.

FIG. 2. Electrostatic free energy for protonating site i as part of the protein (P) with all other sites j ≠ i in their electrically neutral state $s_j^0$, and as part of the model compound A_i. The quantities ΔG_1 or ΔG_0 are the electrostatic free energy difference between the protein with site i protonated or unprotonated and the protonated or unprotonated model compound A_i. From the cycle, it follows that $c\,(\mathrm{p}K_{a,i}^{\mathrm{intr}} - \mathrm{p}K_{a,i}^{\mathrm{std}}) = \Delta G_1 - \Delta G_0$, where c = -(ln 10)RT.

E. Electrostatic Free Energy of a Protonation State

To derive a formula for the pH-dependent electrostatic free energy of a protonation state, we first consider the case where site i in the protein is allowed to titrate and all other sites are fixed in their electrically neutral state. Since site i is the only titrating site, Eq. (24) is applicable with $\mathrm{p}K_{a,i}$ replaced by the intrinsic pK_a of the site, $\mathrm{p}K_{a,i}^{\mathrm{intr}}$. Using Eq. (33), we can thus write the free energy difference between site i in protonation state s_i (protonated or unprotonated) and the unprotonated state of the site (s_i = 0) as

$$
\Delta G_i(s_i, \mathrm{pH}) = s_i\, c\,(\mathrm{p}K_{a,i}^{\mathrm{std}} - \mathrm{pH})
+ \bigl( G_{\mathrm{el}}(s_i, s_{j \neq i}^{0}) - G_{\mathrm{el}}(s_i = 0, s_{j \neq i}^{0}) \bigr)
- \bigl( G_{\mathrm{el}}^{A_i}(s_i) - G_{\mathrm{el}}^{A_i}(s_i = 0) \bigr)
\tag{34}
$$
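where all other sites are assumed to be electrically neutral, $s_{j \neq i} = s_j^0$. The first term on the right-hand side of Eq. (34) is multiplied by s_i to ensure that $\Delta G_i(s_i, \mathrm{pH})$ vanishes if s_i = 0. Equation (34) can be generalized to be applicable to the case where all sites are allowed to titrate. In the ligation process $P(\vec{0}) + n(\vec{s})\,\mathrm{H}^+ \rightarrow P(\vec{s})$ [see Eq. (2)], all sites change their protonation state from 0 (unprotonated) to s_i (protonated or unprotonated). To evaluate the corresponding electrostatic free energy difference, the second term in parentheses in Eq. (34) is substituted by $G_{\mathrm{el}}(\vec{s}) - G_{\mathrm{el}}(\vec{0})$, whereas the first and the third term contribute for each site independently. The pH-dependent free energy of the protonation state $\vec{s}$, relative to the fully unprotonated state of the system, is thus

$$
\Delta G(\vec{s}, \mathrm{pH}) = \sum_{i=1}^{N} s_i\, c\,(\mathrm{p}K_{a,i}^{\mathrm{std}} - \mathrm{pH})
+ \bigl( G_{\mathrm{el}}(\vec{s}) - G_{\mathrm{el}}(\vec{0}) \bigr)
- \sum_{i=1}^{N} \bigl( G_{\mathrm{el}}^{A_i}(s_i) - G_{\mathrm{el}}^{A_i}(s_i = 0) \bigr)
\tag{35}
$$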
In Schaefer et al. (1997), Eq. (35) has been shown to be consistent with other, somewhat more complicated, expressions for $\Delta G(\vec{s}, \mathrm{pH})$ in the literature (Yang et al., 1993; Antosiewicz et al., 1994).
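The bookkeeping of Eqs. (33) and (35) can be illustrated with a short sketch. It assumes that Eq. (35) consists of the three contributions described in the text: a pH-dependent term per protonated site, the electrostatic free energy difference of the protein, and the model compound differences. The function names, the argument layout, and the numerical values in the usage lines are hypothetical and serve only as an illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K); value assumed for this sketch


def intrinsic_pk(pk_std, g_protein, g_model, temperature=300.0):
    """Intrinsic pK according to the thermodynamic cycle of Eq. (33).

    g_protein = (G_el(s_i=0), G_el(s_i=1)) of the protein, other sites neutral;
    g_model   = (G_el(s_i=0), G_el(s_i=1)) of the model compound A_i (kcal/mol).
    """
    c = -math.log(10.0) * R_KCAL * temperature
    delta_g1 = g_protein[1] - g_model[1]
    delta_g0 = g_protein[0] - g_model[0]
    return pk_std + (delta_g1 - delta_g0) / c


def state_free_energy(state, ph, pk_std, g_el_state, g_el_zero, g_model,
                      temperature=300.0):
    """Free energy of a protonation state relative to the fully unprotonated
    state, in the three-term form assumed here for Eq. (35)."""
    c = -math.log(10.0) * R_KCAL * temperature
    total = g_el_state - g_el_zero                 # protein term
    for s_i, pk_i, (g0, g1) in zip(state, pk_std, g_model):
        total += s_i * c * (pk_i - ph)             # pH-dependent term
        total -= (g1 if s_i else g0) - g0          # model compound term
    return total


# Hypothetical site: the protonated form is destabilized by 1.37 kcal/mol in the
# protein relative to the model compound, which lowers the pK by about one unit.
print(round(intrinsic_pk(4.0, (0.0, 1.37), (0.0, 0.0)), 2))
# A single noninteracting site at pH = pK: both states have equal free energy.
print(state_free_energy((1,), 4.0, [4.0], 0.0, 0.0, [(0.0, 0.0)]))
```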
F. Electrostatic Free Energy Calculation

To calculate the electrostatic free energy as a function of the protonation state, $G_{\mathrm{el}}(\vec{s})$ and $G_{\mathrm{el}}^{A_i}(s_i)$, we make use of the continuum model in which the solute and the aqueous solvent are described as polarizable media with dielectric constants ε_i and ε_s, respectively, and where the presence of salt is accounted for by a continuous ion distribution in the solvent (Davis and McCammon, 1990; Sharp and Honig, 1990). The program UHBD (Davis et al., 1991) is employed to solve numerically the linearized Poisson-Boltzmann equation by means of the finite-difference algorithm. Since the electrostatic potential depends linearly on the solute charges, the electrostatic free energy of the solute and the model compounds can be written as the sum of contributions from the charge groups (background atoms and the titratable sites) interacting with their own reaction field, and from the interaction between charge groups. The sites i and j consist of the atoms with indices μ and ν whose partial charges $q_\mu^{s_i}$ and $q_\nu^{s_j}$, respectively, depend on the charge state of the sites (see Table I). Given the electrostatic potential of site i in charge state s_i, $\phi_i^{s_i}$, the charge state-dependent electrostatic interaction between sites i and j is

$$
G_{ij}(s_i, s_j) = \sum_{\nu \in j} q_\nu^{s_j}\, \phi_i^{s_i}(\mathbf{x}_\nu)
\tag{36}
$$
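where the sum is over all atoms of site j, and where $\mathbf{x}_\nu$ is the position of atom ν. Equation (36) is symmetric under the exchange of the indices i and j, i.e., $G_{ij}(s_i, s_j) = G_{ji}(s_j, s_i)$. By assigning the site index 0 and the constant charge state s_0 = 0 to the background atoms in the protein and in the model compounds, and using the definition of the electrostatic interaction between charge groups in Eq. (36), we can write the electrostatic free energy as

$$
G_{\mathrm{el}}(\vec{s}) = \frac{1}{2} \sum_{i=0}^{N} \sum_{j=0}^{N} G_{ij}(s_i, s_j)
\tag{37}
$$

$$
G_{\mathrm{el}}^{A_i}(s_i) = \frac{1}{2} \sum_{k \in \{0, i\}} \sum_{l \in \{0, i\}} G_{kl}^{A_i}(s_k, s_l)
\tag{38}
$$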
The sum in Eq. (38) is over all pairs of the indices 0 and i. This leads to a total of four terms, because the model compound for site i is composed of only two atom groups, i.e., the site itself and the background atoms that constitute the remainder of the residue. In Eqs. (37) and (38), the terms $(1/2)\,G_{ii}(s_i, s_i)$ and $(1/2)\,G_{ii}^{A_i}(s_i, s_i)$ in the summations are the self-energies [described as nonbonded Coulomb plus "Born" (Bashford and Karplus, 1990), or nonbonded Coulomb plus "charge-solvent" (Luty et al., 1992) energies] of site i in charge state s_i in the protein and in the model compound, respectively. The factor (1/2) and the double notation of the charge state s_i in the self-energy are required for consistency with the definition of the energy of interaction between sites [Eq. (36)].

Using a finite-difference Poisson-Boltzmann program, e.g., UHBD, it is straightforward to calculate the potential of a group of charges in the presence of the solvent with salt and an otherwise uncharged solute. Given the potential of site i, the interaction with all other sites and the self-energy of the site are available at little computational expense, according to Eq. (36). It follows that for the calculation of the electrostatic free energy terms $G_{ij}(s_i, s_j)$ in Eqs. (37) and (38), it is sufficient to perform 2N electrostatic potential calculations for the sites in the protein (two charge states each), 1 for the background charges of the protein, and 3N for the model compounds (background atoms and the site in two charge states for every A_i). With each electrostatic potential calculation typically requiring 12 min of CPU time on an HP-735/9000 workstation for an average-sized protein, and 10 s for a model compound, the electrostatic free energy calculations can be done within a few hours of central processing unit (CPU) time even for systems with several hundreds of sites. Technical aspects of the finite-difference calculations, e.g., appropriate parameters for the setup of the finite-difference grid and for the definition of the solute volume, are described in detail in the literature (Yang et al., 1993; Antosiewicz et al., 1994; Schaefer et al., 1997).
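The assembly of the electrostatic free energy of a protonation state from pairwise terms, as described for Eq. (37), and the count of potential calculations given above can be sketched as follows. The dictionary layout keyed by (i, s_i, j, s_j), the function names, and all numerical values are hypothetical; they are not part of UHBD or of any other program mentioned in the text.

```python
def electrostatic_free_energy(pair_energy, state):
    """G_el of a protonation state as one half of the double sum over all
    charge groups, including i == j (self-energy terms).

    pair_energy maps (i, s_i, j, s_j) to G_ij(s_i, s_j); group 0 is the
    background, which is always in charge state 0. Missing entries count as 0.
    state lists the charge states of sites 1..N.
    """
    full_state = [0] + list(state)          # prepend the background group
    n_groups = len(full_state)
    total = 0.0
    for i in range(n_groups):
        for j in range(n_groups):
            key = (i, full_state[i], j, full_state[j])
            total += pair_energy.get(key, 0.0)
    return 0.5 * total


def n_potential_calculations(n_sites):
    """Number of potential calculations for the protein and the model compounds."""
    protein = 2 * n_sites + 1               # two charge states per site + background
    model_compounds = 3 * n_sites           # background + two site states per A_i
    return protein, model_compounds


# Hypothetical one-site system (background = group 0, site = group 1), kcal/mol.
pairs = {
    (0, 0, 0, 0): -4.0,                      # twice the background self-energy
    (1, 0, 1, 0): -0.5,                      # twice the self-energy, site unprotonated
    (1, 1, 1, 1): -2.0,                      # twice the self-energy, site protonated
    (0, 0, 1, 1): 1.0, (1, 1, 0, 0): 1.0,    # symmetric site-background interaction
}
print(electrostatic_free_energy(pairs, [0]))   # -2.25
print(electrostatic_free_energy(pairs, [1]))   # -2.0
print(n_potential_calculations(32))            # (65, 96)
```

Because the double sum runs over ordered pairs, each interaction is counted twice and each self term once, which is why the factor of one half appears in front of the sum.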
G. Monte Carlo Titration

For systems with more than about 30 titratable sites, it is computationally impractical to evaluate the free energy from the generating function or the ensemble averages in Eqs. (10)-(11), because the total number of charge states (2^N) increases exponentially with the number of sites. To avoid the requirement of sampling all charge states at any given pH, it is possible either to restrict the calculation to those sites whose pK_a^std or pK_a^intr is within a given interval from the pH of interest or to use the Monte Carlo (MC) importance sampling method (Metropolis et al., 1953). Whereas the former approach, the "reduced site" approximation (Bashford and Karplus, 1990), is limited to cases where not more than about 30 sites fall within the pH interval for the selection (i.e., the number of sites that can be treated by the enumeration of their charge states), the Monte Carlo approach has been successfully applied to systems with several hundred sites while providing well-defined statistical error bounds for the calculated site titration curves (Beroza et al., 1991; Beroza and Fredkin, 1996). Other titration methods have been proposed in recent years, e.g., a cluster method where the sites are subdivided into groups such that the interactions between sites in different groups are weak as compared with the interactions within the groups (Gilson, 1993), and a combination (Sham et al., 1997) of a cluster method with the mean field approximation introduced by Tanford and Roxby (1972). However, definitive error bounds for these methods have not been reported in the literature.

Because of its applicability to large systems and the availability of statistical error bounds, a Monte Carlo program by Beroza et al. (1991) is employed for the calculations that are reported below in Section III. It requires as input the standard pK_a's of the sites (see Table I), the electrostatic self- and interaction energies of the sites, background atoms, and the model compounds for all possible combinations of their charge states [see Eqs. (37) and (38)], and the absolute temperature. For details regarding the Monte Carlo program and other input parameters, e.g., the number of charge states that are sampled at each pH, see Beroza et al. (1991) and Schaefer et al. (1997). For the calculation of the average protonation of 100 sites at 150 pH values, i.e., the calculation of the site titration curves, the Monte Carlo program requires approximately 30 min of CPU time on an HP-735/9000 workstation. Since the computation time for the Monte Carlo program depends quadratically on the number of sites (Schaefer et al., 1997), it can be applied to systems with several hundred sites without requiring more time than the electrostatic free energy calculation with the finite-difference Poisson-Boltzmann method for the same system (see Section II,F).

III. APPLICATIONS

In the following subsections, applications of the theory of linked functions to the calculation of lysozyme pK_a's, the pH stability of lysozyme, and the pH stability of the foot-and-mouth disease virus capsid are reported.
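As an illustration of the Monte Carlo titration described in Section II,G, the following sketch samples protonation states with single-site Metropolis moves and accumulates the average protonation of each site at one pH. It is not the program of Beroza et al. (1991); the function names, the move set, the number of sweeps, and the noninteracting test system are assumptions made for this example. The free energy function is passed in as an argument, so that any expression for the free energy of a protonation state can be used.

```python
import math
import random

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K); value assumed for this sketch


def mc_titration(free_energy, n_sites, ph, temperature=300.0,
                 n_sweeps=5000, n_equil=500, seed=0):
    """Average protonation of each site from Metropolis sampling.

    free_energy(state, ph) returns the free energy (kcal/mol) of a protonation
    state given as a list of 0/1 values.
    """
    rng = random.Random(seed)
    beta = 1.0 / (R_KCAL * temperature)
    state = [0] * n_sites
    energy = free_energy(state, ph)
    counts = [0] * n_sites
    for sweep in range(n_equil + n_sweeps):
        for _ in range(n_sites):
            i = rng.randrange(n_sites)
            state[i] ^= 1                      # trial move: flip one site
            trial = free_energy(state, ph)
            delta = trial - energy
            if delta <= 0.0 or rng.random() < math.exp(-beta * delta):
                energy = trial                 # accept the move
            else:
                state[i] ^= 1                  # reject: restore the old state
        if sweep >= n_equil:                   # sample after equilibration
            for i, s in enumerate(state):
                counts[i] += s
    return [count / n_sweeps for count in counts]


def independent_sites(pk_values, temperature=300.0):
    """Free energy function of noninteracting sites (hypothetical test system)."""
    c = -math.log(10.0) * R_KCAL * temperature
    def free_energy(state, ph):
        return sum(s * c * (pk - ph) for s, pk in zip(state, pk_values))
    return free_energy


# Two noninteracting sites with pK 4 and 10 at pH 5: the first site should be
# protonated about 9% of the time, the second almost always.
averages = mc_titration(independent_sites([4.0, 10.0]), 2, 5.0, n_sweeps=20000)
print(averages)
```

For noninteracting sites the sampled averages can be checked against 1/(1 + 10^(pH - pK)); with interacting sites no such closed form exists, which is the case the sampling is intended for. Recomputing the full free energy for every trial move is the simplest choice; a production code would update only the terms that involve the flipped site.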
The calculation of lysozyme pK_a's is used to determine a value for the protein dielectric constant that leads to best agreement with experimental data, given that the present methodology does not explicitly allow for conformational relaxation of the protein due to changes in the ionization state. In the application to lysozyme pH stability, the aim is to illustrate the difference between the relative and absolute electrostatic free energy of folding as a function of pH and to elucidate the origin of the conformation dependence of the absolute electrostatic free energy. In the application to foot-and-mouth disease virus, the acid lability of the virus capsid is analyzed and the contributions from specific sites are identified. It is demonstrated that interactions involving a helix dipole are not responsible for the acid lability. Furthermore, the study of the virus capsid stability demonstrates the applicability of the methodology to systems with many (>100) sites.

A. Protein Dielectric Constant

While the dielectric constant and ionic strength of the aqueous solvent are uniquely defined by the experimental conditions (i.e., by the temperature, pressure, and the solute concentrations), there is uncertainty about the dielectric constant (or constants) to be assigned to the interior of solutes in continuum electrostatic calculations (Gilson and Honig, 1986; Warshel, 1987). In the present approach the solutes are assumed to have a fixed structure for each well-defined conformer (MacKerell et al., 1995), so that the effects from atomic polarization and from conformational flexibility must be accounted for implicitly by the dielectric constant (or constants) that is assigned to the protein interior. Although the protein interior is known to be inhomogeneous and it would be useful (and would be possible) to vary the dielectric constant used for different parts of the solute (States and Karplus, 1987; Demchuk and Wade, 1996), we use a single dielectric constant in the present calculations. The conformational flexibility of proteins and, as a consequence, their dielectric response and orientational polarizability are expected to be important for changes in the protonation state of the system, i.e., for the stabilization of ionized sites in the interior of proteins (Warshel et al., 1984, 1986). The effect from the reorientation of protein permanent dipoles is analogous to the difference between the equilibrium (~78) and infinite frequency (~2) dielectric constant of water. In principle, polarization effects could be accounted for by using different ensembles of structures for each charge state $\vec{s}$.
Here, we use one fixed protein structure for all charge states and standard partial charges for the neutral and the ionized state of all sites from the all-hydrogen parameter set of CHARMM (MacKerell et al., 1992; MacKerell et al., 1998), so that polarization effects must be accounted for by the protein dielectric constant. In the calculation of the solvation free energy of small molecules, dielectric constants in the range from ε_i = 1 to 2 have been used, and good agreement with experiments and with the results from microscopic free energy perturbation methods has been obtained (Jean-Charles et al., 1991; Sitkoff et al., 1994b). In part, the success of these approaches may be attributed to the fact that the small compounds that were studied are rigid or have very limited conformational flexibility. Furthermore, the results of solvation free energy calculations for small molecules with the continuum model are nearly independent of the dielectric constant that is assigned to the solute interior because of the high solvent accessibility of all atoms.

In studies on electrostatic solvation effects in proteins, an internal dielectric constant of ε_i = 2 to 4 has frequently been used, where ε_i = 2 is attributed to the electronic polarizability of protein atoms, while it is assumed that ε_i = 4 also accounts for fluctuations of the permanent dipoles (Gilson and Honig, 1986). However, as pointed out above, conformational changes, including side-chain reorientations (You and Bashford, 1995), are expected to play a role in stabilizing ionized sites (Russell and Warshel, 1985; Warshel et al., 1986). In the absence of approaches which account for conformational change explicitly, it is thus necessary to assign an effective dielectric constant to the protein interior that exceeds the value of 2-4 reflecting the atomic polarizability and small structural fluctuations. The requirement that the protein interior be described as a polar, rather than an apolar, medium in the context of pK_a calculations for a single (fixed) structure was pointed out some years ago by Warshel and co-workers (1984; Warshel and Levitt, 1976). However, in that work and other papers (Warshel, 1978, 1987; Sham et al., 1997), it was concluded, without justification it seems to us, that the continuum electrostatic model is inconsistent altogether. The main criticism was based on the original Tanford-Kirkwood model (Tanford and Kirkwood, 1957) in which the self-energy contribution to pK_a shifts was theoretically addressed but not included in the treatment because all titratable sites were assumed to be at a constant distance (1 Å) from the surface of the spherical protein model to obtain reasonable agreement with experiment (Tanford, 1957). Furthermore, the Tanford-Kirkwood model did not include the interaction between ionized sites and protein permanent dipoles because structural information at the atomic level was not available at the time. This is of course not true of the continuum electrostatic methods in current use (Bashford and Karplus, 1990; Yang et al., 1993; Antosiewicz et al., 1994) and employed in the present study. In addition to the interaction between titratable sites, they include the contributions from the self-energy ("Born term" or "charge-solvent energy") and the interaction between sites and the permanent charges of the protein atoms (interaction with the "background atoms") (Bashford and Karplus, 1990; Davis et al., 1991).

In recent studies employing the continuum model for calculating the self- and interaction energies of titratable sites, the use of a high dielectric constant in the range from 10 to 20 for the protein has been suggested, based on comparisons between calculated and experimental pK_a's (Antosiewicz et al., 1994; Demchuk and Wade, 1996). The assignment of a high dielectric constant to the protein has been interpreted as a simple way of accounting implicitly for the conformational flexibility of proteins in continuum electrostatic calculations that are based on the use of a single conformer. This is no different, in principle, from using a dielectric constant of 78 for the aqueous medium to account for the orientational polarizability of the water dipoles. Because of its simplicity and success, this approach is followed in the calculations reported here.

To determine the optimal value for the protein dielectric constant, a series of titration and pK_a calculations has been performed for the protein hen egg-white lysozyme, where the protein dielectric constant was varied in the range from 1 to 30. In all continuum electrostatic calculations with the UHBD program (see Section II,F), the dielectric constant of bulk water was set equal to 80 and the ionic strength of the solution was set to 0.145 M, a value close to that found under physiological conditions. The temperature was set to 300 K. The triclinic crystal structure [Protein Data Bank entry 2lzt (Ramanadham et al., 1981)] and the tetragonal crystal structure [entry 1hel (Malcolm et al., 1990)] of lysozyme were employed. To test the dependence of the results on small structural changes, three modified structures were generated for each crystal structure with 100, 200, and 300 steps of in vacuo minimization using the program CHARMM (Brooks et al., 1983). The minimized structures are labeled 2lzt-m1, m2, m3, and 1hel-m1, m2, m3, respectively (for details, see Schaefer et al., 1997). Figure 3 shows the titration curves of all ionizable groups in lysozyme for a protein dielectric constant of ε_i = 20. By definition, the effective pK_a's of the sites are given by the pH where the site titration curve equals 0.5 (see Section II,D).
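The definition of the effective pK_a as the pH at which the site titration curve equals 0.5 can be turned into a small helper that interpolates a tabulated curve. This is a minimal sketch; the function name, the linear interpolation, and the test curve of an isolated site are assumptions made for this illustration.

```python
def effective_pka(ph_values, protonation):
    """pH at which a tabulated site titration curve first crosses 0.5.

    Linear interpolation between neighboring grid points; returns None if the
    curve never crosses 0.5 within the tabulated pH range.
    """
    for k in range(len(ph_values) - 1):
        y0, y1 = protonation[k], protonation[k + 1]
        if (y0 - 0.5) * (y1 - 0.5) <= 0.0 and y0 != y1:
            x0, x1 = ph_values[k], ph_values[k + 1]
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    return None


# Test curve: an isolated site with pK 6.42, tabulated on a grid of 0.1 pH units.
grid = [0.1 * k for k in range(161)]
curve = [1.0 / (1.0 + 10.0 ** (ph - 6.42)) for ph in grid]
print(round(effective_pka(grid, curve), 2))   # 6.42
```

For sites whose titration curves depart from the shape of an independently titrating site, the curve may cross 0.5 more than once; the helper returns only the first crossing at the lowest pH.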
At each value of the protein dielectric constant, the average absolute deviation of the lysozyme pK_a's from experiment (Kuramitsu and Hamaguchi, 1980; Bartik et al., 1994) was calculated. The result of the calculations is shown in Fig. 4 for the tetragonal and triclinic crystal structures and the minimized structures. The average error decreases significantly when the protein dielectric constant is increased from 1 to 20. For ε_i > 20, there is no further significant change in the average error. In the subsequent calculations, we therefore assign an effective dielectric constant of ε_i = 20 to the protein interior.

FIG. 3. pH-dependent protonation of the 32 titratable sites in lysozyme (2lzt) as calculated with the MC titration program. Protein dielectric constant ε_i = 20. The two protonation curves that depart markedly from curves of independently titrating sites (pH range 12-16) are for Tyr 53 and Arg 68, with effective pK_a's of 13.0 and 14.8, respectively.

FIG. 4. (a) Average error of calculated pK_a's relative to experimental data for triclinic hen egg-white lysozyme as a function of the protein dielectric constant used in the finite-difference Poisson-Boltzmann (FDPB) calculations. From bottom to top: crystal structure 2lzt and minimized structures 2lzt-m1, 2lzt-m2, and 2lzt-m3. The horizontal line represents the error of the null model, i.e., the average absolute difference between the standard pK_a's and the experimental pK_a's. (b) Same as (a) for tetragonal hen egg-white lysozyme. From bottom to top: crystal structure 1hel and minimized structures 1hel-m1, 1hel-m2, and 1hel-m3.

The fact that the average absolute error in the lysozyme pK_a calculations does not change significantly for ε_i > 20 could be taken as a reason to use ε_i = 80, a value which would greatly simplify the calculation of electrostatic free energies. If one omits the fact that ions are excluded from the solute volume, a protein dielectric constant of ε_i = 80 permits the use of the Debye-Hückel equation for the interaction between charges, while the calculation of self-energies becomes unnecessary since the protein and the solvent are equally polarizable. However, the test set of lysozyme pK_a values used here includes predominantly sites that are located at the surface of the protein, where the degree of conformational flexibility is expected to be more significant than it is for sites that are less exposed to solvent, e.g., in the active sites of some enzymes (Sham et al., 1997). It follows that the dielectric constant of ε_i = 20 derived from the lysozyme pK_a's is likely to be an upper bound to the effective dielectric constant of the protein in the context of titration calculations with a single protein conformer. Thus, we use ε_i = 20 rather than a larger effective dielectric constant, despite the fact that ε_i > 20 gives similar results in the case of lysozyme.

The finding that protein relaxation due to changes in the ionization state corresponds to an effective dielectric constant of ε_i = 20 for the protein is consistent with the observation that the Tanford-Kirkwood model of protein titration (Tanford and Kirkwood, 1957) gives best agreement with experiment when the charges are placed at the surface of the protein sphere (with ε_i = 2 to 4), where the solvent screening of charge-charge interactions is significant even at short distances. Recent theoretical studies on protein pK_a's, which predominantly use finite-difference Poisson-Boltzmann methods for calculating electrostatic free energies, thus provide evidence that computationally much simpler continuum models can give results with comparable accuracy, in particular the modified Tanford-Kirkwood model of Gurd and co-workers (Matthew et al., 1985) and of Imoto (1983) and the distance-dependent dielectric approach of Mehler and Eichele (1984). However, the advantage of the numerical methods for calculating continuum electrostatic free energies [i.e., the finite-difference (Warwicker and Watson, 1982) and the finite-element (Zauhar and Morgan, 1985) methods] is that they address the interaction and the self-energies of all charges consistently, using a description of the molecular surface at the atomic level (Fersht and Sternberg, 1989).
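The simplification mentioned above for a uniform dielectric constant of 80 can be made concrete with the Debye-Hückel screened Coulomb interaction between two point charges. The sketch below neglects the exclusion of ions from the solute volume, as stated in the text; the function names and the example distance of 10 Å are assumptions made for this illustration, and the physical constants are the SI values.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
K_B = 1.380649e-23            # Boltzmann constant, J/K
N_A = 6.02214076e23           # Avogadro constant, 1/mol
J_TO_KCAL_PER_MOL = N_A / 4184.0


def debye_length(ionic_strength, eps=80.0, temperature=300.0):
    """Debye screening length in Angstrom; ionic strength in mol/L."""
    ions_per_m3 = ionic_strength * 1000.0 * N_A
    kappa_sq = 2.0 * E_CHARGE ** 2 * ions_per_m3 / (EPS0 * eps * K_B * temperature)
    return 1.0e10 / math.sqrt(kappa_sq)


def debye_huckel_energy(q1, q2, distance, ionic_strength=0.145, eps=80.0,
                        temperature=300.0):
    """Screened Coulomb energy (kcal/mol) of two point charges.

    q1, q2 in units of the elementary charge, distance in Angstrom.
    """
    screening = math.exp(-distance / debye_length(ionic_strength, eps, temperature))
    coulomb = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps * distance * 1.0e-10)
    return q1 * q2 * coulomb * screening * J_TO_KCAL_PER_MOL


# Conditions used in the calculations: 0.145 M ionic strength, eps = 80, 300 K.
print(round(debye_length(0.145), 1))                  # about 8.1 Angstrom
print(round(debye_huckel_energy(1, 1, 10.0), 2))      # about 0.12 kcal/mol
```

At these conditions two unit charges 10 Å apart interact by only about 0.1 kcal/mol, i.e., less than 0.1 pK unit.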
B. Variation of Stability of Lysozyme with pH

There are many experimental studies on the pK_a's (Kuramitsu and Hamaguchi, 1980; Bartik et al., 1994), titration curve (Tanford and Roxby, 1972; Roxby and Tanford, 1971), and pH-dependent stability of lysozyme (Pfeil and Privalov, 1976), making this protein an excellent test case for the Poisson-Boltzmann method. In fact, lysozyme was one of the first proteins for which a complete titration curve was measured (Roxby and Tanford, 1971). Hen egg-white lysozyme has 11 arginines, 7 aspartic acids, 2 glutamic acids, 1 histidine, 6 lysines, 3 tyrosines, and N- and C-termini, i.e., a total of 32 titrating sites with the list of sites given in Table I (see Section II,D for the reasons to exclude serine, threonine, and cysteine).

To use the method described earlier (see, in particular, Section II,B) for calculating the pH dependence of the stability of lysozyme, structures for the native and denatured state are required. An X-ray or nuclear magnetic resonance (NMR) structure is a good model for the native protein in solution, although the variation in side-chain conformers in different crystal structures introduces some uncertainty as to which, if any, is the best one to use. However, the uncertainty concerning the native state structure is much less than that in the case of the denatured state, for which insufficient experimental data exist for a structure determination. In fact, the available data suggest that the unfolded state of a protein is characterized by an ensemble of significantly different conformations (Privalov, 1992; Fersht et al., 1994; Fiebig et al., 1996). For simplicity, we use a linear structure as a model of the unfolded state in this study. The choice of the linear model corresponds to the limiting case for the unfolded protein. It is thus an important test case for an examination of the interactions between titrating sites in the unfolded state, which are assumed to be zero in the null model (see below) that has been used previously in protein stability calculations (Antosiewicz et al., 1994; Yang and Honig, 1993).
Also, in contrast to random coil structures, which can be generated by various computational techniques for denaturing a protein (Caflisch and Karplus, 1994; Hünenberger et al., 1995), the extended structure is uniquely defined by a set of φ/ψ angles (assuming that all side-chain angles are set to 180°) and can easily be generated and reproduced. High accessibility to the solvent is ensured for all side chains so that the maximum effect of denaturation on the stability is likely to be included. As a simple test of the dependence of the results on the choice of φ/ψ values for an extended structure, we use two different models of the unfolded state (see Fig. 5): first, the extended ideal β structure "Beta" with (φ, ψ) = (-140°, +135°); and, second, the "Ex72" model with (φ, ψ) = (-72°, +72°). For both models, the choice of the (φ, ψ) angles leads to a backbone conformation that is close to a local minimum of the vacuum energy (Schulz and Schirmer, 1978; Maccallum et al., 1995); while the total energy of the Ex72 model is higher than that of the ideal Beta model, the Coulomb energy (dielectric constant of 1, no cutoff) of Ex72 is lower than that of Beta. To generate the extended structures, all side-chain dihedral angles were initially set to 180°. The extended chains were then minimized using 100 steps of steepest descent minimization, to reduce conformational strain and remove bad contacts in the initial conformation (for details, see Schaefer et al., 1997). The structures obtained after minimization are referred to as Beta-m1 and Ex72-m1, respectively.

FIG. 5. Wire graph of lysozyme (heavy atoms). (a) Stereo view of the native state (Protein Data Bank entry 2lzt). (b) Extended structure "Beta" after 100 steps of steepest descent minimization (all φ/ψ angles set to -140°/+135° prior to minimization; N-terminus left). (c) Extended structure "Ex72" after 100 steps of steepest descent minimization (all φ/ψ angles initially set to -72°/+72°; N-terminus left). Scale bars: 20 Å, 100 Å, and 100 Å.

In the null model of the unfolded state, it is assumed that there are no interactions between the ionizable groups and that the sites titrate with their standard pK_a's, i.e., the null model corresponds to the trivial model introduced in Section II,C with $\mathrm{p}K_{a,i} = \mathrm{p}K_{a,i}^{\mathrm{std}}$. The pH-dependent electrostatic free energy and the titration curve for the null model can thus be calculated using the analytical expressions in Eqs. (28) and (27). Since the electrostatic free energy of the fully unprotonated state is undefined in the null model, the free energy difference $\Delta G_F(\vec{0})$ is set to 0 in Eq. (19), such that only the relative pH stability of proteins, Eq. (20), can be determined in this case. The null model has also been used as a reference in pK_a calculations for native protein structures, where the average error of the computational approach is compared with the average difference between the experimental pK_a's and the standard pK_a's (Antosiewicz et al., 1994); see Fig. 4.

Structures to represent the native state of hen egg-white lysozyme were taken from the Protein Data Bank (Bernstein et al., 1977). As for the test pK_a calculations, both the triclinic crystal structure [entry 2lzt (Ramanadham et al., 1981)] and the tetragonal crystal structure [entry 1hel (Malcolm et al., 1990)] were used to obtain a measure of the effect of different native configurations.
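The null model described above, in which the sites titrate independently with their standard pK_a's, can be sketched for lysozyme using the standard pK_a's of Table I and the site counts given at the beginning of this section. The independent-site expression 1/(1 + 10^(pH - pK_a)) is used for the protonation of each site; the function names and the assignment of a unit positive charge to the protonated form of Arg, His, Lys, and the N-terminus (and of zero charge to the protonated form of the other sites) are assumptions of this illustration, chosen to agree with the charge sums in Table I.

```python
# site type: (standard pKa from Table I, number of sites in hen egg-white
# lysozyme, net charge of the protonated form)
LYSOZYME_SITES = {
    "Arg":   (12.48, 11, +1),
    "Asp":   (4.00, 7, 0),
    "Glu":   (4.40, 2, 0),
    "His":   (6.42, 1, +1),
    "Lys":   (10.79, 6, +1),
    "Tyr":   (10.13, 3, 0),
    "N-Ter": (7.50, 1, +1),
    "C-Ter": (3.80, 1, 0),
}


def protonation(ph, pk):
    """Average protonation of an independently titrating site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pk))


def null_model_titration(ph, sites=LYSOZYME_SITES):
    """Total number of bound protons and net charge in the null model."""
    n_protons = 0.0
    charge = 0.0
    for pk, count, q_protonated in sites.values():
        theta = protonation(ph, pk)
        n_protons += count * theta
        # the unprotonated form carries one unit of charge less
        charge += count * (q_protonated - (1.0 - theta))
    return n_protons, charge


for ph in (2.0, 4.0, 7.0, 10.0):
    n_protons, charge = null_model_titration(ph)
    print(f"pH {ph:4.1f}: protons bound = {n_protons:5.2f}, net charge = {charge:+6.2f}")
```

In the limit of low pH all 32 sites are protonated and the net charge approaches +19; in the limit of high pH it approaches -13.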
It has been shown (Bashford and Karplus, 1990) that the two structures give significantly different pK_a values for some sites. Hydrogen positions were calculated using the HBUILD command (Brünger and Karplus, 1988) within CHARMM (Brooks et al., 1983), and van der Waals radii (R_min/2 of the Lennard-Jones potential) were taken from the all-hydrogen parameter set param22 of CHARMM (MacKerell et al., 1992; MacKerell et al., 1998). As in the pK_a calculations, the ionic strength was set equal to 0.145 M; the protein and solvent dielectric constants were assigned 20 and 80, respectively (see Section III,A), and the temperature was T = 300 K. The stability measurements (Pfeil and Privalov, 1976) were performed in 0.1 M NaCl at 298 K, and the titration curve measurements (Tanford and Roxby, 1972) were made at 298 K with a 0.1 M concentration of KCl for the native lysozyme and with an additional 6 M concentration of guanidine hydrochloride as a denaturing agent for the unfolded state.
1. Titration Curve Because of the large number (32) of titrating groups, the free energy and pH stability of lysozyme was calculated using the titration curve integration method [Eq. (19)]. To this end, we determined the titration curves for the crystal structures 21zt and lhel, and for the unfolded protein models null, Beta, and Ex72. Figures 6 and 7 show, respectively, the experimental and calculated titration curves for the folded and unfolded protein. From Fig. 7, there are only minor differences between the titration curves of the two crystal structures: the maximum difference is lA(Q>lmax= 0.72 at pH = 2.2. The maximum differences between the titration curves of the unfolded lysozyme models are also small, IA(Q>lmax= 0.32 at pH = 3.8 between null and Beta-ml, and lA(@lmax = 0.80 at pH = 4.0 between null and Ex72-ml (see Fig. 7b). This is to be compared with the maximum difference between the calculated titration curves of the crystal structure 21zt and the null model; the maximum is lA(Q>lmax= 5.02 at pH = 3.4 (see Fig. 7a). The maximum difference between the experimental titration curves of the native and unfolded lysozyme is IA(Q)lmax = 3.45 at pH = 3.0 (see Fig. 6). Thus, the results from the two crystal structures, on the one hand, and between the three 20 16 12
8 4
PH FIG.6. Experimental titration curves of native (0,solution 0.1 M Kcl) and denatured lysozyme (+, solution 6 M guanidine hydrochloride); data taken from Tanford and Roxby (1972).
30
MICHAEL SCHAEFER ET AL.
FIG. 7. Calculated titration curves of lysozyme as a function of pH. (a) Crystal structures 2lzt (-) and 1hel (---); for comparison, the titration curve according to the null model is also shown (--). (b) Models of the unfolded protein: null (-), Beta-m1 (---), and Ex72-m1 (-.-.-).
models of the unfolded lysozyme, on the other, are much more similar to each other with respect to the calculated titration curves than are the titration curves of the native and unfolded protein. It is interesting to note that in Figs. 6 and 7a, the largest difference occurs in the same pH region (between pH 2.5 and 4). This region involves the carboxyl groups, which have the largest pKa shifts in the native relative to the denatured state. The calculated average pKa shift for the carboxyl groups in the unfolded lysozyme structure, -0.18 (Beta-m1) and -0.29 (Ex72-m1), is in approximate agreement with recent studies by Oliveberg et al. (1995), who estimated that, on average, the pKa's of carboxyl groups in denatured barnase are 0.4 pK units lower than the standard pKa's of the sites. In Fig. 8a and b, the calculated titration curves for the native and unfolded lysozyme structures are shown in comparison with the experimental titration curves taken from Tanford and Roxby (1972) (see Fig. 6). The average difference between the calculated and the experimental titration curves in the experimental pH range from 1.5 to 10.5 is 0.82 (in elementary charge units) for the crystal structure 2lzt. For the titration curve of the unfolded lysozyme, we also find good agreement between theory and experiment. The average error between the calculated and experimental titration curves for the Beta-m1 model is 0.52; the error for the Ex72-m1 model is larger, 0.69 (titration curve not shown). For both extended models of unfolded lysozyme, the agreement with the experimental titration curve is significantly better than for the null
FIG. 8. Titration curve of lysozyme as a function of pH. (a) Native state: calculated for the triclinic crystal structure 2lzt (-); experimental data (○). (b) Denatured state: calculated for the extended Beta-m1 model (-); experimental data (+).
model, for which we determined an average error of 1.13. This implies that the extended structures Beta and Ex72 are useful models of the unfolded protein in the context of titration and pH-stability calculations. The agreement between the calculated and the experimental titration curves is best for the unfolded models Beta-m1 and Ex72-m1, i.e., after the first 100 steps of minimization. The average errors for the titration curves of the unminimized extended models Beta and Ex72 are 0.56 and 0.73, respectively. This may be a consequence of the high conformational energy (i.e., low probability) of the unminimized Beta and Ex72 structures, which are generated by setting all side-chain dihedral angles to 180° and the bond lengths, bond angles, and dihedrals of the protein backbone (including prolines) to ideal values.

2. Relative Electrostatic Stability

From Eq. (20), the titration curves of two conformers of a system are sufficient for calculating the relative free energy difference as a function of pH. Since we found the best agreement between the calculated and the experimental pKa's and titration curve of lysozyme when using the crystal structures 2lzt and 1hel, we report the results obtained with them. For the same reason, we use Beta-m1 and Ex72-m1 as the models of the unfolded protein. In Fig. 9a, the relative stability ΔΔG, Eq. (20), of triclinic lysozyme (2lzt) is shown relative to the null, Beta-m1, and Ex72-m1 models of the unfolded protein. The experimental stability curve (Pfeil and Privalov, 1976), which has been determined in the pH range 1.5-7, is also given;
FIG. 9. Relative pH-dependent stability. (a) Native lysozyme (2lzt) using the null (-), the Beta-m1 (---), and the Ex72-m1 (...) models of the unfolded protein as the reference state; experimental data (○) for the net stability (total folding free energy, including nonelectrostatic contributions) of native lysozyme in 0.1 M NaCl solution taken from Pfeil and Privalov (1976); data points (+) calculated with the titration curve integration method, Eq. (19), using the experimental titration curves (Tanford and Roxby, 1972) of the native and the unfolded lysozyme for the integration and setting ΔΔG at pH = 7 equal to the experimental value from Pfeil and Privalov (1976). (b) Unfolded models Beta-m1 (-) and Ex72-m1 (--) of lysozyme, using the null model as the reference.
the aggregation of lysozyme has so far prevented accurate measurements at high pH (C. M. Dobson, private communication, 1996). For comparison, we also calculated a "semiexperimental" stability curve using the titration curve integration method, Eq. (19), with the experimental titration curves of the folded and unfolded lysozyme (Tanford and Roxby, 1972). We used pH = 7 with its associated experimental stability from Pfeil and Privalov (1976) as the reference pH and reference stability, since the experimental titration curves do not extend to a pH where the unprotonated state dominates [see Eq. (20) and related text]. Although there is approximate correspondence between the measured stability curve and that calculated from the titration curves, particularly for the change in stability in the pH 2-4 range where the carboxyl groups titrate, there are significant differences at higher pH. These may be due to the assumption that the nonelectrostatic contributions to protein stability are pH independent, as well as to differences in the experimental conditions in the titration curve measurements of the native and unfolded states, in particular the ionic strength of the solutions used in the measurements (Tanford and Roxby, 1972).

Regardless of the structures used, the relative stability curves always approach zero at high pH. This is a consequence of the definition of ΔΔG, which uses the unprotonated state of the system as the reference where ΔΔG = 0 [see Eq. (20) and related text]. Other choices for the reference state of ΔΔG are possible, and they would lead to a different offset for the relative stability curves in Fig. 9. It is coincidental, therefore, that the calculated relative stability in Fig. 9 and the experimental stability of lysozyme are in the same energy range. First, nonelectrostatic free energy contributions to protein stability are not accounted for in the present theory; and, second, the electrostatic free energy difference between the native and the unfolded protein in the unprotonated reference state is not accounted for in the relative stability results [see Eq. (20) and the thermodynamic cycle in Fig. 1]. As is expected from the good agreement between the titration curves of Beta-m1 and Ex72-m1 (see Fig. 7b), the dependence of ΔΔG on the unfolded model structure used in the calculation is small when compared to the change in the free energy between pH = 7 and the very low and very high pH limits. In accord with experiment, the theoretical pH-stability curve predicts a plateau for ΔΔG in the pH range 4 ≤ pH ≤ 7 and an increase of the relative free energy of the order of 10 kcal/mol when the pH is decreased from 4 to 1.5. Pfeil and Privalov (1976) report an overall Gibbs free energy difference between the folded and the denatured state of lysozyme equal to -14.5 kcal/mol under standard conditions (pH = 7, T = 25°C); at pH = 1.5, the experimental value is -5.3 kcal/mol. Thus, the change in experimental stability between pH = 1.5 and 7 is -9.2 kcal/mol. This is to be compared with the calculated stability difference between pH = 1.5 and 7 of -13.6 kcal/mol (2lzt/Beta-m1) and -11.8 kcal/mol (2lzt/Ex72-m1). If the null model is used as the unfolded reference state, the predicted change in the stability is -16.5 kcal/mol.
The use of the explicit Beta and Ex72 models of the unfolded protein clearly leads to a better agreement with the experimental pH dependence of the stability of lysozyme. To analyze the difference between the extended models of the unfolded protein and the null model in more detail, we calculated the stability curves of Beta-m1 and Ex72-m1 relative to the null model. From Fig. 9b, it follows that the relative free energy of Beta-m1 (Ex72-m1) changes by 1.4 (1.3) kcal/mol upon variation of the pH from 7 to 14, and by -2.9 (-4.7) kcal/mol upon a pH change from 2 to 7. This implies that the interactions between titrating sites in the unfolded state, although small, are not negligible. The calculated changes in the stability of 3-5 kcal/mol and about 1.5 kcal/mol upon variation of the pH from 7 to 2 and from 7 to 14, respectively, are likely to be a lower bound to the error that results from the assumption of zero interaction between sites in the null model, since the extended models are designed for
maximum solvent exposure of the titrating sites and, as a consequence, minimal interaction between sites. In a random coil structure or a molten globule state of a protein (Ptitsyn, 1992; Fiebig et al., 1996), residual secondary and tertiary structure would cause the interactions between titrating sites to be larger than for the extended models used in this study.
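The null model discussed above treats every titrating site as independent, so its titration curve is a sum of single-site Henderson-Hasselbalch terms. The following is a minimal sketch of that idea, not the authors' program; the pKa values and the function names are hypothetical and chosen only for illustration.

```python
def protonated_fraction(pH, pKa):
    """Fraction of one isolated site that carries its proton at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def null_model_titration(pH_values, pKas):
    """Average number of bound protons for non-interacting sites."""
    return [sum(protonated_fraction(pH, pKa) for pKa in pKas) for pH in pH_values]


# Hypothetical standard pKa values (illustration only, not lysozyme data).
HYPOTHETICAL_PKAS = [3.8, 4.1, 6.3, 10.4]

pH_grid = [i / 10 for i in range(0, 141)]  # pH 0.0 to 14.0 in steps of 0.1
curve = null_model_titration(pH_grid, HYPOTHETICAL_PKAS)
print(f"bound protons at pH 7: {curve[70]:.2f}")
```

Any deviation of a structure-based titration curve from such a sum reflects site-site interactions and desolvation effects, which is what the comparison with the extended models probes.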
3. Absolute Electrostatic Stability

Use of the extended model of the unfolded protein makes it possible to calculate a meaningful absolute electrostatic free energy difference between the native state and the unfolded state. According to Eq. (19), this requires the calculation of the electrostatic free energy of the fully unprotonated protein in solution for both conformations (here, the native and the extended structures). The absolute electrostatic free energy difference between the conformers A and B is given by the relative free energy difference plus the difference between the electrostatic free energies of the unprotonated reference states, $\Delta G_{AB}(\mathrm{pH}) = \Delta\Delta G_{AB}(\mathrm{pH}) + \Delta G^{el}_{AB}(\mathbf{0})$ [see Eq. (20)]. It follows that the absolute electrostatic free energy curves in Fig. 10 are shifted by $\Delta G^{el}_{AB}(\mathbf{0}) = G^{el}_{A}(\mathbf{0}) - G^{el}_{B}(\mathbf{0})$ relative to the curves in Fig. 9.
FIG. 10. pH-dependent absolute electrostatic contribution to the stability of lysozyme: triclinic crystal structure 2lzt relative to the Beta-m1 (-) model; 2lzt relative to the Ex72-m1 (---) model; tetragonal crystal structure 1hel relative to the Beta-m1 (-.-.-) model; and 1hel relative to the Ex72-m1 (...) model of the unfolded protein. Experimental (○) and semiexperimental (+) stability data are the same as in Fig. 9a.
TABLE II
Electrostatic Free Energy(a) for the Charge States(b) 0, s^0, s^std, and 1

Structure   s        E^coul(s)   G^solv(s)   G^el(s)(c)
2lzt        0           -80.4      -102.5      -182.8
            s^0        -128.2       -17.1      -145.3
            s^std      -161.2       -77.1      -238.3
            1             2.4      -186.4      -184.0
1hel        0           -82.6      -102.8      -185.3
            s^0        -129.8       -16.8      -146.6
            s^std      -163.1       -77.1      -240.2
            1             0.8      -186.2      -185.4
Beta-m1     0          -104.4       -79.4      -183.8
            s^0        -104.2       -31.2      -135.4
            s^std      -122.9      -100.0      -222.9
            1           -62.7      -119.0      -181.7
Ex72-m1     0          -120.6       -73.3      -193.9
            s^0        -125.4       -21.4      -146.7
            s^std      -143.0       -90.0      -233.1
            1           -75.3      -116.6      -191.9

(a) Energies in kcal/mol; protein ε_i = 20, solvent ε_s = 80, monovalent ion concentration 0.145 M.
(b) 0, unprotonated state; s^0, all sites neutral; s^std, standard charge state at pH = 7; 1, protonated state.
(c) G^el = E^coul + G^solv.

In Table II, the Coulomb energy, the electrostatic solvation free energy, and their sum, the electrostatic free energy in solution, are given in four different charge states for the structures 2lzt, 1hel, Beta-m1, and Ex72-m1 used for the stability calculations. The electrostatic free energy in solution is always the lowest for the standard charge state (all sites in the standard state at pH = 7); this is in accord with the fact that lysozyme is most stable in the pH range from 5 to 9. Interestingly, the Coulomb energy alone is also the lowest for the standard charge state of each structure, but the relative ranking of the four charge states with respect to $E^{coul}$ and $G^{el}$ is different. Whereas the two crystal structures 2lzt and 1hel have very similar values in each of the charge states, the unfolded structure Ex72-m1 has energies in all charge states that are about 10 kcal/mol lower than for Beta-m1. Since the only electrostatic energy term that all four charge states have in common is the interaction of (nontitrating) polar groups, the more negative electrostatic free energy of Ex72-m1 must arise from the polar groups; specifically, it is due to
the hydrogen bonds of the backbone that are present in Ex72-m1 but not in Beta-m1. The electrostatic free energy differences between the four pairs of native and unfolded lysozyme structures are given in Table III. The same set of charge states as in Table II is compared. Since the values of $G^{el}(s)$ for 2lzt and 1hel are very similar, $\Delta G^{el}_{AB}(s)$ depends mainly on whether Beta-m1 or Ex72-m1 is used for the unfolded state. As explained above, the folding free energy $\Delta G^{el}_{AB}(\mathbf{0})$ of the unprotonated state is required to calculate the absolute pH stability from the relative stability. While the other charge states could, in principle, also be used as the reference states in the thermodynamic cycle in Fig. 1, their folding free energies are mainly of interest for comparison with the calculated pH-stability curves.

TABLE III
Electrostatic Free Energy Difference(a) for the Charge States(b) 0, s^0, s^std, and 1

Native   Unfolded   s        ΔG^el_AB(s)(c)
2lzt     Beta-m1    0            1.0
                    s^0         -9.9
                    s^std      -15.4
                    1           -2.3
2lzt     Ex72-m1    0           11.0
                    s^0          1.4
                    s^std       -5.2
                    1            7.9
1hel     Beta-m1    0           -1.5
                    s^0        -11.2
                    s^std      -17.2
                    1           -3.7
1hel     Ex72-m1    0            8.5
                    s^0          0.1
                    s^std       -7.1
                    1            6.5

(a) Energies in kcal/mol; parameters as in Table II.
(b) Charge states as defined in Table II.
(c) ΔG^el_AB = G^el_A - G^el_B.

The folding free energies in Table III are all in a range from about 0 to -20 kcal/mol. Even with the use of a relatively high dielectric
constant of ε_i = 20 for the protein interior, the solvation free energies in Table II are in a range from -10 to -190 kcal/mol. The error in the FDPB calculation of solvation free energies must, therefore, be less than a few percent to obtain meaningful free energy differences between conformers. In Fig. 10, the absolute stability curves for the four combinations of the native and the unfolded state, 2lzt/Beta-m1, 2lzt/Ex72-m1, 1hel/Beta-m1, and 1hel/Ex72-m1, are shown. They have almost identical shape, which is consistent with the good agreement between the relative stability curves in Fig. 9a (2lzt only, curves for 1hel not shown). However, the calculated absolute free energy difference at a given pH varies by approximately 15 kcal/mol, depending on the choice of the native and unfolded structure. Use of the tetragonal crystal structure 1hel instead of the triclinic crystal structure 2lzt leads to a decrease by about 2 kcal/mol of the calculated free energy of the native state, i.e., to an increase of the calculated stability (negative shift). On the other hand, use of the Ex72-m1 structure instead of Beta-m1 for the unfolded conformation leads to a decrease of the free energy of the unfolded state by about 10 kcal/mol, i.e., to a decrease of the calculated stability (positive shift, upper two curves in Fig. 10). A comparison of the folding free energy $\Delta G^{el}_{AB}(s^{std})$ of lysozyme in the standard charge state from Table III with the calculated pH stability $\Delta G_{AB}(\mathrm{pH})$ at pH = 7 from Fig. 10 shows that there are only minor differences: for the native/unfolded structure pairs 2lzt/Beta-m1, 2lzt/Ex72-m1, 1hel/Beta-m1, and 1hel/Ex72-m1, one has $\Delta G^{el}_{AB}(s^{std})$ = -15.4, -5.2, -17.2, and -7.1 kcal/mol, while $\Delta G_{AB}(\mathrm{pH} = 7)$ = -15.9, -6.0, -17.9, and -8.0 kcal/mol, i.e., a maximum difference of 0.9 kcal/mol.
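The comparison above can be reproduced directly from the tabulated numbers. The sketch below derives the folding free energy differences from the G^el(s) column of Table II and compares the standard-charge-state values with the pH = 7 values quoted in the text; the dictionary layout and function name are illustrative assumptions, and differences of 0.1 kcal/mol relative to Table III are expected from rounding of the tabulated inputs.

```python
STATES = ("0", "s0", "s_std", "1")

# G^el(s) in kcal/mol from Table II, in the order of STATES.
G_EL = {
    "2lzt":    (-182.8, -145.3, -238.3, -184.0),
    "1hel":    (-185.3, -146.6, -240.2, -185.4),
    "Beta-m1": (-183.8, -135.4, -222.9, -181.7),
    "Ex72-m1": (-193.9, -146.7, -233.1, -191.9),
}

# Calculated pH stability at pH = 7 as quoted in the text (read from Fig. 10).
DG_PH7 = {
    ("2lzt", "Beta-m1"): -15.9,
    ("2lzt", "Ex72-m1"): -6.0,
    ("1hel", "Beta-m1"): -17.9,
    ("1hel", "Ex72-m1"): -8.0,
}


def folding_free_energy(native, unfolded):
    """Delta G^el_AB(s) = G^el_A(s) - G^el_B(s) for each charge state."""
    return {
        s: round(g_a - g_b, 1)
        for s, g_a, g_b in zip(STATES, G_EL[native], G_EL[unfolded])
    }


max_diff = 0.0
for (native, unfolded), dg_ph7 in DG_PH7.items():
    dg = folding_free_energy(native, unfolded)
    diff = abs(dg["s_std"] - dg_ph7)
    max_diff = max(max_diff, diff)
    print(f"{native}/{unfolded}: s_std {dg['s_std']:6.1f}  pH 7 {dg_ph7:6.1f}")
print(f"maximum difference: {max_diff:.1f} kcal/mol")
```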
It follows that the standard charge state represents well the equilibrium distribution of charge states in the pH range between 6 and 10, in which there is little change in the calculated pH stability. In part, this is a consequence of the fact that only two groups (His, N-terminus) titrate in this pH range and that their titration behavior is similar in the folded and unfolded state. The electrostatic free energy of folding for the electrically neutral state, $s = s^0$, differs from the folding free energy of the standard charge state (corresponding to the broad minima in the stability curves in Fig. 10) by about 6-7 kcal/mol. The value is nearly independent of the pair of crystallographic and extended structures that are used to represent the native and the unfolded states. Furthermore, the folding free energy of the uncharged state (polar interactions only) exhibits changes as a function of the native/unfolded structures that are very similar to the changes observed for the minimum of the pH stability in Fig. 10 (and for the folding energy of the standard charge state). It follows that it is the variation in the interactions of polar atom groups in the different lysozyme structures that accounts for the variation in the calculated absolute pH stability of lysozyme and that the interaction of titrating sites is comparatively independent of the choice for the native and unfolded structure. One reason for this difference between the contributions from polar and charged atom groups is that the latter involve mainly long-range interactions, whereas the former are due to short-range interactions (e.g., hydrogen bonds), which are more sensitive to structural changes.

4. Conclusions
We have used the linked function treatment of protein titration to calculate the titration curves of the crystal structure and an extended structure of lysozyme as models for the native and denatured state. Good agreement with experimental results (average difference 0.5-0.8 elementary charge units in the experimental pH range) was obtained. If one assumes that the individual site titration curves of the 32 sites contribute a random error of ±δ each, this corresponds to an average error of ±0.09 to ±0.14 (9-14%) for each site. For the extended model of the denatured state, the agreement with the experimental titration curve is better than that for the independent sites (null) model using the standard pKa's for all titrating groups (extended model: average error 0.5-0.7 elementary charge units; null model: average error 1.1). By calculating the relative pH stability of the extended model, using the null model as the reference, we have demonstrated that there are significant interactions between titrating sites in the denatured state. The electrostatic free energy contribution from titrating sites in the extended molecule varies from -1.5 to 3.3 kcal/mol for the pH in the range 2-10. The experimental data on lysozyme stability include all contributions to the free energy, whereas the present theory accounts only for the electrostatic contribution. Therefore, the comparison of the absolute electrostatic pH stability with experimental results in Fig. 10 is only qualitative. We obtained good agreement between the calculated and the experimental change in the stability of lysozyme when the pH was varied from 7 to 1.5 (acid denaturation): the experimental free energy change is 9.2 kcal/mol, as compared with the calculated free energy change of 12-13.5 kcal/mol. However, there is considerable uncertainty as to the absolute electrostatic contribution, depending on the structures used to represent the native and the unfolded state.
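The per-site error estimate quoted above follows if the 32 site errors are taken as independent and combined in quadrature, so that the total error equals δ times the square root of the number of sites. A minimal sketch under that assumption (the function name is illustrative):

```python
import math


def per_site_error(total_error, n_sites):
    """Per-site error delta, assuming independent errors that add in quadrature:
    total_error = delta * sqrt(n_sites)."""
    return total_error / math.sqrt(n_sites)


N_SITES = 32  # number of titrating groups in lysozyme, as stated in the text
for total in (0.5, 0.8):  # average titration curve error, elementary charge units
    delta = per_site_error(total, N_SITES)
    print(f"total {total:.1f} -> per site {delta:.2f}")
```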
In an improved theoretical treatment of the pH stability of proteins, it will therefore be necessary to include conformational equilibria or to devise a method for selecting sets of conformations representative of the native and the unfolded state.

C. Capsid Stability of Foot-and-Mouth Disease Virus
Foot-and-mouth disease virus (FMDV) is a member of the picornavirus family and consists of a protein capsid containing one molecule of positive-sense single-stranded RNA. The capsid has icosahedral symmetry and consists of 60 identical protomers (Rueckert, 1990). Every protomer has a total of approximately 735 residues and contains four noncovalently linked chains: VP1, VP2, VP3, and VP4 (molecular weights: 23,000, 25,000, 24,000, and 8000, respectively). Crystal structures are available for serotypes O1BFS (Protein Data Bank entry 1fod), CS8c1 (1fmd), and A22 (Curry et al., 1996). Of all picornaviruses, FMDV is the most sensitive to low pH values. When intact FMDV is taken to a pH below 7, the structure dissociates into pentamers, which appear to be symmetric assemblies of five protomers (Randrup, 1954; Brown and Cartwright, 1961; Burroughs et al., 1971). In this dissociation process, the RNA is released and the smallest of the four chains, VP4, is detached from the protomers. It has been proposed that this provides the mechanism by which FMDV delivers its RNA into the cytosol of the infected cell (Baxt, 1987; Carrillo et al., 1984, 1985; Mason et al., 1993). Other members of the picornavirus family may also use acid-induced conformational changes within the endosome to effect cell entry. Like FMDV, the cardiovirus mengovirus also dissociates to pentamers (at pH 6.2) (Mak et al., 1970). Human rhinovirus (HRV) undergoes a conformational change at pH 5 similar to that generated by its interaction with susceptible cells, although it does not dissociate (Korant et al., 1972). Nevertheless, antibiotics that specifically inhibit the vacuolar proton ATPases responsible for endosomal acidification block HRV infectivity (Perez and Carrasco, 1993). Exceptionally, poliovirus infection is not affected by these agents. Thus, endosomal acidification appears not to be the universal mechanism for picornavirus cell entry. Recently, Curry et al. (1995) measured in vitro stability curves as a function of pH for three different subtypes of FMDV: A10, A22, and A24. A capture enzyme-linked immunosorbent assay (ELISA) was used to monitor the dissociation of capsids to pentamers as a function of pH. The pH corresponding to the midpoint of this transition as determined by the ELISA signal (pH50) was determined for each sample. The A22 subtype was the least acid stable of the three, with a pH50 of 7.0, while the
pH50 values for A10 and A24 were 6.5 and 6.65, respectively. In addition, stability curves were reported for FMDV empty capsids, which lack the RNA strand. The empty capsids of all three subtypes were found to be more acid-stable by about 0.5 pH units than the corresponding virion. Curry et al. (1995) and Twomey et al. (1995) identified histidines at the pentamer-pentamer interface that might determine acid lability. Curry et al. suggested H142 (VP3) and possibly H65 (VP2). The former was suggested because it is close (4 Å) to the N-terminal end of an α-helix of the neighboring pentamer, and the latter was implicated based on sequence differences between A22 and A10/A24, i.e., A10 has a Phe and A24 has a Tyr at that position. Curry et al. reasoned that the positive end of the α-helix dipole would result in an electrostatic repulsion between pentamers when H142 is protonated. Twomey et al. counted the number of positively and negatively charged residues on the neighboring pentamer around His residues at the interface: H142 and H145 (VP3) were found to have more basic than acidic residues around them, so it was presumed that they would be destabilizing at low pH when they are positively charged. Since the crystal structures of the viruses do not include the RNA, the calculations are made for empty capsids. However, the resulting pH dependence may well approximate the behavior of native viruses, since the empty and native capsid structures are very similar. FMDV is chosen for the electrostatic calculations, rather than HRV or poliovirus, since it has been shown to dissociate into pentamers (Brown and Cartwright, 1961), whereas this has not been demonstrated for HRV or poliovirus. FMDV dissociates into pentamers at a pH that is very close to the pH value at which the crystal structure was obtained. It is thus likely that the pentamers are structurally conserved in the pH range of interest (pH 6-7).
We report the calculations of titration curves and stability profiles for the FMDV serotype O1BFS. The main purpose is to identify the residues that are important for the pH stability and to determine the reason for their importance. Furthermore, we quantify the electrostatic interaction between H142 (VP3) and the adjacent α-helix, and the charged residues in its environment, to determine whether the hypotheses of Curry et al. (1995) and Twomey et al. (1995) are consistent with the calculations. For comparison, results from calculations of the absolute electrostatic free energy of binding are also given for the FMDV serotypes A10₆₁ and A22 Iraq. Studies of FMDV have shown that the capsid dissociates into pentamers of protomers as the pH is reduced. The pentamer interface is formed by the packing of two protomers that are related by twofold symmetry. In the calculations, we used a model system involving the association of two protomers as indicated in Fig. 11. Although the side chains at the dimer interface may change their conformation upon dissociation, the overall protomer conformation is likely to be essentially unchanged, especially since in reality it is still part of the pentamer (see Fig. 11). The model omits consideration of protein-protein interactions at the threefold icosahedral symmetry axis and focuses on the interactions at the center of the interpentamer interface, i.e., interactions close to the twofold icosahedral axis. This simplification is justified because the vast majority of atom-atom contacts between pentamers are within the interface surrounding the twofold symmetry axis, and because the residues that have been suggested to be most important for pH dependence in FMDV (Curry et al., 1995; Twomey et al., 1995) are located in the vicinity of the twofold symmetry axis and are distant from the threefold symmetry axis (~35 Å).
FIG. 11. Schematic illustration of the association of two protomers treated in the calculations. In the dimer, the location of the capsid proteins VP1, VP2, and VP3 is indicated. VP4 is located on the inside of the capsid (below the paper plane). To show that the dimer captures most of the pentamer-pentamer interface, we also included the remaining protomers of both pentamers (dashed lines). The icosahedral twofold and threefold symmetry axes are indicated by numbers 2 and 3, respectively.
In the dissociated state, the two protomers titrate independently, such that the titration curve is equal to the sum of the titration curves of the protomers. Furthermore, since we assume that the structure of the protomers (proto) is symmetry related and remains the same after dissociation, the titration curve of the dissociated system (diss) and its electrostatic free energy in the unprotonated state are

$$\langle n(\mathrm{pH})\rangle_{diss} = 2\,\langle n(\mathrm{pH})\rangle_{proto} \qquad (39)$$

$$G^{el}_{diss}(\mathbf{0}) = 2\,G^{el}_{proto}(\mathbf{0}) \qquad (40)$$
To calculate the electrostatic free energy of association for the two protomers, we thus use Eqs. (39) and (40) for the titration curve and electrostatic free energy of conformation B in Eqs. (20) and (21), and the titration curve and electrostatic free energy of the protomer dimer for conformation A. The solvent and protein dielectric constants were set to ε_s = 80 and ε_i = 20 (see Section III,A), the ionic strength to 0.145 M, and the temperature was 293 K. Since we are mainly interested in the stability of the complex in the pH interval from 6 to 8, we selected Asp, Glu, His, and the C- and N-terminal groups for the titration calculations, i.e., we excluded sites whose standard pKa is far from the pH interval of interest [see Section II,D for the standard pKa values (Table I)]. This results in 90, 88, and 90 titratable residues per protomer for types O1, A10, and A22, respectively. In the titration program of Beroza et al. (1991), 7500 Monte Carlo steps were performed to determine the average protonation of all sites at a given pH, and the titration curves were calculated using 121 pH values evenly distributed in the interval from 0 to 12. The crystal structure (resolution 2.6 Å) of type O1BFS FMDV (full name: O1BFS1860) was obtained from the Protein Data Bank entry 1fod (Bernstein et al., 1977). Crystal structures for the types A10₆₁ and A22 Iraq 24/64 (Curry et al., 1996), and the empty capsid structure of A22 (called A22E), were provided by S. Curry, E. Fry, and D. Stuart (all structures refined to 3.0 Å resolution). The root mean square deviation (Cα only) after superposition between A22 and A22E is 0.35 Å. The protomers consist of 736, 736, and 737 residues for O1BFS, A10, and A22, respectively. Several portions of the structure are missing from the crystal coordinates. For O1BFS, these are residues 211-213 (C-terminus) of VP1, 1-4 of VP2, and 1-14 and 40-64 of VP4. For A10, VP1 lacks residues 134-154 and 209-212, VP2 lacks residues 1-11, and VP4 lacks residues 1-14 and 40-64. In A22, residues 137-155 and 211-213 of VP1, 1-11 of VP2, and 1-14 and 39-64 of VP4 are missing. The missing
segments were not included in the calculations. This is unlikely to influence the calculations significantly, since only one of the missing segments, the N-terminus of VP2, is near the pentamer-pentamer interface. Furthermore, the N-terminus of VP2 is not close to the center of the interface, where the histidines of interest are located. Hydrogen atoms were added to the structures with the HBUILD command of the CHARMM program (Brooks et al., 1983).

1. Absolute Electrostatic Free Energy of Binding
The electrostatic contribution to the absolute binding free energy of the dimer can be obtained from Eq. (19). This requires the calculation of the total electrostatic free energy of the unprotonated state, which consists of the Coulomb energy and the electrostatic contribution to the solvation energy, for both the protomer and the dimer (see Table IV). Since only Asp, Glu, His, N-terminal, and C-terminal residues are treated as titratable sites, the “unprotonated” state of the system includes Lys and Arg residues in their protonated (ionized) states and Cys and Tyr in their protonated (neutral) states. In fact, the unprotonated state in this study corresponds to the standard charge state of the system at pH = 7, except for the N-termini, which are neutral. Table V lists the calculated absolute values of the electrostatic binding free energy at different pH values. The value of ΔG_bind(∞) corresponds to the binding free energy for the fully unprotonated state. The pH_opt of the lowest (optimum) electrostatic binding free energy was determined from the binding free energy curve according to Eq. (19). Although there are substantial differences in the free energy of solvation and in the Coulomb energy of the different dimer (protomer) structures in the unprotonated state (Table IV), their overall binding energies are relatively similar (-5 to -12 kcal/mol). Because there are many contributions to stability other than electrostatic ones (Lazaridis et al., 1995), the absolute ΔG_bind(pH) values cannot be compared directly with experimental data. From the small difference (maximum difference 1.35 kcal/mol) between the binding energies of the unprotonated state (here: Asp, Glu, C-terminus, Arg, and Lys in their ionized states; see above) and the binding energy at the pH_opt of binding, it follows that optimal binding requires the anionic sites that are allowed to titrate in the calculations to be predominantly in their ionized states. Further, it is interesting to note that the binding energy at the pH_opt is always more favorable than the binding energy that is calculated for the unprotonated (standard, except N-termini) charge state. To evaluate the contribution to the binding energy that originates from the interaction of the ionic sites, we also calculated the binding energy of the system at pH = 0 (Table V). The ΔG_bind values under these extreme pH conditions are positive and on the order of 100 kcal/mol. This is caused by positively charged sites whose loss in solvation energy upon complexation is normally balanced by the interaction with negatively charged residues (Asp, Glu), which are now neutral. Thus, although electrostatic binding energies are relatively small at the pH_opt (Table V), they can be very large and unfavorable under extreme pH conditions. Of course, such extreme conditions are unphysical since they would presumably lead to unstable monomers. Nevertheless, they are of interest because they demonstrate the importance of charge-charge interactions under physiological conditions.

TABLE IV
Coulomb, Solvation, and Binding Free Energy^a for the Unprotonated State

Structure   Solvation^b            Coulomb^b               ΔG_bind(∞)
1fod        -697.01 (-694.94)      -1049.99 (-1046.34)     -5.72
A10         -691.67 (-687.62)      -894.68 (-893.78)       -4.95
A22         -619.77 (-634.34)      -1002.76 (-977.38)      -11.81
A22E        -619.79 (-634.66)      -1001.07 (-976.38)      -9.82

a Energies in kcal/mol.
b Values for the dimer; values for two protomers in parentheses.

TABLE V^a

Structure   ΔG_bind(∞)^b   ΔG_bind(pH_opt)   pH_opt^c   ΔG_bind(pH = 0)^d
1fod        -5.72          -6.33             8.2        82.62
A10         -4.95          -6.23             7.7        89.56
A22         -11.81         -13.16            7.5        87.01
A22E        -9.82          -10.84            7.7        89.88

a Energies in kcal/mol.
b Binding free energy of unprotonated state (see Table IV).
c pH of lowest (optimum) binding free energy.
d Binding free energy at pH = 0.
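The ΔG_bind(∞) column of Table IV appears to follow from the other two columns if the binding free energy is read as the total electrostatic free energy (solvation plus Coulomb) of the dimer minus that of the two protomers. The short Python sketch below applies this reading to the 1fod and A22E rows as transcribed above; the function name is ours and purely illustrative.

```python
def binding_free_energy(solv_dimer, solv_protomers, coul_dimer, coul_protomers):
    """Electrostatic binding free energy (kcal/mol) of the unprotonated state.

    Assumption: binding energy = (solvation + Coulomb) of the dimer
    minus (solvation + Coulomb) of the two separated protomers.
    """
    d_solvation = solv_dimer - solv_protomers
    d_coulomb = coul_dimer - coul_protomers
    return d_solvation + d_coulomb


# Values copied from Table IV (dimer first, two protomers second).
dg_1fod = binding_free_energy(-697.01, -694.94, -1049.99, -1046.34)
dg_a22e = binding_free_energy(-619.79, -634.66, -1001.07, -976.38)

print(f"1fod: {dg_1fod:.2f} kcal/mol")
print(f"A22E: {dg_a22e:.2f} kcal/mol")
```

In both rows the solvation and Coulomb differences partly cancel, which is consistent with the observation that the binding energies are far more similar than the individual terms.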
2. Relative Electrostatic Free Energy of Binding

Because of the difficulty of comparing stability curves between different serotypes or mutant structures in terms of absolute ΔG_bind, we focus on the changes in electrostatic stability as a function of pH. In the following, we use the electrostatic binding free energy relative to the minimum of ΔG(pH). Formally, this means that we redefine ΔΔG(pH) as ΔG_AB(pH) - ΔG_AB(pH_opt), where ΔG_AB is the absolute electrostatic binding free energy according to Eq. (19) and pH_opt is the pH of optimum binding (minimum of the electrostatic binding free energy). This is equivalent to using pH_opt as the reference pH in the integral expression for ΔΔG in Eq. (20) instead of pH = ∞, i.e.,

\[
\Delta\Delta G_{\mathrm{bind}}(\mathrm{pH}) = -(\ln 10)\,RT \int_{\mathrm{pH}}^{\mathrm{pH_{opt}}} \left( \langle n(\mathrm{pH}') \rangle_A - \langle n(\mathrm{pH}') \rangle_B \right) d\mathrm{pH}' \qquad (41)
\]
where ⟨n⟩_A is the titration curve of the dimer and ⟨n⟩_B is the titration curve of the dissociated monomers according to Eq. (39). The calculated relative stability curve for O1BFS is shown in Fig. 12. The minimum is at pH 8.1; the capsid becomes destabilized at lower and higher pH values. When the pH is lowered to 6.5 there is a destabilization of about 2 kcal/mol. The destabilization that occurs when the pH is raised beyond 8.1 is less important. This is due in part to the fact that we did not include Arg, Cys, Lys, or Tyr residues in the calculation. For comparison, Fig. 12 also shows the stability curve after including these residues. Since there is no difference for pH < 8.5 and we are interested primarily in the stability around pH 7, we leave out the Arg, Cys, Lys, and Tyr residues in the following. This greatly reduces the time required for the computation (180 vs. 380 titratable sites in the dimer).
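Equation (41) can be evaluated numerically once the two titration curves are tabulated on a common pH grid. The sketch below is only an illustration, not the procedure used for the capsid: it assumes a temperature of 298.15 K, uses trapezoidal quadrature, and is fed two hypothetical titration curves built from Henderson-Hasselbalch terms with invented pK_a values. Because the pH derivative of the relative binding free energy is proportional to the difference of the two titration curves, the optimum pH is the point where the curves cross.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def relative_stability(ph, n_dimer, n_protomers, temperature=298.15):
    """Return (ddG, pH_opt): Eq. (41) on the grid `ph`, in kcal/mol, with min(ddG) = 0."""
    ph = np.asarray(ph, dtype=float)
    diff = np.asarray(n_dimer, dtype=float) - np.asarray(n_protomers, dtype=float)
    # Cumulative trapezoidal integral of <n>_A - <n>_B from ph[0] to every grid point.
    steps = 0.5 * (diff[1:] + diff[:-1]) * np.diff(ph)
    cumulative = np.concatenate(([0.0], np.cumsum(steps)))
    # dG/dpH = (ln 10) RT (<n>_A - <n>_B), so G equals this integral up to a constant.
    g = np.log(10.0) * R_KCAL * temperature * cumulative
    i_opt = int(np.argmin(g))
    return g - g[i_opt], float(ph[i_opt])


def hh(ph, pka):
    """Henderson-Hasselbalch protonation of one independent site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical two-site system (pK_a values invented for illustration only).
ph_grid = np.linspace(4.0, 12.0, 801)
n_dimer = hh(ph_grid, 6.3) + hh(ph_grid, 10.0)
n_protomers = hh(ph_grid, 7.5) + hh(ph_grid, 9.0)

ddg, ph_opt = relative_stability(ph_grid, n_dimer, n_protomers)
print(f"pH_opt = {ph_opt:.2f}")
print(f"ddG at pH 6.5 = {np.interp(6.5, ph_grid, ddg):.2f} kcal/mol")
```

Shifting the curve so that its minimum is zero is what makes stability curves of different serotypes or mutants comparable without reference to the absolute binding free energy.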
FIG. 12. Relative stability, Eq. (41), of the O1BFS dimer as a function of pH. The thick stability curve results from the calculation with only Asp, Glu, His, N-terminus, and C-terminus as titratable sites. For comparison, the relative pH-stability with Cys, Lys, and Tyr included as titratable sites is also shown (thin line).
The shape of the stability curve results from differences in titration behavior of the two states of the model system, the dimer and the two separate protomers [see Eq. (41)]. To analyze the contribution from individual sites to the stability, we consider their protonation (titration) curves in the dimerized and dissociated protomer states. The pH dependence of the contribution from site i to the stability of the dimer can be calculated using Eq. (41), with ⟨n⟩_A - ⟨n⟩_B replaced by Δ⟨s_i⟩ = ⟨s_i⟩_A - ⟨s_i⟩_B, the change in the average protonation of the site upon formation of the complex. From Eq. (41), it follows that the lowering of the pH from pH_opt to pH_1 < pH_opt leads to an increase of the relative free energy (destabilization) of the complex originating from those sites for which Δ⟨s_i⟩ < 0 in the interval (pH_1, pH_opt). Thus, the term “destabilizing” in this work refers to sites that are responsible for a loss of stability of the complex upon variation of the pH, specifically the change from pH_opt to acid pH conditions, the region of primary experimental interest.
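For a site that titrates independently of all others, the single-site version of Eq. (41) has a closed form, because the integral of a Henderson-Hasselbalch curve is a logarithm. The Python sketch below is an idealization under stated assumptions: independent sites, a temperature of 298.15 K, and example pK_a values of 6.30 (dimer) and 7.52 (protomer), which are the values quoted later in the text for H141. The calculated titration curves of that residue are reported to be nonsigmoidal, so the number printed here only indicates the sign and rough size of the effect.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def site_contribution(ph, ph_opt, pka_dimer, pka_protomer, temperature=298.15):
    """Contribution (kcal/mol) of one independent site to the relative stability, Eq. (41)."""

    def protonation_integral(pka):
        # Closed form of the integral of 1 / (1 + 10**(pH' - pKa)) from ph to ph_opt.
        return math.log10(1.0 + 10.0 ** (pka - ph)) - math.log10(1.0 + 10.0 ** (pka - ph_opt))

    # Integral of <s_i>_A - <s_i>_B over the interval (ph, ph_opt).
    delta = protonation_integral(pka_dimer) - protonation_integral(pka_protomer)
    return -math.log(10.0) * R_KCAL * temperature * delta


# A site whose pK_a is lower in the dimer than in the protomer (destabilizing on acidification).
h141_like = site_contribution(ph=6.5, ph_opt=8.1, pka_dimer=6.30, pka_protomer=7.52)
print(f"{h141_like:+.2f} kcal/mol")
```

A positive result means that the site destabilizes the complex when the pH is lowered from pH_opt, in line with the definition of a destabilizing site given above; a site with equal pK_a values in both states contributes nothing.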
3. Important Histidines

Inspection of the titration curves of the individual sites shows that the residues with a titration behavior that differs between the complex and the unbound protomer in the acid region are found at or near the protomer-protomer interface (which corresponds to the interpentamer interface in the capsid). These residues are, therefore, predicted to be responsible for the acid lability of the capsid. Figure 13 shows the pK_a shift between the dimer and the protomer for the titrating residues as a function of distance from the interface. Within 10 Å from the interface, there are 10 Asp, 11 Glu, and 8 His residues per protomer. For most acidic residues (open symbols in Fig. 13), the pK_a's are lower in the dimer than in the protomer. The desolvation of the anionic sites (Asp, Glu) upon dimerization destabilizes their charged state and would by itself lead to a positive pK_a shift. The fact that a negative pK_a shift is observed for most acidic residues indicates that there are interactions with polar groups and with cationic sites (Arg and Lys were not allowed to titrate) that favor the ionized form. From an inspection of site-site distances, there is a salt bridge formed across the pentamer-pentamer interface for only two Glu residues (E213 in VP2; E146 in VP3); the salt bridges involve R60 (VP2) for E213 and K63 (VP2) for E146. A number of the Asp and Glu residues had very different pK_a values between the protomer and the dimer, but none of these occur in the pH interval (pH 6-7) of interest; they are all in the pH range between 0 and 4.5.

FIG. 13. Differences in pK_a's between sites in the dimer and the protomer (pK_a^dimer - pK_a^protomer) vs. the distance of the sites to the protomer interface, where the distance is defined as the minimum distance between ionized atoms in the titratable site (e.g., Oδ1 or Oδ2 in Asp) to any non-hydrogen atom in the neighboring protomer. Residue types: Asp, Glu, His, N-terminus, and C-terminus, each plotted with its own symbol (open symbols for the acidic residues). A negative ΔpK_a implies a destabilization of the dimer when the pH is lowered from pH_opt to acid pH conditions [see Eq. (41) and related text].

Three His residues are primarily responsible for the features of the capsid stability curve around pH = 7.5. H141 and H144 (VP3) have a reduced pK_a in the dimer as compared with the protomer (6.30 in the dimer vs. 7.52 in the protomer for H141, and 4.13 vs. 7.49 for H144), causing a significant destabilization at pH < 8 (Fig. 14). H87 (VP2) is also destabilizing, but only below pH = 6. In contrast, H21 (VP2) has a stabilizing effect as the pH is lowered. However, the contribution from H21 is more than compensated by the loss in stability of the complex that is associated with H141 and H144, which results in the net increase of ΔΔG_bind when the pH is lowered from pH_opt = 8.1 toward acid pH (Fig. 12). The importance of H141 is in accord with the analysis of Curry et al. (1995) and Twomey et al. (1995) (H141 is labeled H142 in these references). H144 was also suggested by Twomey et al. (1995) as being one of the two possible destabilizing residues at the surface. The pK_intr values (see Section II,D) of H141 and H144 are remarkably low, 0.90 and 0.76, respectively. This is caused by strong interactions with adjacent Lys and Arg residues, which are formally counted as background charges and not as titrating residues in this work. From a detailed analysis of the energetic contributions to the pK_a shift, it was found that both residues experience a relatively small desolvation effect in the dimer (-0.68 and -0.84 pH units for H141 and H144, respectively) but that the interaction with polar and cationic atom groups is large (-4.84 and -4.83 pH units, respectively). The complex nonsigmoidal titration curves of H141 and H144 (see Fig. 14) are caused by interaction with H87 (VP2), which has a surprisingly low calculated pK_a of 1.93. H65 (VP2) was also suggested by Curry et al. (1995) as being possibly destabilizing. However, we did not find a significant contribution from this residue.

FIG. 14. Titration curves of the individual sites H21 (VP2), H141 (VP3), and H144 (VP3) for the O1BFS virus capsid structure. Sites in the dimer, solid lines; sites in the individual protomer, dashed lines.

We calculated stability curves for O1BFS with three different point mutations: H65F (VP2), H141L, and H144L (both VP3). Structures of the mutated residues were created by conserving the χ1 and χ2 dihedral angles and building the remaining side-chain atoms based on the standard bond lengths and angles of the param22 set of CHARMM (MacKerell et al., 1998). H65 was chosen to demonstrate that a residue with no change in pK_a between protomer and dimer does not affect the pH-dependent relative stability. It was changed to a Phe, the residue that occurs in the same position in FMDV serotypes A10 and A12 (Palmenberg, 1989). H141 was changed to a Leu, the consensus residue for picornaviruses at that position (Palmenberg, 1989). H144 was also changed to a Leu so as to compare it to the H141L mutation. Since mutations affect the titration behavior of other residues, new titration curves were calculated for each mutated virus. The relative stability curves for the three mutations are shown in Fig. 15, in comparison with the relative stability of the wild-type O1BFS sequence. As expected from the previous calculations, the H144L substitution has the largest effect, resulting in a virus dimer that is not significantly destabilized until the pH is below 6 (the relative ΔG_bind of the mutant increases by 2 kcal/mol on lowering of the pH from 8 to 5, as compared with an increase by more than 6 kcal/mol for the wild type; see Fig. 15). The H141L mutation has a similar but smaller effect on stability, whereas the stability of the H65F mutant is only marginally different from that of the wild type.

FIG. 15. Relative dimer stability, Eq. (41), for O1BFS (thick line) and for the mutants H65F (VP2) (thin line), H141L (VP3) (dashed line), and H144L (VP3) (dotted line).

4. Effect of α-Helix on His 141

The results of Section III,C,3 suggest that H141 (VP3) destabilizes the dimer at pH < 8 (see Fig. 14). Curry et al. (1995) proposed that the α-helix formed by residues 89-98 (VP2) is a key factor in this destabilization through an unfavorable electrostatic interaction between the helix dipole and the protonated His. Significant effects of a helix on the stability of barnase have been demonstrated experimentally (Sancho et al., 1992), although the dominant contribution appears to arise from local interactions with the helix terminus rather than the so-called helix dipole (Tidor and Karplus, 1991; Åqvist et al., 1991). We analyzed this hypothesis by calculating the electrostatic interaction between the helix and the protonated and unprotonated H141, respectively. In these calculations, only the H141 partial charges in one protomer and the helix partial charges in the other protomer (related by a twofold symmetry axis) are present, while the charges of all other atoms are set to zero. Titratable side chains of the helix were assigned their standard charge state at pH = 7. We calculated the effect of the helix backbone and the helix side chains separately for the three serotypes O1BFS, A10, and A22. From Table VI, the destabilizing effect of the helix backbone on the protonated state of H141 is on average 0.37 kcal/mol, which corresponds to a ΔpK_a of -0.27. Sitkoff et al. (1994a) reported a calculated ΔpK_a of -0.29 due to the interaction between the backbone of a 21-residue α-helical peptide and a titrating cationic site.
The effect of the helix side chains is smaller: 0.01 kcal/mol for O1BFS and about 0.18 kcal/mol for A10 and A22. The reason for the difference between O1BFS and A10/A22 is that residue 93 (VP2) is a Ser in O1BFS whereas it is a His in A10 and A22. The effect of the H93 side chain in A10/A22 is 0.17 kcal/mol in A10 and 0.18 kcal/mol in A22. The total destabilizing effect on H141 protonation caused by the neighboring protomer was calculated by including all atomic charges of the other protomer (in contrast to the above, where charges were assigned only to atoms in the helix), with the titrating sites in their standard (pH 7) states. Table VI shows that the interaction energy is much larger than for the α-helix alone; on average there is a destabilization of the protonated state of H141 by 2.61 kcal/mol (ΔpK_a of -1.89). The effect of the Asp, Glu, Lys, and Arg residues is shown in the column headed DEKR. Although these ionized residues have a strong effect (0.62 kcal/mol on average), they only contribute about 24% to the total destabilization energy. We calculated the effect of the Glu residues only, which should stabilize the protonated state of H141, and obtained results in agreement with this expectation (column Glu in Table VI). A very strong destabilization of 0.57 kcal/mol was caused by R60 (VP2), whose charged nitrogen atom is separated by only 5.2 Å from the Nδ1 atom of H141. Since the combined contribution of the α-helix and the charged residues is only 0.98 kcal/mol (Table VI), the predominant destabilizing effect is caused by the interaction between H141 and polar groups on the neighboring protomer. In agreement with our previous results, the total destabilization of the complex caused by H144 is comparable to that of H141 (Table VI, last column).

TABLE VI
Electrostatic Effect^a of Protomer 1 on the Protonation of H141 (VP3) in Protomer 2

Structure   helix bb   helix sc   prot 1   DEKR   Glu     R60    H144
1fod        0.39       0.01       2.58     0.58   -0.68   0.59   2.73
A10         0.36       0.17       2.65     0.50   -0.60   0.55   2.72
A22         0.37       0.19       2.61     0.79   -0.64   0.56   3.07

a Energies in kcal/mol. Conversion to ΔpK_a by multiplication with -0.75. Column heads: helix bb/sc, effect of the backbone/side chains of residues 89-98 (VP2); prot 1, all atoms of protomer 1; DEKR, all Asp, Glu, Lys, and Arg of protomer 1 (DEKR refers to their 1-letter symbol); Glu, all Glu of protomer 1; R60, only residue R60 (VP2) charged; H144, effect of protomer 1 on H144 (VP3) in protomer 2.
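The averages and pK_a shifts quoted in this section can be checked with a few lines of arithmetic. The Python sketch below averages the prot 1 and DEKR columns of Table VI and converts a destabilization energy of the protonated state into a pK_a shift using ΔpK_a = -ΔG / (RT ln 10). The temperature of 298.15 K is our assumption; the table footnote quotes a rounded conversion factor instead, so agreement with the quoted shifts is expected only to within a few hundredths of a pK unit.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def delta_pka(delta_g_kcal, temperature=298.15):
    """pK_a shift caused by a destabilization (kcal/mol) of the protonated state."""
    return -delta_g_kcal / (math.log(10.0) * R_KCAL * temperature)


def mean(values):
    return sum(values) / len(values)


# Columns of Table VI for 1fod, A10, and A22 (kcal/mol).
prot1 = [2.58, 2.65, 2.61]
dekr = [0.58, 0.50, 0.79]

avg_prot1 = mean(prot1)
avg_dekr = mean(dekr)
dekr_share = avg_dekr / avg_prot1

print(f"average effect of protomer 1: {avg_prot1:.2f} kcal/mol")
print(f"average effect of charged residues: {avg_dekr:.2f} kcal/mol")
print(f"share of charged residues: {100.0 * dekr_share:.0f}%")
print(f"pK_a shift, helix backbone (0.37 kcal/mol): {delta_pka(0.37):.2f}")
print(f"pK_a shift, whole protomer: {delta_pka(avg_prot1):.2f}")
```

The point of the comparison is the ratio: the charged residues account for roughly a quarter of the effect of the whole neighboring protomer, which leaves most of the destabilization to polar groups.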
5. Conclusions

In this work, the pH stability of the FMDV capsid is calculated based on the linked function theory, i.e., by integration of the titration curves of the individual protomers and the dimer. This approach includes consideration of all charge states of the titratable sites. It is thus expected to be more accurate than calculations where only specific charge states of titrating residues are included, as, e.g., in the work of Warwicker (1992). The calculations on the pH stability of the FMDV capsid permit us to make a detailed analysis of which residues are responsible for the instability of the capsid in the pH range 5-6. It was shown that the importance of a given residue depends strongly on its distance from the protomer-protomer interface. Not surprisingly, the calculated effects near pH 7 are dominated by the titration behavior of histidine residues. The two His residues that contribute most to the destabilization at low pH are H141 and H144 in the VP3 capsid protein [H142 and H145 in Curry et al. (1995) and Twomey et al. (1995)]. They are located at the pentamer-pentamer interface. H87 (VP2) is also destabilizing, but only below pH 6. H141 and H144 were identified by Curry et al. (1995) and Twomey et al. (1995) as two likely candidates for destabilization. Curry et al. (1995) based their identification of H141 on the presence of an α-helix in the neighboring pentamer (and on conservation of this His, which is unique to FMDV in the picornavirus family). We find that the helix effect is not the major cause for the pK_a shift of H141, since it accounts for only 20% of the effect of the entire neighboring protomer (Table VI). Twomey et al. (1995) counted the number of charged residues on the neighboring pentamer within 10 Å of any His residue and concluded that H141 and H144 could be destabilizing, since in their environments the positively charged residues outnumbered the negatively charged ones. We find that the combined effect of all charged residues (Asp, Glu, Arg, Lys) accounts for only 20-25% of the effect on H141 (Table VI); i.e., polar residues also play an important role. Thus, the structure-based analysis agrees qualitatively with the computational results. However, it is more reliable to base such conclusions on titration curve calculations, which now can be obtained relatively easily for systems as large as protomer dimers. Warwicker (1989, 1992) studied the higher acid stability of poliovirus as compared with HRV. He calculated pK_a shifts of H109 (VP2) and H150 (VP3) at the pentamer-pentamer interface, assuming standard protonation states of other titratable residues (Warwicker, 1992). H150 (VP3) in HRV corresponds to H144 in FMDV, one of the residues addressed in this study. From his calculations, the protonation of the histidines does not account for the destabilization of HRV below pH 5.5.
The differential stabilities of HRV and poliovirus were explained by proposing that a β-sheet extension adjacent to the His residues is structurally more acid stable in poliovirus than in HRV, and prevents hydrogen ions from binding to the His residues in poliovirus.

By means of an equilibrium thermodynamic model of virus capsid association, Zlotnick (1994) found that small energy differences for one contact can cause significant changes in virus stability because of the large number of units involved. In his model, every pentamer-pentamer contact contributes a fixed amount of free energy to the total system. Statistical factors were also taken into account, such as the number of ways of forming and dissociating a particular assembly. He found with this model an exponential free energy dependence of the equilibrium on the assembly concentration. Since the calculations for the mutants H141L and H144L in O1BFS showed a marked stability increase at pH values below 7 (Fig. 15) for one pentamer-pentamer contact, they could have an important effect on overall virus stability. Consequently, these mutants are highly interesting candidates for stability measurements.
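The amplification argument can be made concrete with a deliberately simplified calculation. The Python sketch below is not Zlotnick's model: it ignores the statistical factors mentioned above and only assumes that every contact contributes the same free energy, that a capsid built from 12 pentamers has 30 pentamer-pentamer contacts (each pentamer touching five neighbors), and that the temperature is 298.15 K. The per-contact change of -0.5 kcal/mol is an invented example value.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol K)


def association_ratio(delta_per_contact, n_contacts, temperature=298.15):
    """Factor by which an overall association constant changes when each of
    n_contacts identical contacts changes by delta_per_contact (kcal/mol,
    negative = more stable)."""
    total_change = n_contacts * delta_per_contact
    return math.exp(-total_change / (R_KCAL * temperature))


one_contact = association_ratio(-0.5, 1)
whole_capsid = association_ratio(-0.5, 30)

print(f"one contact:  factor {one_contact:.2f}")
print(f"30 contacts:  factor {whole_capsid:.2e}")
```

Because the per-contact changes add in the exponent, the capsid-level factor is the single-contact factor raised to the number of contacts, which is why a change of a few kcal/mol per interface can matter for the assembled particle.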
IV. OUTLOOK

In the applications of the linked function theory to the pH stability of lysozyme and of the capsid of FMDV, good agreement between calculation and experiment has been demonstrated. In particular, there is quantitative agreement between the calculated and the experimental loss in lysozyme stability when the pH is lowered from the physiological pH 7 to acid pH conditions (pH ~2). In the application to FMDV, the calculations made possible the identification of specific titratable sites that are responsible for the acid lability of the capsid, which is assumed to play an important role for the infectivity of the virus.

The theory of protein titration and pH stability outlined in this work takes full account of the protonation equilibria of titratable sites as well as their mutual interactions. The interactions between ionizable groups contribute to the shift of pK_a's as a consequence of conformational change, e.g., the binding of a substrate. At the same time, the interactions lead, in general, to a departure of the titration curves of individual sites from that of a single moiety (the ideal Henderson-Hasselbalch titration curve; see Fig. 3). If the deviation of the site titration curves from the ideal is neglected, the free energy difference between conformations can be evaluated on the basis of the pK_a's alone, using the independent sites model developed in Section II,C. However, to take full account of the interaction between ionizable groups, free energy changes must be derived from the generating functions for the conformations (see Section II,A). For systems with many (>30) titrating sites, the computational intractability of the generating function approach can be circumvented by first calculating the titration curve and then integrating the titration curve, e.g., starting from pH = ∞ as the reference where the electrostatic free energy is given by the free energy of the fully unprotonated state of the system. The underlying proportionality between the derivative of the free energy and the average degree of protonation (titration curve) follows directly from the binding polynomial as defined in the theory of linked functions.

In future work, it will be necessary to include multiple conformations in pK_a and pH-stability calculations, such as those reviewed here for lysozyme and the capsid of the FMDV. First, the application to lysozyme has revealed a strong conformation dependence of the contributions to the electrostatic free energy from the interaction of polar groups. To reduce this conformation dependence, it will be necessary to incorporate an average over an ensemble of structures in the methodology, or to develop methods for selecting a conformation that is representative of the ensemble at thermal equilibrium. Second, the calculations on the lysozyme pK_a's with different values of the protein dielectric constant, which give best results for ε_i = 20, indicate that it is necessary to account for the relaxation of the protein permanent dipoles due to changes in the ionization state of the protein. Since the polar environment around ionizable groups in proteins is inhomogeneous, an explicit treatment of the polarization effects is expected to lead to a better agreement between calculated and experimental pK_a's than that obtained by the present approach, which assumes that the microscopic polarizability corresponds to a macroscopic dielectric constant of ε_i = 20 everywhere.

Multiple conformations can be incorporated at different levels in the theory of protein titration. A simple approach would be the use of average structures, e.g., from a molecular dynamics simulation; it would correspond to the idea of selecting a single conformation to represent the ensemble at thermal equilibrium. A more rigorous treatment of multiple conformations would involve the double summation over all conformations and charge states in the generating function, Eq. (8), similar to the approaches that have been developed to treat multiple ligands or allosteric effects in the theory of linked functions (Wyman and Gill, 1990). In implementing the different methods to include conformational averages, one important criterion will be to maintain computational feasibility, even for systems with many (several hundred) titratable sites, while achieving a high degree of accuracy.
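The relation between the generating function and the titration curve discussed in this section can be illustrated for a small system by explicit enumeration. The Python sketch below assumes a standard interacting-sites energy function (intrinsic pK values plus pairwise interaction energies between protonated sites, in units of RT); this form is our assumption and is not copied from Eq. (8). All parameter values are invented. The sum runs over 2^N protonation states, which is what makes the direct approach impractical beyond a few tens of sites.

```python
import itertools
import math

LN10 = math.log(10.0)


def generating_function(ph, pk_intr, w):
    """Sum Boltzmann weights over all protonation states of N interacting sites.

    pk_intr: intrinsic pK of each site (hypothetical values in the example).
    w[i][j]: interaction between protonated sites i and j, in units of RT.
    Returns (Z, mean number of bound protons).
    """
    n_sites = len(pk_intr)
    z = 0.0
    protons = 0.0
    for state in itertools.product((0, 1), repeat=n_sites):
        # Free energy of this state in units of RT, relative to the unprotonated state.
        g = sum(x * LN10 * (ph - pk) for x, pk in zip(state, pk_intr))
        g += sum(w[i][j] * state[i] * state[j]
                 for i in range(n_sites) for j in range(i + 1, n_sites))
        weight = math.exp(-g)
        z += weight
        protons += sum(state) * weight
    return z, protons / z


# Hypothetical three-site system with repulsive interactions.
pk_example = [6.4, 7.5, 4.0]
w_example = [[0.0, 1.5, 0.5],
             [1.5, 0.0, 2.0],
             [0.5, 2.0, 0.0]]

z7, n7 = generating_function(7.0, pk_example, w_example)
print(f"mean protonation at pH 7: {n7:.3f}")
```

With this convention the free energy in units of RT is -ln Z, and its derivative with respect to pH equals (ln 10) times the mean protonation, which is the proportionality that allows the free energy to be recovered by integrating the titration curve.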
ACKNOWLEDGMENT

This work was supported in part by a grant from the National Institutes of Health. M.S. is supported by a fellowship within the Biophysics Program of the European Community. The authors thank Michael Engels for many helpful discussions. The two applications were based on an article on lysozyme in the Journal of Physical Chemistry (Schaefer et al., 1997) and an article on the foot-and-mouth disease virus capsid in the Journal of Molecular Biology (van Vlijmen et al., 1998).
REFERENCES

Adair, G. S. (1925). J. Biol. Chem. 63, 529-545.
Antosiewicz, J., McCammon, J. A., and Gilson, M. (1994). J. Mol. Biol. 238, 415-436.
Åqvist, J., Luecke, H., Quiocho, F. A., and Warshel, A. (1991). Proc. Natl. Acad. Sci. U.S.A. 88, 2026-2030.
Ascenzi, P., Coletta, M., Amiconi, G., Bolognesi, M., Menegatti, E., and Guarneri, M. (1990). Biol. Chem. Hoppe-Seyler 371, 389-393.
Bartik, K., Redfield, C., and Dobson, C. M. (1994). Biophys. J. 66, 1180-1184.
Bashford, D., and Karplus, M. (1990). Biochemistry 29, 10219-10225.
Bashford, D., Case, D. A., Dalvit, C., Tennant, L., and Wright, P. E. (1993). Biochemistry 32, 8045-8056.
Baxt, B. (1987). Virus Res. 7, 257-271.
Bernstein, F. C., Koetzle, T. F., Williams, G. J. B., Meyer, E. F., Jr., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. (1977). J. Mol. Biol. 112, 535-542.
Beroza, P., and Fredkin, D. R. (1996). J. Comput. Chem. 17, 1229-1244.
Beroza, P., Fredkin, D. R., Okamura, M. Y., and Feher, G. (1991). Proc. Natl. Acad. Sci. U.S.A. 88, 5804-5808.
Brocklehurst, K. (1994). Protein Eng. 7, 291-299.
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., and Karplus, M. (1983). J. Comput. Chem. 4, 187-217.
Brown, F., and Cartwright, B. (1961). Nature (London) 192, 1163-1164.
Brünger, A. T., and Karplus, M. (1988). Proteins: Struct., Funct., Genet. 4, 148-156.
Burroughs, J. N., Rowlands, D. J., Sangar, D. V., Talbot, P., and Brown, F. (1971). J. Gen. Virol. 13, 73-84.
Caflisch, A., and Karplus, M. (1994). Proc. Natl. Acad. Sci. U.S.A. 91, 1746-1750.
Carrillo, E. C., Giachetti, C., and Campos, R. H. (1984). Virology 135, 542-545.
Carrillo, E. C., Giachetti, C., and Campos, R. H. (1985). Virology 147, 118-125.
Casale, E., Collyer, C., Ascenzi, P., Balliano, G., Milla, P., Viola, F., Fasano, M., Menegatti, E., and Bolognesi, M. (1995). Biophys. Chem. 54, 75-81.
Curry, S., Abrams, C. C., Fry, E., Crowther, J. C., Belsham, G. J., Stuart, D. I., and King, A. M. (1995). J. Virol. 69, 430-438.
Curry, S., Fry, E., Blakemore, W., Abu-Ghazaleh, R., Jackson, T., King, A., Lea, S., Newman, J., Rowlands, D., and Stuart, D. (1996). Structure 4, 135-145.
Davis, M. E., and McCammon, J. A. (1990). Chem. Rev. 90, 509-521.
Davis, M. E., Madura, J. D., Luty, B. A., and McCammon, J. A. (1991). Comput. Phys. Commun. 62, 187-197.
Demchuk, E., and Wade, R. (1996). J. Phys. Chem. 100, 17373-17378.
Fersht, A. (1985). “Enzyme Structure and Mechanism,” 2nd ed. Freeman, New York.
Fersht, A. R., and Sternberg, M. J. E. (1989). Protein Eng. 2, 527-530.
Fersht, A. R., Itzhaki, L. S., ElMasry, N. F., Matthews, J. M., and Otzen, D. E. (1994). Proc. Natl. Acad. Sci. U.S.A. 91, 10426-10429.
Fiebig, K. M., Schwalbe, H., Buck, M., Smith, L. J., and Dobson, C. M. (1996). J. Phys. Chem. 100, 2661-2666.
Gilson, M. K. (1993). Proteins: Struct., Funct., Genet. 15, 266-282.
Gilson, M. K., and Honig, B. H. (1986). Biopolymers 25, 2097-2119.
Gilson, M. K., Sharp, K. A., and Honig, B. H. (1988). J. Comput. Chem. 9, 327-335.
Hünenberger, P. H., Mark, A. E., and van Gunsteren, W. F. (1995). Proteins: Struct., Funct., Genet. 21, 196-213.
Imoto, T. (1983). Biophys. J. 44, 293-298.
Jean-Charles, A., Nicholls, A., Sharp, K., Honig, B., Tempczyk, A., Hendrickson, T. F., and Still, W. C. (1991). J. Am. Chem. Soc. 113, 1454-1455.
Kirkwood, J. G. (1934). J. Chem. Phys. 2, 351-361.
Knowles, J. (1976). CRC Crit. Rev. Biochem. 4, 165-173.
Korant, B. D., Lonberg-Holm, K., Noble, J., and Stasny, J. T. (1972). Virology 48, 71-86.
Kuramitsu, S., and Hamaguchi, K. (1980). J. Biochem. (Tokyo) 87, 1215-1219.
Lazaridis, T., Archontis, G., and Karplus, M. (1995). Adv. Protein Chem. 47, 231-306.
Lehninger, A. L., Nelson, D. L., and Cox, M. M. (1993). “Principles of Biochemistry.” Worth Publishers, New York.
Luty, B. A., Davis, M. E., and McCammon, J. A. (1992). J. Comput. Chem. 13, 768-771.
Maccallum, P. H., Poet, R., and Milner-White, E. J. (1995). J. Mol. Biol. 248, 374-384.
MacKerell, A. D., Jr., et al. (1998). J. Phys. Chem. (in press).
MacKerell, A. D., Jr., et al. (1992). FASEB J. 6, A143.
MacKerell, A. D., Jr., Sommer, M. S., and Karplus, M. (1995). J. Mol. Biol. 247, 774-807.
Mak, T. W., O’Callaghan, D. J., and Colter, J. S. (1970). Virology 40, 565-571.
Malcolm, B. A., Wilson, K. P., Matthews, B. W., Kirsch, J. F., and Wilson, A. C. (1990). Nature (London) 345, 86-89.
Mason, P. W., Baxt, B., Brown, F., Harber, J., Murdin, A., and Wimmer, E. (1993). Virology 192, 568-577.
Matthew, J. B., Gurd, F. R. N., Garcia-Moreno, E. B., Flanagan, M. A., March, K. L., and Shire, S. J. (1985). CRC Crit. Rev. Biochem. 18, 91-197.
Mehler, E. L., and Eichele, G. (1984). Biochemistry 23, 3887-3891.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). J. Chem. Phys. 21, 1087-1092.
Nozaki, Y., and Tanford, C. (1967). In “Methods in Enzymology” (C. H. W. Hirs, ed.), Vol. 11, pp. 715-734. Academic Press, New York.
Oliveberg, M., Arcus, V. L., and Fersht, A. R. (1995). Biochemistry 34, 9424-9433.
Palmenberg, A. (1989). In “Molecular Aspects of Picornavirus Infection and Detection” (B. L. Semler and E. Ehrenfeld, eds.), pp. 211-241. Am. Soc. Microbiol., Washington, DC.
Perez, L., and Carrasco, L. (1993). J. Virol. 67, 4543-4548.
Pfeil, W., and Privalov, P. L. (1976). Biophys. Chem. 4, 41-50.
Privalov, P. L. (1992). In “Protein Folding” (T. E. Creighton, ed.), pp. 91-97. Freeman, New York.
Ptitsyn, O. B. (1992). In “Protein Folding” (T. E. Creighton, ed.), pp. 243-300. Freeman, New York.
Ramanadham, M., Sieker, L. C., and Jensen, L. H. (1981). Acta Crystallogr. C37, 33.
Randrup, A. (1954). Acta Pathol. Microbiol. Scand. 34, 366-374.
Rico, M., Santoro, J., Gonzalez, C., Bruix, M., and Neira, J. L. (1991). In “Structure, Mechanism and Function of Ribonucleases” (C. M. Cuchillo, R. de Llorens, M. V. Nogués, and X. Pares, eds.), pp. 9-14. Department de Bioquimica i Biologia Molecular and Institut de Biologia Fondamental, Universitat Autònoma de Barcelona, Bellaterra, Spain.
Roxby, R., and Tanford, C. (1971). Biochemistry 10, 3348-3352.
Rueckert, R. R. (1990). In “Virology” (B. N. Fields and D. M. Knipe, eds.), 2nd ed., Vol. 1, pp. 507-548. Raven Press, New York.
Russell, S. T., and Warshel, A. (1985). J. Mol. Biol. 185, 389-404.
Sancho, J., Serrano, L., and Fersht, A. R. (1992). Biochemistry 31, 2253-2258.
Schaefer, M., Sommer, M., and Karplus, M. (1997). J. Phys. Chem. 101, 1663-1683.
Schellman, J. A. (1975). Biopolymers 14, 999-1018.
Schulz, G. E., and Schirmer, R. H. (1978). “Principles of Protein Structure.” Springer-Verlag, Heidelberg.
Sham, Y. Y., Chu, Z. T., and Warshel, A. (1997). J. Phys. Chem. B 101, 4458-4472.
Sharp, K. A., and Honig, B. (1990). Annu. Rev. Biophys. Biophys. Chem. 19, 301-332.
Sitkoff, D., Lockhart, D. J., Sharp, K. A., and Honig, B. (1994a). Biophys. J. 67, 2251-2260.
Sitkoff, D., Sharp, K. A., and Honig, B. (1994b). J. Phys. Chem. 98, 1978-1988.
States, D. J., and Karplus, M. (1987). J. Mol. Biol. 197, 111-130.
Szabo, A., and Karplus, M. (1972). J. Mol. Biol. 72, 163-197.
Tanford, C. (1957). J. Am. Chem. Soc. 79, 5340-5347.
Tanford, C., and Kirkwood, J. G. (1957). J. Am. Chem. Soc. 79, 5333-5339.
Tanford, C., and Roxby, R. (1972). Biochemistry 11, 2192-2198.
Tanokura, M. (1983). Biochim. Biophys. Acta 742, 576-585.
Tidor, B., and Karplus, M. (1991). Biochemistry 30, 3217-3228.
Twomey, T., France, L. L., Hassard, S., Burrage, T. G., Newman, J. F., and Brown, F. (1995). Virology 206, 69-75.
van Vlijmen, H. W. T., Curry, S., Schaefer, M., and Karplus, M. (1998). J. Mol. Biol. 275, 295-308.
Warshel, A. (1978). Proc. Natl. Acad. Sci. U.S.A. 75, 5250-5254.
Warshel, A. (1979). Photochem. Photobiol. 30, 285-290.
Warshel, A. (1987). Nature (London) 330, 15-16.
Warshel, A., and Levitt, M. (1976). J. Mol. Biol. 103, 227-249.
Warshel, A., Russell, S. T., and Churg, A. K. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 4785-4789.
Warshel, A., Sussman, F., and King, G. (1986). Biochemistry 25, 8368-8372.
Warshel, A., Naray-Szabo, G., Sussman, F., and Hwang, J.-K. (1989). Biochemistry 28, 3629-3637.
Warwicker, J. (1989). FEBS Lett. 257, 403-407.
Warwicker, J. (1992). J. Mol. Biol. 223, 247-257.
Warwicker, J., and Watson, H. C. (1982). J. Mol. Biol. 157, 671-679.
Wyman, J. (1948). Adv. Protein Chem. 4, 407-531.
Wyman, J. (1964). Adv. Protein Chem. 19, 223-286.
Wyman, J., and Gill, S. J. (1990). “Binding and Linkage.” University Science Books, Mill Valley, CA.
Yang, A.-S., and Honig, B. (1993). J. Mol. Biol. 231, 459-474.
Yang, A.-S., Gunner, M. R., Sampogna, R., Sharp, K., and Honig, B. (1993). Proteins: Struct., Funct., Genet. 15, 252-265.
You, T. J., and Bashford, D. (1995). Biophys. J. 69, 1721-1733.
Zauhar, R. J., and Morgan, R. S. (1985). J. Mol. Biol. 186, 815-820.
Zhou, F., Windemuth, A., and Schulten, K. (1993). Biochemistry 32, 2291-2306.
Zlotnick, A. (1994). J. Mol. Biol. 241, 59-67.
SITE-SPECIFIC ANALYSIS OF MUTATIONAL EFFECTS IN PROTEINS

By ENRICO DI CERA

Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110
I. Introduction
II. The Reference Cycle
III. Structural Mapping of Energetics
    A. Site-Specific Structural Perturbations
    B. Linkage between Site-Specific Structural Perturbations and Energetics
    C. Limits of Single-Site Ala Scans
IV. Site-Specific Analysis of Mutational Effects in Proteins
    A. Double-Mutant Cycles
    B. Site-Specific Transition Modes
    C. Properties of the Coupling Free Energy
V. Site-Specific Dissection of Thrombin Specificity
    A. Substrate Recognition by Serine Proteases
    B. Thrombin Structure and Function
    C. Site-Specific Probes
    D. Perturbation at the P1-P3 Sites
    E. Why Is the Fast Form More Specific?
    F. Allosteric Mechanism for High-Order Coupling
VI. Concluding Remarks
References
I. INTRODUCTION

In a seminal paper published 50 years ago in this review series, Jeffries Wyman (1948) introduced the basic idea of linked functions as a general property of any macromolecular system capable of different functions. Wyman pointed out that “whenever a molecule possesses two or more different functions . . . belonging to nearby groups in the molecule, there is the likelihood of an interdependence of the functions due to interaction between the groups.” Two functions coexisting in the same protein may exert reciprocal effects on each other. Wyman proved that if binding of a ligand X influences binding of a ligand Y, binding of Y must influence binding of X, and to an extent that can be predicted exactly from the effect of X on Y. Through this reciprocity principle, which is the biochemical counterpart of the analogous Maxwell’s principle involving physical quantities, Wyman brought the rigor of the Gibbs approach (Gibbs, 1875) to the analysis of protein-ligand interactions
and revealed how thermodynamics could be exploited to infer information on structural properties of the system (Wyman, 1948, 1964). With structural biology still in its infancy, Wyman’s approach inevitably emphasized the functional aspects of protein energetics. This has led to the unfortunate notion in some circles that Wyman’s linkage thermodynamics is a phenomenological theory not concerned with structure and therefore bearing marginal importance in structure-function studies. Since its inception, however, linkage thermodynamics sought to understand the structural origin of effects reflected in the energetic properties of a system. When the principle of linked functions was first enunciated, Wyman was seeking a molecular explanation for the effect of pH on oxygen binding to hemoglobin, the physiological Bohr effect. He hoped to identify structural determinants involved in the proton ionization reactions linked to oxygen binding from the observed linkage between oxygenation and proton release, and the criterion of proximity of the linked groups. This might have enabled a better understanding of the mechanism of heme-heme communication leading to cooperative oxygen binding. Linkage thermodynamics was therefore conceived as a conceptual and methodological tool to extract structural information from energetics. This theory has enjoyed myriad applications in nearly all areas of protein and nucleic acids physical chemistry (Edsall and Wyman, 1958; Wyman and Gill, 1990) and set the conceptual framework for the development of allosteric theory (Monod et al., 1963, 1965; Koshland et al., 1966). The extraordinary developments of structural biology and recombinant DNA technology have created the conditions enabling experimentalists to effectively tackle the problem of how energetics and regulatory interactions are encoded into structure.
This is an appropriate time to look at Wyman’s theory of linked functions with renewed interest and show how it can contribute to our understanding of structure-function correlations when cast into the more general framework of site-specific thermodynamics (Di Cera, 1995). In this chapter we illustrate how Wyman’s theory of linked functions can be extended to the site-specific analysis of mutational effects in proteins that are at the basis of many current studies of protein folding, enzyme catalysis, and molecular recognition. The site-specific analysis exploits the ability of recombinant DNA technology to perturb macromolecular systems at the level of single amino acids to obtain information on how individual residues contribute to protein stability and ligand recognition (Di Cera, 1995). Site-specific linkage thermodynamics is concerned with local effects and how they add up to generate the global behavior of the system that constitutes the focus of Wyman’s classical approach (Wyman, 1948, 1964; Wyman
and Gill, 1990). This general theory of cooperativity is aimed at developing a mechanistic and structure-based understanding of linked effects from the model-independent analysis of experimental data. It represents a modern incarnation of Wyman’s original idea of inferring structural information on the system from the properties of its linked groups. Throughout this chapter we will make an effort to illustrate the essential aspects of the approach by emphasizing the concepts and making essentially no use of the elaborate mathematical formalism detailed elsewhere (Di Cera, 1995). The goal is to make our discussion readily accessible to experimentalists who would benefit the most from the ideas and methods illustrated here.

II. THE REFERENCE CYCLE

Wyman’s principle of linked functions is best illustrated by a reference cycle (Di Cera, 1995) in which the contribution of specific sites is explicitly taken into account. Many properties of the system can be revealed by direct inspection of the cycle and especially those pertaining to events that arise locally at each site. The reference cycle can be depicted as follows:
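\[
\begin{array}{ccc}
M_{00} & \overset{{}^{0}\Delta G_i}{\rightleftharpoons} & M_{10} \\[4pt]
{\scriptstyle {}^{0}\Delta G_j}\,\updownarrow & & \updownarrow\,{\scriptstyle {}^{1}\Delta G_j} \\[4pt]
M_{01} & \underset{{}^{1}\Delta G_i}{\rightleftharpoons} & M_{11}
\end{array}
\tag{1}
\]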
Here M represents the macromolecule subject to reversible transformations due to two distinct processes taking place at sites i and j. It is not necessary to specify the nature of these processes, other than to point out that they induce a transition between two states, 0 and 1. A number of biologically relevant processes are amenable to this description. In ligand-binding processes a given site of the macromolecule switches from the free (0) to the bound (1) state. In helix-coil transitions and protein folding the site is a given residue in the coil (0) or helical (1) state, or in the unfolded (0) or folded (1) configuration. Analogous considerations apply to mutational effects, where each site is a residue that can exist in the wild-type (0) or mutated (1) state. The general applicability of the cycle in Eq. (1) enables the analysis of processes of widely different nature. Ligand binding or folding at a given site can be studied in terms of the effect of binding or folding at
a second site, or as a function of mutations introduced at other sites. The $\Delta G$'s in Eq. (1) reflect the energetic balance of the transformations in the cycle that generate the four possible intermediates. Depending on the nature of the process under consideration, these terms may refer to binding free energies, free energies of unfolding, or perturbation free energies due to site-directed mutations. The reference cycle is the elementary conceptual block for understanding the mutual interference of two processes. Combination of reference cycles produces more complex patterns of linkage that are, however, subject to the same restrictions as the elementary cycle in Eq. (1). The cycle contains four species. Species $M_{00}$ represents the configuration of the macromolecule where the sites are both in state 0. If site i is changed to state 1, there is a free energy change ${}^{0}\Delta G_i$ associated with the reversible 0 → 1 transition that transforms $M_{00}$ into $M_{10}$. The superscript 0 indicates that the transition takes place with site j in state 0. Likewise, when site j is changed to state 1, there is a free energy change ${}^{0}\Delta G_j$ associated with the reversible 0 → 1 transition that transforms $M_{00}$ into $M_{01}$. The superscript 0 indicates that the transition takes place with site i in state 0. Each of the above transformations can occur when the other site is in state 1. The free energy change for the reversible 0 → 1 transition of site i when site j is in state 1 is ${}^{1}\Delta G_i$, and likewise ${}^{1}\Delta G_j$ gives the analogous free energy change for site j when site i is in state 1. The essence of linked functions stems from the dependence of the free energy change for the 0 → 1 transition of a given site on the state of the other site. Linkage exists when ${}^{1}\Delta G_i \neq {}^{0}\Delta G_i$. The effect is reciprocal because of the conservation of free energy in the reference cycle. In going from $M_{00}$ to $M_{11}$, the free energy change is the same along the $M_{00} \to M_{10} \to M_{11}$ or the $M_{00} \to M_{01} \to M_{11}$ pathways.
Hence ${}^{1}\Delta G_j + {}^{0}\Delta G_i = {}^{1}\Delta G_i + {}^{0}\Delta G_j$ and necessarily ${}^{1}\Delta G_j \neq {}^{0}\Delta G_j$ if ${}^{1}\Delta G_i \neq {}^{0}\Delta G_i$. So, if site j affects site i, then site i must affect site j. Of the four transitions in the cycle, only three are independent, and the free energy difference between the vertical or horizontal transitions is exactly the same. This quantity is the coupling free energy $\Delta G_c = {}^{1}\Delta G_i - {}^{0}\Delta G_i = {}^{1}\Delta G_j - {}^{0}\Delta G_j$ (Weber, 1975; Jencks, 1981) and measures the interdependence of the effects. If site j affects site i to a certain extent $\Delta G_c$, then site i must affect site j to the same extent. The value of $\Delta G_c$ indicates whether the two sites are linked ($\Delta G_c \neq 0$) or not ($\Delta G_c = 0$) and whether the linkage is positive ($\Delta G_c < 0$) or negative ($\Delta G_c > 0$). In the former case the two sites enhance each other in the process under consideration; in the latter they oppose each other. It should be pointed out that the coupling free energy in Eq. (1) is the same as the free energy for the dismutation reaction $M_{10} + M_{01} =$
$M_{00} + M_{11}$ where the two species $M_{00}$ and $M_{11}$ are generated from the parent species $M_{10}$ and $M_{01}$. This is because each intermediate can be assigned a free energy $G$ to define the $\Delta G$'s in the cycle as ${}^{0}\Delta G_i = G_{10} - G_{00}$, ${}^{0}\Delta G_j = G_{01} - G_{00}$, ${}^{1}\Delta G_i = G_{11} - G_{01}$, ${}^{1}\Delta G_j = G_{11} - G_{10}$, and the coupling free energy can also be written as $\Delta G_c = G_{11} + G_{00} - G_{10} - G_{01}$, which is in fact the free energy change for the dismutation $M_{10} + M_{01} = M_{00} + M_{11}$. Measurements of the coupling free energy require characterization of only two parallel reactions of the cycle in Eq. (1). This has the practical advantage that information on one process can be obtained indirectly from the cycle by studying another process linked to it. Hence, if the process of interest is not amenable to direct experimental analysis, a suitable linked process can be identified and studied in its place. Implementation of this principle through site-directed mutations that can be introduced in nearly every system enables the indirect study of processes like conformational and folding transitions that may be difficult to follow directly. In general, mutational effects can be exploited in the analysis of protein energetics using the linkage principles embodied by the cycle in Eq. (1), as illustrated below.
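The bookkeeping of the reference cycle can be made concrete with a short numerical sketch. The Python fragment below is only an illustration under stated assumptions: the class name `ReferenceCycle`, its method names, and the four free energy values are hypothetical and do not come from the text. It assigns a free energy to each of the four species, computes the coupling free energy along both routes, and checks that the two agree with the dismutation expression.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceCycle:
    """Free energies (kcal/mol) of the four species of the reference cycle.

    The first index is the state of site i, the second the state of site j,
    so g10 is the species with site i in state 1 and site j in state 0.
    """
    g00: float
    g10: float
    g01: float
    g11: float

    def dg_i(self, j_state: int) -> float:
        """Free energy of the 0 -> 1 transition of site i with site j fixed."""
        return self.g11 - self.g01 if j_state == 1 else self.g10 - self.g00

    def dg_j(self, i_state: int) -> float:
        """Free energy of the 0 -> 1 transition of site j with site i fixed."""
        return self.g11 - self.g10 if i_state == 1 else self.g01 - self.g00

    def coupling(self) -> float:
        """Coupling free energy, i.e. the dismutation M10 + M01 = M00 + M11."""
        return self.g11 + self.g00 - self.g10 - self.g01


# Hypothetical numbers chosen only for illustration.
cycle = ReferenceCycle(g00=0.0, g10=-5.0, g01=-3.0, g11=-9.5)

via_site_i = cycle.dg_i(1) - cycle.dg_i(0)
via_site_j = cycle.dg_j(1) - cycle.dg_j(0)

# Reciprocity: both routes give the same coupling free energy.
assert math.isclose(via_site_i, via_site_j)
assert math.isclose(via_site_i, cycle.coupling())
print(f"coupling free energy = {cycle.coupling():+.1f} kcal/mol")
```

With these assumed values the coupling free energy is negative, which in the sign convention used above corresponds to positive linkage: the two sites enhance each other.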
III. STRUCTURAL MAPPING OF ENERGETICS
A. Site-Specific Structural Perturbations

Current studies of protein structure and function emphasize the role played by specific residues in the observed effects. Some studies focus on the domains of the protein involved in recognition of a specific ligand or substrate. Identification of a structural epitope then opens the question of how specific residues contribute energetically to the binding event in the ground or transition state. Assessment of the structural boundaries defining an epitope and the energetic contribution of each residue to binding is instrumental to the subsequent development of small molecules that can either reproduce or inhibit the effects elicited by larger physiological ligands. Such studies benefit a basic understanding of the rules for ligand recognition, as well as more prosaic aspects related to drug design. This approach involves the combination of physical chemistry methods with X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and, above all, recombinant DNA technology, through which the effects of specific substitutions on stability and recognition can be studied. Residues in a protein can be replaced by any of the 20 natural amino acids using site-directed mutagenesis (M. Smith,
1985). More recently, this strategy has been extended to unnatural amino acids and the ability to manipulate protein structure and function has been greatly expanded (Judice et al., 1993; Cornish and Schultz, 1994). Other powerful approaches include site-directed isotope labeling of specific residues (Sonar et al., 1994; Spudich, 1994). A typical course of action in current studies of structure-function correlations starts with the identification of important contacts from the crystal structure. In the analysis of protein stability particular attention is devoted to residues buried in the interior of the protein and arranged in hydrophobic cores (Matthews, 1993; Yu et al., 1995; Shortle, 1996). Attention is also paid to residues engaged in ionic interactions (Meeker et al., 1996), especially if they are screened from the solvent (Milla et al., 1994; Garcia-Moreno et al., 1997). In the analysis of molecular recognition of ligands and substrates obvious targets for mutagenesis are identified from residues involved in polar and hydrophobic interactions in the bound complex (Clackson and Wells, 1995; Castro and Anderson, 1996). In the absence of structural information on the bound protein, general criteria like solvent accessibility can guide the mutagenesis screen (Tsiang et al., 1995; Dickinson et al., 1996). To a first approximation, residues that are freely accessible on the surface of the protein are optimal candidates for functional epitopes. After a set of suitable targets is identified, perturbations are introduced in the form of site-directed mutations, usually Ala substitutions. The rationale behind Ala-scanning mutagenesis is that all interactions of a side chain except for the $C_\beta$ are eliminated (Lau and Fersht, 1987; Cunningham and Wells, 1989). The contribution of the deleted groups relative to the methyl moiety of Ala is assessed from the difference between the properties of the wild type relative to the Ala mutant.
Free energies of binding in the ground or transition state or free energies of unfolding are used to quantify the effect of the Ala substitution at any given site. For this strategy to be effective, it is necessary that the Ala substitution eliminates interactions without introducing new properties. In principle, this should be the case for almost all amino acids except Gly, for which the Ala substitution can introduce new apolar interactions. In addition, for Gly and Pro the Ala substitution can perturb the protein backbone, which becomes less flexible (Gly → Ala substitution) or less rigid (Pro → Ala substitution). Ala-scanning mutagenesis has found myriad applications in the identification and energetic characterization of structural epitopes recognizing specific ligands (Tsiang et al., 1995; Dickinson et al., 1996), or the structural determinants of protein stability (Green et al., 1992; Horovitz and Fersht, 1992; Fersht and Serrano, 1993; Matthews, 1993; Milla et al., 1994; Yu
et al., 1995), enzyme mechanism (Carter and Wells, 1988), and specificity (Dang et al., 1997a; Vindigni et al., 1997a).
Using Ala scans, Clackson and Wells (1995) have identified a hot spot of binding energy in the human growth hormone receptor recognizing the hormone. A small core of residues contributing to binding contains mostly hydrophobic residues and is surrounded by a region of polar and charged residues that participate in recognition to a lesser extent. This finding has potentially important consequences on drug design. In fact, if the functional epitope recognizing a ligand involves only a small number of residues clustered in space, then it becomes possible to mimic the action of large molecules with smaller ones that specifically target the hot spot of binding free energy and elicit the same biological response (Wells, 1996). Such a scenario has been shown to be realistic in the case of the receptor for the cytokine erythropoietin (Wrighton et al., 1996; Livnah et al., 1996). There are cases where the organization of the functional epitope is more complex. Mutagenesis studies of the thrombin-hirudin interaction show that the binding free energy is not localized at preferred regions, or hot spots, but rather is delocalized over the entire surface of recognition and involves many hydrophobic, polar, and charged residues (Wallace et al., 1989; Betz et al., 1991). Dickinson et al. (1996) have used Ala scans of 112 residues of coagulation factor VIIa in an attempt to identify the functional epitope for binding of tissue factor. They find that the residues most important for tissue factor binding do not cluster to define a hot spot but are interspersed among residues that are not important or only marginally so for binding. These findings underscore the difficulty of mimicking hirudin with smaller molecules and make it unlikely that a small analog of the tissue factor will be developed. Another important application of Ala scans has been provided by Tsiang et al. 
(1995). They have used Ala scans of charged surface residues of thrombin to identify regions that preferentially recognize fibrinogen or protein C in an attempt to dissociate the procoagulant and anticoagulant activities of the enzyme. Their studies have led to the identification of a residue, E217, whose mutation significantly affects fibrinogen binding without compromising recognition of the anticoagulant protein C (Gibbs et al., 1995). Mutation of E217 generates thrombin derivatives with potential therapeutic application as anticoagulants (Tsiang et al., 1996). The success of these experiments shows the great virtue of Ala-scanning mutagenesis and reveals the plasticity of protein-protein interfaces and the different strategies used by different systems to achieve specificity. However, one should not overlook the fact that these impressive applications of recombinant DNA technology to the study of protein structure
and function are modern incarnations of a principle formulated in 1895 by the Austrian chemist R. Wegscheider. He was the first to realize that information on individual sites of a molecule could be extracted by studying the effect of structural perturbations on the energetics (Wegscheider, 1895). He successfully used alkyl derivatives of polybasic acids to infer the pK's of individual titrating groups. Elegant applications of this approach have been offered by Neuberger (1936) and Edsall (Edsall and Blanchard, 1933) and have contributed enormously to our understanding of site-specific energetics and to subsequent theoretical (Hill, 1944, 1985; Di Cera, 1995) and computational (Bashford and Karplus, 1991; Gilson, 1993; Yang et al., 1993) analysis of the ionization properties of proteins. Strategies similar to Wegscheider's original approach have been applied to the dissection of the energetics of enzyme-substrate complexes by substituting H for functional groups in the substrate like CH3 (Holler et al., 1973), NH2 (Blanquet et al., 1975), OH (Stubbe and Abeles, 1980), and COOH (Evans and Polanyi, 1936). These strategies can be considered as the immediate predecessors of Ala-scanning mutagenesis studies through which similar perturbations are introduced in a protein.
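B. Linkage between Site-Specific Structural Perturbations and Energetics

Once structural perturbations are introduced in the system, it becomes necessary to determine the energetic consequences of the substitutions made. The correct approach is based on the definition of a reference cycle analogous to Eq. (1) where the perturbation is coupled to the process of interest. In the study of protein stability the cycle is

\[
\begin{array}{ccc}
U_{\mathrm{wt}} & \overset{\Delta G_{\mathrm{wt}}}{\rightleftharpoons} & F_{\mathrm{wt}} \\[4pt]
{\scriptstyle {}^{0}\Delta G_j}\,\updownarrow & & \updownarrow\,{\scriptstyle {}^{1}\Delta G_j} \\[4pt]
U_{\mathrm{mut}} & \underset{\Delta G_{\mathrm{mut}}}{\rightleftharpoons} & F_{\mathrm{mut}}
\end{array}
\tag{2}
\]

where U and F denote the unfolded and folded states of the wild-type (wt) and mutant (mut) protein.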
The 0 → 1 transition at site i [see Eq. (1)] is replaced by the folding of the protein, and ${}^{0}\Delta G_i$ is the same as the free energy of folding of the wild type, $\Delta G_{\mathrm{wt}}$. The same process can be studied after a mutation has been introduced in the system at site j to measure $\Delta G_{\mathrm{mut}}$. The difference $\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{wt}}$ quantifies the effect of the mutation on the stability of the protein (Horovitz and Fersht, 1992; Matthews, 1993; Shortle, 1996). We see from the cycles in Eqs. (1) and (2) that this difference
is the same as the coupling free energy of the cycle, which measures the linkage between the stability of the protein and the mutation. When $\Delta G_c = \Delta\Delta G$ is positive, the mutation reduces stability. Enhanced stability is reflected by a negative value of $\Delta G_c$, and no effect is seen for $\Delta G_c = 0$. The equivalence between $\Delta\Delta G$ and $\Delta G_c$ in the cycle in Eq. (2) puts some important restrictions on the interpretation of experimental data. One of these restrictions is that the effect of the mutation cannot be attributed entirely to the folded state of the protein, as often assumed (Matthews, 1993). In fact, $\Delta G_c$ measures the difference in free energy between the folded mutant and wild type relative to the same difference in the unfolded state. To assign $\Delta G_c$ entirely as ${}^{1}\Delta G_j$, the difference in free energy of stability between the mutant and wild type in the folded state, one must assume that the free energy of the unfolded state is not affected by the mutation. Under this assumption, stability measurements can be used to derive information on the structure of the folded protein by utilizing the cycle in Eq. (2) and its properties. However, there is overwhelming experimental evidence that this assumption on the unfolded state may be flawed. In a series of careful studies, Shortle and colleagues have demonstrated that mutations of staphylococcal nuclease affect to a large extent the unfolded state of the protein (Shortle et al., 1990; Green et al., 1992; Meeker et al., 1996; Shortle, 1996). The same conclusion applies to thioredoxin (Lin and Kim, 1991). In this case, the interpretation of $\Delta G_c$ values in structural terms becomes problematic and the straightforward connection between stability and structural perturbations, which is claimed to exist in lysozyme (Matthews, 1993), remains to be demonstrated. An even more serious caveat in studies of protein stability comes from the definition of the unfolded state.
In principle, this is an idealized state fully hydrated and devoid of interactions among the protein residues. In practice, this state is approximated by the denatured state of the protein that depends on the method used for denaturation (Makhatadze and Privalov, 1995). Reversible thermal denaturation is the most rigorous way to access the properties of the denatured state and is not equivalent to methods using denaturants like urea and GdnHCl. These molecules specifically associate with the denatured state (Schellman, 1990), changing its chemical potential and therefore producing an ensemble of states energetically distinct from the thermally denatured state of the protein. For example, a mutation that opposes interaction of the denatured state with urea produces a $\Delta G_c < 0$ conducive to a stabilization of the folded state. However, there is no a priori expectation for the same mutation to produce an analogous effect on stability in a thermal denaturation experiment. Similar complications arise in the study of ligand recognition. In this case, a cycle analogous to that in Eq. (2) can be constructed to study
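the effect of site-directed mutations on binding of a ligand (L) in the ground or transition state as follows:

\[
\begin{array}{ccc}
M_{\mathrm{wt}} + L & \overset{\Delta G_{\mathrm{wt}}}{\rightleftharpoons} & M_{\mathrm{wt}}L \\[4pt]
{\scriptstyle {}^{0}\Delta G_j}\,\updownarrow & & \updownarrow\,{\scriptstyle {}^{1}\Delta G_j} \\[4pt]
M_{\mathrm{mut}} + L & \underset{\Delta G_{\mathrm{mut}}}{\rightleftharpoons} & M_{\mathrm{mut}}L
\end{array}
\tag{3}
\]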
Here ${}^{0}\Delta G_i$ measures the free energy of binding L to the wild-type macromolecule, $\Delta G_{\mathrm{wt}}$. The same process in the mutant gives $\Delta G_{\mathrm{mut}}$, and the difference $\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{wt}} = \Delta G_c$ is a measure of the effect of the site-directed mutation on the binding process (Carter et al., 1984; Wells, 1990). As for protein stability, this difference is the coupling free energy of the cycle and measures the linkage between the binding of L and the mutation. The same cycle applies to binding in the transition state, where the free energy is directly related to the specificity constant $s = k_{\mathrm{cat}}/K_m$ and $\Delta\Delta G = RT \ln(s_{\mathrm{wt}}/s_{\mathrm{mut}}) = \Delta G_c$. When $\Delta G_c > 0$, the mutation reduces specificity, whereas enhanced specificity is reflected by $\Delta G_c < 0$ and no effect is seen for $\Delta G_c = 0$. As for protein stability, the effect on the coupling free energy cannot be assigned unambiguously as an effect of the mutation on the bound complex ML, as is commonly done when Ala scans are used to identify structural epitopes. In fact, $\Delta G_c$ measures the difference ${}^{1}\Delta G_j - {}^{0}\Delta G_j$ and reflects the perturbation introduced by the mutation on the ML complex relative to the free macromolecule M. If a mutation has $\Delta G_c > 0$ and destabilizes the binding of L, the effect is not necessarily due to destabilization of the complex ML and hence entirely to ${}^{1}\Delta G_j$. A mutation that stabilizes the free form of the macromolecule (${}^{0}\Delta G_j < 0$) and has no effect on the bound form (${}^{1}\Delta G_j = 0$) also gives $\Delta G_c > 0$ and can be confused with a mutation that directly affects recognition of the ligand. In this case, the residue mutated is mistakenly associated with the epitope recognizing the ligand L, although it plays no role in the binding event. A value of $\Delta G_c > 0$ only means that the effect of the mutation has reduced the stability of the complex more than that of the free form. Assignment of the perturbation to the bound complex strictly requires experimental demonstration that the free form of the macromolecule is not affected by the mutation (${}^{0}\Delta G_j = 0$).
In the absence of this information, interpretation of the results may be problematic and must rely on
other criteria like the spatial proximity of residues affecting ligand binding or the involvement of these residues in ligand recognition based on structural information. Only when the Ala substitution does not alter the properties of the unfolded state or removes contacts important for interaction with the ligand can maps of the regions involved in stability and ligand recognition be constructed from the effect of the mutation on $\Delta G_c$ in Eqs. (2) and (3).
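The ambiguity discussed in this section can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the function names, the specificity constants, and the perturbation free energies are all hypothetical. It converts a ratio of specificity constants into a coupling free energy using the relation RT ln(s_wt/s_mut), and then shows that a mutation destabilizing the complex and a mutation stabilizing the free form can give exactly the same observable.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def ddg_from_specificity(s_wt: float, s_mut: float, temp_k: float = 298.15) -> float:
    """Return RT ln(s_wt / s_mut) in kcal/mol; positive means lost specificity."""
    if s_wt <= 0 or s_mut <= 0:
        raise ValueError("specificity constants must be positive")
    return R_KCAL * temp_k * math.log(s_wt / s_mut)


def coupling(dg_free: float, dg_bound: float) -> float:
    """Coupling free energy: perturbation of the bound form minus that of the free form."""
    return dg_bound - dg_free


# A hypothetical tenfold drop in k_cat/K_m at 25 C.
ddg = ddg_from_specificity(s_wt=1.0e7, s_mut=1.0e6)

# Two hypothetical mutations with the same observable but different origins.
destabilizes_complex = coupling(dg_free=0.0, dg_bound=1.4)
stabilizes_free_form = coupling(dg_free=-1.4, dg_bound=0.0)

print(f"ddG from specificity: {ddg:.2f} kcal/mol")
print("same coupling free energy:", destabilizes_complex == stabilizes_free_form)
```

Both hypothetical mutations yield the same positive coupling free energy, so the measurement alone cannot tell whether the residue contacts the ligand or merely stabilizes the free macromolecule.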
C. Limits of Single-Site Ala Scans

The approach based on Ala scans is in principle very powerful and informative. However, in addition to the complications just dealt with in Section III,B, this approach has a serious limitation that must be kept in mind in practical applications. The limitation is that single-site Ala replacements neglect a priori the contribution of possible site-site interactions to protein stability and ligand recognition. Results from the limited number of studies where this problem has been addressed experimentally have fostered the general notion that residues tend to participate independently in stability and recognition (Sandberg and Terwilliger, 1989; Shirley et al., 1989; Wells, 1990) and that interactions only occur among residues close in space (Carter et al., 1984; Carter and Wells, 1988; Wells, 1990; Horovitz and Fersht, 1992; Mildvan et al., 1992). However, a number of instances have recently been reported of interactions among residues that can be as far as 30 Å away from each other (Shortle and Meeker, 1986; Perry et al., 1989; Howell et al., 1990; LiCata et al., 1990; Scrutton et al., 1990; Green and Shortle, 1993; Jackson and Fersht, 1993; Robinson and Sligar, 1993; LiCata and Ackers, 1995). Hence, the role of interactions in mutational effects in proteins should not be underestimated. There is good reason to believe that interactions are present in nearly every system and provide the most important ingredient to protein stability and ligand recognition, as the following argument demonstrates. A functional epitope for binding or stability composed exclusively of independent residues can be identified solely by single-site Ala substitutions. In this case, the residues of the epitope contribute to the energetics in an additive manner.
The key assumption of Ala-scanning mutagenesis is that the Ala replacement has the only effect of eliminating the interactions of the side chain beyond the $C_\beta$ (Lau and Fersht, 1987; Cunningham and Wells, 1989). If this assumption is correct and the Ala replacement is an unbiased probe of the energetic contribution of a given residue to binding, then the Ala mutation at any position of the epitope should convert the free energy contribution to zero. Hence, the sum of the
free energy losses over all sites in the epitope, with changed sign, should be close to the actual free energy of binding or stability measured experimentally for the wild type. Inspection of the results in Table I for some paradigmatic examples shows that this is far from being the case. A large discrepancy exists between the calculated and experimentally determined values. In the case of human growth hormone binding to its receptor or granulocyte colony stimulating factor binding to its receptor, the binding affinity calculated from the results of the Ala scan is greatly overestimated, and so is the stability of Arc repressor and staphylococcal nuclease. In the case of BPTI binding to trypsin, tissue factor binding to coagulation factor VIIa, or linolenate binding to intestinal fatty acid binding protein, the binding affinity is grossly underestimated. When the affinity is underestimated, it may be argued that the functional epitope might have been incompletely characterized, thereby missing important interactions. This can hardly be the case for the interaction of tissue factor with VIIa, where 112 residues were targeted by mutagenesis (Dickinson et al., 1996), or intestinal fatty acid binding protein, where 23 important residues in the binding cavity were replaced (Richieri et al., 1997). On the other hand, when the affinity is overestimated, it may be argued that the functional epitope might have included sites of marginal importance. Again, this can hardly be the case in the interaction of human growth hormone with its receptor where the functional epitope is a small hot spot (Clackson and Wells, 1995), or for Arc repressor where the calculated value of stability was taken from the sum of only 11 out of 52 mutated residues (Milla et al., 1994), or else for staphylococcal nuclease where only the effect of Ala replacements of 14 large hydrophobic side chains was considered (Shortle et al., 1990).
The results of intestinal fatty acid binding protein are particularly instructive insofar as they show that the discrepancy between calculated and experimentally determined values depends on the particular ligand examined. The difference changes from -11.5 kcal/mol for linolenate to 1.4 kcal/mol for stearate. Given the comparable size of the fatty acids listed in Table I and their comparable binding affinity, this large difference cannot be due to intrinsic properties of the ligand. Rather, it suggests the presence of communication among the protein residues that is sensitive to the particular ligand bound. It may seem paradoxical that an epitope containing all residues replaced by Ala should bind a ligand with a $\Delta G = 0$, regardless of the system studied, if the residues are truly independent. A binding free energy of zero means that the ligand experiences no net energetic change in going from the free to the bound state and that the all-Ala binding epitope is energetically neutral. Similar arguments apply for
TABLE I
Comparison of Free Energy Values (in kcal/mol) for Stability and Ligand Recognition Measured Experimentally and Calculated from Single-Site Ala Scans^a

| System | Process | Ala replacements | $\Delta G_{calc}$ | $\Delta G_{exp}$ | $\Delta\Delta G$ | Reference |
| Arc repressor | Unfolding | 51^b | 58.2 | 13.8 | -44.4 | Milla et al. (1994) |
| Staphylococcal nuclease | Unfolding | 14^c | 39.1 | 5.5 | -33.6 | Shortle et al. (1990) |
| hGH-hGHbp^d | Binding | 30 | -25.9 | -12.3 | 13.6 | Clackson and Wells (1995) |
| BPTI-chymotrypsin^e | Binding | 15 | -6.4 | -10.7 | -4.3 | Castro and Anderson (1996) |
| VIIa-TF^f | Binding | 112 | -9.7 | -15.4 | -5.7 | Dickinson et al. (1996) |
| I-FABP^g (palmitate) | Binding | 23 | -6.8 | -10.9 | -4.1 | Richieri et al. (1997) |
| I-FABP^g (stearate) | Binding | 23 | -13.1 | -11.7 | 1.4 | Richieri et al. (1997) |
| I-FABP^g (oleate) | Binding | 23 | -8.5 | -10.7 | -2.2 | Richieri et al. (1997) |
| I-FABP^g (linoleate) | Binding | 23 | -5.4 | -10.0 | -4.6 | Richieri et al. (1997) |
| I-FABP^g (linolenate) | Binding | 23 | 2.4 | -9.1 | -11.5 | Richieri et al. (1997) |
| I-FABP^g (arachidonate) | Binding | 23 | -3.6 | -9.5 | -5.9 | Richieri et al. (1997) |
| GCSF-GCSF receptor^h | Binding | 27 | -14.5 | -11.3 | 3.2 | Young et al. (1997) |

a $\Delta G_{calc}$ was calculated as the sum of the individual free energy perturbations, with changed sign, due to Ala replacement at individual sites. $\Delta\Delta G$ measures the difference between the experimentally determined free energy for the process taking place in the wild type, $\Delta G_{exp}$, and $\Delta G_{calc}$.
b Only the Ala replacements of residues W14, N29, M 1, S32, E36, R40, S44, K47, E48, and R50 forming hydrogen bonds and ion-pairs protected from the solvent were included in the calculations.
c Only the Ala replacements of large hydrophobic residues were included in the calculations.
d Human growth hormone (hGH) binding to the extracellular domain of its first bound receptor (hGHbp).
e Bovine pancreatic trypsin inhibitor (BPTI).
f Tissue factor (TF) binding to coagulation factor VIIa.
g Intestinal fatty acid binding protein (I-FABP).
h Granulocyte colony stimulating factor (GCSF).
protein stability. Although this scenario is hypothetical, its validity within reasonable energetic terms is key to the approach based on Ala scans. If the large discrepancy in Table I is the result of specific favorable or unfavorable contributions to stability and recognition introduced by the presence of Ala at any given site, the assignment of epitopes with Ala-scanning mutagenesis becomes context dependent and questionable. It is possible that Ala replacements may introduce additional properties at the site of mutation and that these properties may bias the energetic balance of the substitution. However, this bias is likely to be small. It is more reasonable to conclude that the large discrepancy documented in Table I underscores a more serious and general problem, i.e., the neglect of energetic contributions arising from possible site-site interactions that cannot be quantified by single-site Ala scans. In the case of protein stability, the presence of interactions is the obvious result of the highly cooperative nature of the folding process (Privalov, 1979; Creighton, 1990; Dill, 1990). In the case of ligand binding, the presence of cooperativity in the recognition event may be the signature of some general rules through which biological specificity is encoded into the structure of a protein. When interactions among residues are present in the system, the analysis must be cast in terms of double, triple, and higher order perturbations. The presence of interactions obviously invalidates the energetic assignments derived from single-site Ala scans, because the contribution of a given residue to stability or ligand binding will depend on the state (wild type or mutated) of other residues. The extent to which interactions affect the assignments based on single-site Ala scans must be evaluated in each case and clearly complicates the identification of epitopes such as those reported for the proteins listed in Table I.
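The discrepancy discussed above is simple arithmetic on the entries of Table I. The following minimal Python sketch recomputes it as the experimental value minus the calculated value, following the table footnote. The numbers are copied from Table I; the names `TABLE_I`, `discrepancy`, and `ddg` are illustrative and not part of the original analysis.

```python
# Values (kcal/mol) copied from Table I: (system, dG_calc, dG_exp).
TABLE_I = [
    ("Arc repressor", 58.2, 13.8),
    ("Staphylococcal nuclease", 39.1, 5.5),
    ("hGH-hGHbp", -25.9, -12.3),
    ("BPTI-chymotrypsin", -6.4, -10.7),
    ("VIIa-TF", -9.7, -15.4),
    ("I-FABP (palmitate)", -6.8, -10.9),
    ("I-FABP (stearate)", -13.1, -11.7),
    ("I-FABP (oleate)", -8.5, -10.7),
    ("I-FABP (linoleate)", -5.4, -10.0),
    ("I-FABP (linolenate)", 2.4, -9.1),
    ("I-FABP (arachidonate)", -3.6, -9.5),
    ("GCSF-GCSF receptor", -14.5, -11.3),
]


def discrepancy(dg_calc, dg_exp):
    """Difference between experimental and calculated free energy."""
    return round(dg_exp - dg_calc, 1)


# Map each system to its discrepancy and print a small report.
ddg = {name: discrepancy(calc, exp) for name, calc, exp in TABLE_I}
for name, value in ddg.items():
    print(f"{name:26s} {value:+6.1f}")
```

If the residues were independent, every entry of `ddg` would be close to zero within experimental error.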
The neglect of site-site interactions in single-site Ala-scanning mutagenesis studies of protein structure and function poses challenging tasks from an experimental standpoint, as will be discussed in Section IV, below. In a cooperative process like protein stability or ligand recognition the contribution of a given residue involves effects of multiple order. A first-order contribution comes from contacts made directly with the ligand, or with another residue in the protein. Higher order contributions may come from the coupling between the residue and other structural components. For example, the residue may contribute to recognition of the ligand by forming an ion pair with another residue of the protein. In this case, disrupting the interaction with an Ala substitution perturbs ligand binding but also the ion pair. To dissect the contribution coming from the ion pair, one should know what effect on ligand binding is caused by the Ala replacement of the other residue forming the ion
pair and compare the results with the double Ala replacement at both residues. This requires the construction of the two single mutants and the double mutant. In general, the residue recognizing the ligand may be involved in a number of interactions with other residues, like short-range van der Waals coupling, long-range electrostatic coupling, or even large-scale conformational transitions. In the case of protein folding where the close packing of residues in the hydrophobic core may represent a crucial determinant for stability (Dill, 1990), the contribution of a residue clearly involves strong interactions with other residues and cannot be assessed solely from single-site substitutions. For second-order interactions, double mutations are necessary to assess the energetic contribution to stability and ligand recognition. In the case of higher order interactions, triple and quadruple substitutions become necessary. If an epitope contains N residues, a complete single-site Ala scan requires N mutations and a double-site Ala scan requires N(N - 1)/2 mutations. The problem of correctly assessing the energetic contribution of residues in a functional epitope using site-directed mutational perturbations is therefore a complex one and demands elucidation of site-site coupling patterns. This makes it necessary to develop a systematic theory for the analysis of mutational effects in proteins where the role of interactions is explicitly taken into account.
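To give a sense of how quickly the experimental burden grows, the counts quoted above can be tabulated with a short Python sketch. It uses only the standard binomial coefficient; the function names are illustrative, and the value N = 23 is taken from the number of residues replaced in intestinal fatty acid binding protein (Table I).

```python
from math import comb


def mutants_of_order(n_sites, k):
    """Number of distinct mutants carrying exactly k Ala replacements."""
    return comb(n_sites, k)


def all_mutants(n_sites):
    """Number of mutants of any order (every configuration except wild type)."""
    return sum(comb(n_sites, k) for k in range(1, n_sites + 1))


for n in (5, 10, 23):
    singles = mutants_of_order(n, 1)   # N
    doubles = mutants_of_order(n, 2)   # N(N - 1)/2
    triples = mutants_of_order(n, 3)   # N(N - 1)(N - 2)/6
    print(n, singles, doubles, triples, all_mutants(n))
```

For an epitope of 23 residues, a complete double-site scan already requires 253 constructs in addition to the 23 single-site mutants.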
IV. SITE-SPECIFIC ANALYSIS OF MUTATIONAL EFFECTS IN PROTEINS
A. Double-Mutant Cycles
Consider the general case of a system composed of N sites that can exist in two states, 0 and 1. In what follows we will assume that state 0 is the unperturbed wild-type state of the site, while state 1 is the perturbed or mutated state. The system so constructed has a total of $2^N$ possible configurations, of which $2^N - 1$ are independent if the wild-type configuration is taken as reference and used to scale all others energetically. It is convenient to associate each configuration with an N-dimensional vector of binary digits, 0 and 1, that depict the states of a site. The vector labels are arranged in a preassigned order, usually lexicographic. For example, the vector [0011] labels one of the 16 possible configurations of a system containing four sites (N = 4), where sites 1 and 2 are in the wild-type state and sites 3 and 4 are mutated. In general, a given configuration can be represented with the vector $[\alpha\beta\ldots\omega]$, where $\alpha, \beta, \ldots, \omega = 0, 1$. The first index refers to site 1, the second index to site 2, and so on, up to the $\omega$ index for site N.
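The binary labeling of configurations can be illustrated with a few lines of Python. This is only a sketch of the bookkeeping, with hypothetical function names; it generates the labels in lexicographic order and recovers the mutated sites from a label such as [0011].

```python
from itertools import product


def configurations(n_sites):
    """All 2**N binary labels in lexicographic order (0 = wild type, 1 = mutated)."""
    return ["".join(map(str, bits)) for bits in product((0, 1), repeat=n_sites)]


def mutated_sites(label):
    """1-based indices of the sites that are in state 1."""
    return [index + 1 for index, state in enumerate(label) if state == "1"]


configs = configurations(4)
print(len(configs))             # number of configurations for N = 4
print(configs[:4])              # first labels in lexicographic order
print(mutated_sites("0011"))    # sites 3 and 4 are mutated
```

The first label is always the wild-type configuration, which serves as the energetic reference for the remaining $2^N - 1$ labels.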
Let $\Delta G_j$ be defined as the free energy change associated with the 0 → 1 transition at site j when all other sites are in state 0. This term is the difference in free energy between the configuration with site j perturbed and the wild type, resulting in the loss ($\Delta G_j > 0$) or gain ($\Delta G_j < 0$) of specificity or stability due to the first-order perturbation of that site. There are N such terms to be taken into account, one for each site. Consider, then, the double perturbation at sites i and j. The free energy change for such perturbation can be written as the sum $\Delta G_i + \Delta G_j + \Delta G_{ij}$, where $\Delta G_{ij}$ is the interaction free energy between sites i and j when the perturbation is applied at both sites; $\Delta G_{ij}$ is the same as the coupling free energy in the thermodynamic cycle:

$$
\begin{array}{ccc}
M & \xrightarrow{\;\Delta G_i\;} & M^{*} \\
{\scriptstyle \Delta G_j}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \Delta G_j + \Delta G_{ij}} \\
M^{\otimes} & \xrightarrow{\;\Delta G_i + \Delta G_{ij}\;} & M^{*\otimes}
\end{array}
\tag{4}
$$

where $*$ and $\otimes$ denote the two different perturbations at sites i and j. A negative value of $\Delta G_{ij}$ indicates positive coupling between the perturbations at sites i and j in enhancing specificity or stability, or negative coupling in reducing it, and vice versa for a positive value. A value of $\Delta G_{ij} = 0$ indicates the absence of coupling between the perturbations at sites i and j; $\Delta G_{ij}$ is also the free energy change for the dismutation reaction $M^{*} + M^{\otimes} = M + M^{*\otimes}$. Since there are N sites, there are N(N - 1)/2 independent second-order coupling free energies. Information on these terms is necessary to assess the extent of interactions between pairs of sites. Interactions that involve higher order coupling among three, four, up to N perturbations can be analyzed in a similar manner. For example, the triple perturbation at sites i, j, and l can be written as $\Delta G_i + \Delta G_j + \Delta G_l + \Delta G_{ijl}$, where $\Delta G_{ijl}$ is the free energy of the dismutation $M^{*} + M^{\otimes} + M^{\dagger} = 2M + M^{*\otimes\dagger}$ and defines one of the N(N - 1)(N - 2)/6 independent third-order coupling free energy values of the system. In general, any one of the $N!/[(N - k)!\,k!]$ kth-order coupling free energies can be associated with the free energy change of the dismutation where the k singly perturbed configurations generate the configuration with k perturbed sites and k - 1 copies of the unperturbed configuration. Together, the perturbation free energies $\Delta G_j$ and the coupling free energies up to Nth order give a total of $2^N - 1$ independent terms, equal to the number of independent configurations in the system of N sites.
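The coupling free energy of the double-mutant cycle in Eq. (4) follows directly from four measured free energies: wild type, the two single mutants, and the double mutant. The sketch below shows the arithmetic in Python. The function name and the numerical values are hypothetical and serve only to illustrate the sign convention.

```python
def coupling_free_energy(g_wt, g_i, g_j, g_ij):
    """Coupling free energy of a double-mutant cycle.

    Arguments are the free energies of the process under study (binding or
    stability) for wild type, the single mutants at sites i and j, and the
    double mutant, all in the same units.
    """
    dg_i = g_i - g_wt           # first-order perturbation at site i
    dg_j = g_j - g_wt           # first-order perturbation at site j
    dg_double = g_ij - g_wt     # perturbation applied at both sites
    return dg_double - dg_i - dg_j


# Hypothetical binding free energies in kcal/mol.
coupled = coupling_free_energy(g_wt=-10.0, g_i=-8.0, g_j=-9.0, g_ij=-7.5)
additive = coupling_free_energy(g_wt=-10.0, g_i=-8.0, g_j=-9.0, g_ij=-7.0)
print(coupled, additive)
```

In the first case the double mutant loses less binding free energy than the sum of the two single mutants, so the coupling term is negative; in the second case the effects are additive and the coupling term vanishes.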
Previous discussions of double-mutant cycles have emphasized the importance of dissecting the interaction between residues in the analysis of protein stability and ligand recognition (Carter et al., 1984; Ackers and Smith, 1985; Shortle and Meeker, 1986; Horovitz and Fersht, 1990; Wells, 1990; Mildvan et al., 1992; LiCata and Ackers, 1995). Particularly important has been the analysis by Horovitz and Fersht (1990), who were the first to point out that these cycles can also be used to dissect more complex interactions involving multiple sites. Our approach differs from all previous analyses of the same problem insofar as it exploits the principles of site-specific thermodynamics (Di Cera, 1995) and their general applicability to ligand binding, protein folding, and mutational perturbations. This approach captures the essence of the cooperative interactions among the sites through some basic properties of the free energy for the 0 → 1 transition at any site and the coupling free energy of double-mutant cycles to be discussed below. These properties provide clues on the mechanism underlying site-site coupling independent of any a priori assumption.
B. Site-Specific Transition Modes
Current mutational studies of proteins use single-site Ala scans to assess the energetic contribution of individual residues to stability and ligand recognition. The scan provides information on N site-specific free energies of perturbation, $\Delta G_j$'s, obtained from the difference between the properties of the N single-site mutants and wild type. This is a minuscule amount of information compared to the other $2^N - N - 1$ coupling free energy terms that are necessary to fully describe the energetic balance of the perturbations. The coupling free energies are all zero if perturbations are additive and sites are independent. However, in the presence of nonadditivity and site-site coupling, the coupling free energies are finite and affect the response of the system to mutational perturbations at any given site. The nontrivial consequence of nonadditivity is that the energetic contribution of a residue to stability and ligand recognition depends on the state of other residues and changes as mutations are introduced at other sites. This immediately raises the question of defining the energetic cost of a mutation. An unambiguous answer to this question is only possible for the case N = 1, or in the absence of site-site coupling. For N ≥ 2 the state of other sites must be specified when the energetic cost of perturbing a given site is considered. The free energy cost of a perturbation reflects the free energy change associated with the 0 → 1 transition at the site. When the site is independent from the rest of the system,
the free energy spent in its 0 → 1 transition is equal to $\Delta G_j$ and can be obtained from a scan involving only single-site substitutions. When the site is coupled to the rest of the system, the result depends on the configuration of other sites. For a given mutation at site j there could be as many as $2^{N-1}$ different values of the free energy of perturbation, one for each independent configuration of the remaining N - 1 sites. All values bear equal importance, because they are associated with independent configurations accessible to the system. Hence, in the presence of site-site interactions one should consider a distribution of free energies of perturbation rather than the free energy of perturbation at a given site. The values defining the distribution of free energies for the 0 → 1 transition represent the modes accessible to the site in all possible configurations of the other sites. The properties of the distribution of transition modes have been studied in great detail for Ising lattices where sites are equivalent and interact with all other sites. In this paradigmatic case, the distribution of free energy values is Gaussian, with a standard deviation that measures the average strength of interaction between pairs of sites (Keating and Di Cera, 1993). The distribution of transition modes for specific cases can be interpreted to a first approximation in terms of the properties of the Ising lattice model, with the mean and standard deviation of the distribution taken as measures of the susceptibility of the site to the perturbation and the strength of coupling with other sites. These parameters are derived from the ensemble of configurations accessible to the system and provide a more realistic assessment of the energetic contribution of a given site to stability or ligand recognition.
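The notion of a distribution of transition modes can be made concrete with a small numerical sketch. The Python code below assumes a hypothetical three-site system whose free energy contains only first-order and pairwise terms. The parameter values, dictionary names, and function names are invented for illustration and do not come from any of the systems discussed in this chapter.

```python
from itertools import product
from statistics import mean, pstdev

# Hypothetical parameters (kcal/mol) for a three-site system.
DG1 = {1: 1.0, 2: 2.0, 3: 0.5}                     # first-order terms
DG2 = {(1, 2): -0.5, (1, 3): 0.3, (2, 3): 0.0}     # pairwise coupling terms


def free_energy(config):
    """Free energy of a configuration (tuple of 0/1) relative to wild type."""
    sites = [index + 1 for index, state in enumerate(config) if state == 1]
    total = sum(DG1[site] for site in sites)
    total += sum(DG2[(a, b)] for a in sites for b in sites if a < b)
    return total


def transition_modes(site, n_sites=3):
    """Free energy of the 0 -> 1 transition at `site`, one value for each
    configuration of the other n_sites - 1 sites."""
    modes = []
    for others in product((0, 1), repeat=n_sites - 1):
        unperturbed = list(others)
        unperturbed.insert(site - 1, 0)
        perturbed = list(others)
        perturbed.insert(site - 1, 1)
        modes.append(free_energy(tuple(perturbed)) - free_energy(tuple(unperturbed)))
    return modes


modes = transition_modes(1)
print(modes)                        # 2**(N - 1) values for site 1
print(mean(modes), pstdev(modes))   # susceptibility and spread
```

The first entry of `modes` is the value a single-site scan would report; the remaining entries are accessible only when mutations at other sites are also constructed.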
C. Properties of the Coupling Free Energy
As for the site-specific perturbation free energy, the coupling between two sites may be subject to the state of other sites in the system. This problem was first approached by Horovitz and Fersht (1990) in the analysis of double-mutant cycles. They concluded that by comparing the values of the coupling free energy in a cycle in the two states, wild type and mutated, of a third residue it is possible to assess whether the third residue affects the interaction between the two sites. This approach can be extended to an arbitrary number of sites by constructing a hierarchy of perturbed cycles: first, the effect of a third site is examined on the coupling between two sites; then, the effect of a fourth site is studied on the coupling between the third site and the first two sites; and so forth. This approach is somewhat cumbersome and overlooks a basic
property of the coupling between two sites that is key to unravel the mechanism of site-site interaction. The coupling free energy between two mutations at sites i and j has been defined in the double-mutant cycle in Eq. (4) by implicitly assuming that all other sites are in state 0. Since the system contains N sites and sites i and j may be linked to each other and also to other sites, in general other sites may influence the coupling between sites i and j. By studying how the coupling between any two sites is affected by the configuration of other sites, much information can be gained on the mechanism of coupling. A cycle analogous to that in Eq. (4) can be drawn for any configuration of the other N - 2 sites. There are $2^{N-2}$ such configurations and N(N - 1)/2 distinct pairs of sites, i and j, leading to a total of $N(N - 1)2^{N-3}$ possible thermodynamic cycles and coupling free energies. Not all these cycles are independent. In fact, the energetics of the system are fully described in terms of $2^N - 1$ independent terms, N of which are site-specific free energies of perturbation $\Delta G_j$'s and the remainder are coupling free energy terms from second up to Nth order. Since no additional information can be generated by constructing double-mutant cycles beyond that embodied by the independent coupling terms, of the $N(N - 1)2^{N-3}$ possible cycles only $2^N - 1 - N$ are necessarily independent. However, for any given pair of sites i and j, the $2^{N-2}$ coupling free energy values are all independent. Once coupling free energies are calculated for all possible configurations of the system, it is possible to decipher the code for site-site interactions by using the following property of a thermodynamic cycle whose proof is given elsewhere (Di Cera, 1995):
If the coupling between two sites is direct and involves only second-order interactions, then the coupling free energy is independent of the configuration of other sites. Otherwise, the coupling is indirect and involves interactions higher than second order.
To understand the significance of this property of the coupling free energy, it is useful to consider two key examples of direct and indirect coupling. An example of direct coupling is provided by models of nearest neighbor interactions, like the Ising model (Wang and Di Cera, 1996) or the Koshland-Nemethy-Filmer model of ligand binding cooperativity (Koshland et al., 1966). In these models, interactions involve pairs of sites and are therefore second order in terms of the model-independent site-specific formalism (Di Cera, 1995). No matter how two sites are linked to each other and to any other site, the coupling between them remains energetically the same as the system changes its configuration.
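The property stated above can be checked numerically on a toy system. The Python sketch below assumes a hypothetical three-site free energy built from first-order terms, pairwise terms, and one optional extra term that is switched on only when all three sites are mutated. This parameterization and its numerical values are assumptions made for illustration; the extra term is not the same quantity as the third-order dismutation free energy defined earlier.

```python
def free_energy(config, dg1, dg2, dg3=0.0):
    """Free energy of a three-site configuration relative to wild type.

    dg1: first-order terms for sites 1, 2, 3
    dg2: pairwise terms for the pairs (1,2), (1,3), (2,3)
    dg3: extra term present only when all three sites are mutated
    """
    s1, s2, s3 = config
    total = s1 * dg1[0] + s2 * dg1[1] + s3 * dg1[2]
    total += s1 * s2 * dg2[0] + s1 * s3 * dg2[1] + s2 * s3 * dg2[2]
    total += s1 * s2 * s3 * dg3
    return total


def coupling_12(state3, **model):
    """Coupling free energy of the cycle for sites 1 and 2 with site 3 held
    in `state3` (0 = wild type, 1 = mutated)."""
    def g(s1, s2):
        return free_energy((s1, s2, state3), **model)
    return g(1, 1) + g(0, 0) - g(1, 0) - g(0, 1)


# Hypothetical parameters in kcal/mol.
PAIRWISE = dict(dg1=(1.0, 2.0, 0.5), dg2=(-0.5, 0.3, 0.2))
HIGHER = dict(dg1=(1.0, 2.0, 0.5), dg2=(-0.5, 0.3, 0.2), dg3=0.4)

print(coupling_12(0, **PAIRWISE), coupling_12(1, **PAIRWISE))
print(coupling_12(0, **HIGHER), coupling_12(1, **HIGHER))
```

With pairwise terms only, the coupling between sites 1 and 2 is the same whether site 3 is wild type or mutated, even though site 3 is coupled to both. Once the extra term is present, the two values differ.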
The configuration independence of direct coupling has the nontrivial consequence that when the coupling between a pair of sites is not affected by a third site, one cannot conclude that the third site is not coupled to the pair as the Horovitz-Fersht approach would imply (Horovitz and Fersht, 1990). In fact, in any nearest neighbor model where the third site is coupled to each site in the pair, the state of the third site is inconsequential on the coupling free energy of the pair. Though somewhat counterintuitive, this conclusion can be proved in a rigorous manner (Di Cera, 1995) and provides an important reference point for the correct interpretation of coupling free energy profiles. Indirect coupling manifests itself in a more obvious manner. An example of indirect coupling is provided by the Monod-Wyman-Changeux model of concerted allosteric transitions (Monod et al., 1965) where interactions involve all sites through a linked global conformational change. As a result, sites are always positively coupled and the order of coupling changes according to the state of other sites as the protein switches from one state to another. The widely used mechanisms of cooperativity discussed above make very unrealistic predictions on the properties of the coupling free energy. In the case of nearest neighbor interactions, the prediction is that the coupling free energy between two sites will not depend on the state of other sites, while the concerted two-state model predicts coupling free energy values that change with the state of other sites but are always negative. There is obviously no a priori reason why the coupling between two sites should be independent of the state of other sites or it should always be negative. As we shall see in practical applications, coupling free energies are often positive, contrary to the prediction of the concerted allosteric model, and also change with the state of other sites, contrary to the prediction of any nearest neighbor model.
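The prediction of the concerted two-state model can be illustrated with a short calculation. The Python sketch below assumes a protein in equilibrium between two global states, T and R, with equivalent sites whose 0 → 1 transition has a different equilibrium constant in each state. The statistical weight of a configuration with k sites in state 1 is then $Z_k = K_R^k + L K_T^k$, and the coupling between two sites when m other sites are in state 1 is $-RT \ln(Z_{m+2} Z_m / Z_{m+1}^2)$. All parameter values and function names are assumptions chosen for illustration.

```python
from math import log

RT = 0.592  # kcal/mol, assuming a temperature near 25 degrees C


def mwc_free_energy(k, L=1000.0, k_t=0.1, k_r=1.0):
    """Free energy, relative to the unperturbed protein, of a configuration
    with k sites in state 1 in a concerted two-state model.

    L is the T/R equilibrium constant of the unperturbed protein; k_t and
    k_r are the site-specific equilibrium constants in the T and R states.
    """
    return -RT * log((k_r**k + L * k_t**k) / (1.0 + L))


def mwc_coupling(m, **params):
    """Coupling free energy between two sites when m other sites are in state 1."""
    def g(k):
        return mwc_free_energy(k, **params)
    return g(m + 2) + g(m) - 2.0 * g(m + 1)


couplings = [mwc_coupling(m) for m in range(3)]
print(couplings)
```

Because $Z_{m+2} Z_m - Z_{m+1}^2 = L (K_R K_T)^m (K_R - K_T)^2$ cannot be negative, the coupling computed in this way is never positive, while its magnitude depends on how many other sites are perturbed.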
These shortcomings call for more elaborate descriptions of cooperativity that merge the basic tenets of the two models. The underlying implication of the foregoing property of the coupling free energy is that basic mechanisms of coupling like direct nearest neighbor interactions or indirect, concerted allosteric transitions can be identified from the analysis of double-mutant cycles. Critical to the implementation of this approach is the availability of a high-dimensional manifold of perturbations where the coupling between two sites can be studied in terms of a large number of configurations of other sites in the system. This poses challenging tasks from an experimental standpoint because the construction of triple or higher order mutants in a protein is clearly problematic. However, this approach appears to be ideally suited for the site-specific dissection of ligand recognition when most of the perturbations are introduced in small peptides that bind to the
protein. A detailed characterization of enzyme specificity can be obtained through the combination of perturbations in the ligand and the protein that generates the complexity necessary to dissect the high-order interactions in the system. A paradigmatic example of how this approach can be implemented in practice in the study of protein structure and function is offered next in Section V, dealing with the energetic origin of thrombin specificity.
V. SITE-SPECIFIC DISSECTION OF THROMBIN SPECIFICITY
A. Substrate Recognition by Serine Proteases
Serine proteases of the chymotrypsin family (Rawlings and Barrett, 1993, 1994) participate in key physiological functions like digestion, blood coagulation, fibrinolysis, and complement activation (Neurath, 1984; Perona and Craik, 1995). Proteases involved in digestive processes, like trypsin, have wide specificity and are also found in organisms as primitive as eubacteria. In contrast, proteases involved in the more specialized functions of blood coagulation, fibrinolysis, and complement activation have narrow specificity and are found almost exclusively in vertebrates (Doolittle and Feng, 1987; Patthy, 1990). Among these more evolved proteases, activity and specificity are controlled allosterically by the binding of Na+, whereas more primitive proteases and those involved in fibrinolysis are apparently devoid of such regulation (Dang and Di Cera, 1996). Serine proteases share a common fold composed of two six-stranded β-barrels of similar structure that pack together asymmetrically to host at their interface the residues of the catalytic triad H57, D102, and S195 (Lesk and Fordham, 1996). The catalytic triad polarizes the side chain of the active site S195 for a nucleophilic attack on the C atom of the scissile bond of the substrate. The C atom is converted into a tetrahedral intermediate in the transition state, which is stabilized by hydrogen bonds between the charged carbonyl O atom of the peptide group of the scissile bond and the amide hydrogen atoms of G193 and S195 forming the oxyanion hole. The substrate is then acylated by the Oγ atom of S195 after transfer of a proton to H57 and its C-terminal fragment is released. Deacylation is catalyzed by the nucleophilic attack of a water molecule that releases the carboxylic acid product and the N-terminal fragment of the substrate restoring the state of the catalytic triad.
D102 anchors H57 in the correct orientation for proton transfer from and to S195, compensating for the developing positive charge (Warshel et al., 1989).
Although serine proteases have a common catalytic mechanism, they differ widely in specificity. The molecular origin of this difference remains elusive. The preference of trypsin-like enzymes for Arg residues is due to the presence of D189 at the bottom of the catalytic pocket. Replacement of G216 and G226 by Ala in trypsin mimics the molecular environment of elastase, partially occludes the primary specificity pocket, and changes the specificity from Arg to Lys residues (Craik et al., 1985). However, these substitutions fail to elicit the expected preference for amino acids with small hydrophobic side chains as found in elastase. In chymotrypsin, residue 189 is a Ser and the preference is for bulky aromatic side chains. However, the D189S replacement in trypsin does not result in a chymotrypsin-like specificity. This is instead obtained by exchange of the chymotrypsin surface loops 185-188 and 221-225 with the homologous regions in trypsin (Hedstrom et al., 1992), though none of the residues in these loops contacts the bound substrate. These observations suggest a molecular basis for protease specificity that may involve multiple critical sites. A useful framework to approach the study of protease specificity takes into account the interactions made by the enzyme with the substrate (Schechter and Berger, 1967). Residues of the substrate interacting with the enzyme are labeled with a P and a number from 1 to N, starting from the scissile bond and moving to the N-terminus. Residues of the enzyme making contacts with the substrate are called specificity sites and are labeled with an S. The amino acid at P1 of the substrate makes contacts with the specificity site S1 of the enzyme, P2 contacts S2, and so forth. The P residues of the substrate are necessarily contiguous in sequence, whereas this restriction does not apply to the S residues of the enzyme.
Residues on the C-terminal portion of the scissile bond of the substrate are numbered P1', P2', and so forth, and the corresponding specificity sites on the enzyme are S1', S2', and so on. The scissile bond is positioned between P1 and P1'. The existence of multiple recognition sites effectively narrows down specificity by reducing the probability that the required sequence is found in a random sample of potential substrates. The longer the consensus sequence interacting with the enzyme, the smaller the probability that it will occur in another potential substrate. The best illustration of the effectiveness of this strategy is provided by the blood-clotting proteases (Davie et al., 1991). These enzymes have a trypsin-like primary specificity and cut substrates at Arg residues. However, they do so with extraordinary selectivity because of several other interactions made by the substrate with the enzyme. For example, the vitamin K-dependent factors Xa, thrombin and activated protein C are highly homologous in
sequence and have a similar three-dimensional structure (Bode et al., 1992; Padmanabhan et al., 1993; Mather et al., 1996). Yet, factor Xa is the only protease in the blood that can generate thrombin from prothrombin and thrombin is the only protease that can activate protein C. The subtlety of the molecular strategy to achieve specificity in these enzymes is astounding. For example, the chromogenic tripeptide substrate Asp-Arg-Arg-p-nitroanilide is 67-fold more specific for thrombin than activated protein C when Asp at P3 is in the L enantiomer, whereas it is 22-fold more specific for activated protein C than thrombin when Asp at P3 is in the D enantiomer (Dang and Di Cera, 1997). Understanding the molecular origin of specificity in serine proteases is important for structure-function and evolutionary studies and also bears on rational drug design. In the case of a serine protease like thrombin, an orally available active-site inhibitor could offer a safer alternative to anticoagulants like warfarin and heparin (Stirling, 1995; Grinnell, 1997). In some cases, the optimal sequence of an inhibitor can be deduced from that of a natural substrate. For example, the first chromogenic substrates and active-site inhibitors of thrombin were synthesized after the sequence of fibrinopeptide A (Blomback et al., 1969; Svendsen et al., 1972). Highly selective substrates for activated protein C have been designed after the sequence of the natural substrate Va (Dang and Di Cera, 1997). In general, this information may not be sufficient if the natural substrate interacts with the enzyme through residues that are distant in sequence although close in spatial arrangement due to a precise tertiary structure that cannot be mimicked by a short peptide. Structure-function correlations are often used in this case and form the basis of the empirical approaches of medicinal chemistry (Claeson, 1994).
This entails the synthesis and laboratory testing of substrate libraries containing hundreds of molecules. A modern but equally empirical incarnation of the same strategy is phage display (G. P. Smith, 1985; Smith and Petrenko, 1997) and its extension to the method of peptides on phage (Scott and Smith, 1990; Devlin et al., 1990; Cwirla et al., 1990; Ding et al., 1995). A rational approach to specificity in serine proteases must address two critical questions: What is the free energy cost of a replacement made at a P or S site? And do these sites contribute to recognition independently or cooperatively? These questions are central to the analysis of mutational effects discussed earlier in Section IV. The free energy cost of a given perturbation can be estimated from the distribution of values obtained by perturbing all recognition sites. The coupling pattern between pairs of substitutions can be identified in a similar manner leading to important information on the mechanism of linkage. Construction
of an ad hoc set of perturbations in the sites involved in the recognition event enables the identification of the energetic signatures of specificity.
B. Thrombin Structure and Function
The serine protease thrombin is capable of two important and opposite roles that are at the basis of the efficiency of blood coagulation (Fenton, 1988; Mann et al., 1990; Davie et al., 1991; Gailani and Broze, 1991; Berliner, 1992; Grinnell, 1997; Di Cera et al., 1997). The procoagulant role entails the conversion of fibrinogen into the insoluble fibrin clot, the promotion of platelet aggregation, the stabilization of the ensuing clot by activation of factor XIII, and the feedback enhancement of its own generation from prothrombin by activation of factors V, VIII, and XI. The anticoagulant role involves the thrombomodulin-assisted conversion of protein C into an active component that cleaves and inactivates factor Va together with protein S, thereby limiting the conversion of prothrombin into thrombin catalyzed by the prothrombinase complex. In addition to its primary roles in coagulation, thrombin elicits a variety of important effects on a number of cell lines upon binding to its receptors (Vu et al., 1991; Grand et al., 1996; Ishihara et al., 1997). A list of natural substrates for thrombin is given in Table II. With the exception of the newly identified second thrombin receptor, there is a consensus Arg at P1. There seems to be little conservation at other sites around the cleaved bond. A striking difference emerges from the comparison of fibrinogen and protein C insofar as this substrate carries

TABLE II
Site of Cleavage (↓) by Thrombin on Natural Substrates

| Substrate | Sequence | Reference |
| Fibrinogen (Aα chain) | FLAEGGGVR↓GPRVVERH | Blomback (1969) |
| Fibrinogen (Bβ chain) | NEEGFFSAR↓GHRPLDK | Blomback (1969) |
| Factor XIII | TVELEGVPR↓GVNLQQ | Takagi and Doolittle (1974) |
| Factor VIII | LSNNAIGPR↓SFSQNSRHP | Eaton et al. (1986) |
| Factor V | RLAAALGIR↓SFRNSSLNQ | Mann et al. (1988) |
| Factor VII | RNASKPQGR↓IVGGKVCPK | Hagen et al. (1986) |
| Thrombin receptor 1 | ATNATLDPR↓SFLLRNPND | Vu et al. (1991) |
| Thrombin receptor 3 | LAKPTLPIK↓TFRGAPPNS | Ishihara et al. (1997) |
| Protein C | NQGDQVDPR↓LIDGKMTRR | Foster and Davie (1984) |

Asp at P3 and P3', where fibrinogen has Gly or Ser (P3) or Arg (P3') residues. This suggests that the specificity site S3 of thrombin cannot be the same for fibrinogen and protein C and that a conformational transition must take place when thrombin switches from a procoagulant to an anticoagulant factor. This transition is linked to the release of Na+ from its site, leading to the fast→slow conversion of the enzyme (Wells and Di Cera, 1992). Na+ is required for the optimal conversion of fibrinogen into fibrin monomers, which is catalyzed by the fast (Na+-bound) form with high specificity. The slow (Na+-free) form of thrombin performs the same task with much lower specificity. This form, on the other hand, has higher specificity than the fast form toward protein C (Dang et al., 1995, 1997a; Berg et al., 1996; Rezaie and Olson, 1997). Na+ is the most important ligand of thrombin because of its ubiquitous presence in the physiological milieu where the enzyme functions in vivo. The fact that the Na+ concentration is tightly controlled in the blood does not detract from the importance of Na+ as the key effector of thrombin function. In fact, Na+ is actively exchanged in the transition state upon binding of fibrinogen or protein C, just as the proton of the active site histidine is actively exchanged during catalysis although the pH of the solution remains constant. Since fibrinogen binds to the fast form with higher affinity, it promotes the slow→fast conversion and Na+ binding. On the other hand, binding of protein C promotes the fast→slow conversion and Na+ release. Hence, the notion that a constant concentration of Na+ in the blood must result in a constant saturation of the Na+ site during all steps of thrombin catalysis or ligand recognition contradicts the very basic principles of linkage thermodynamics (Wyman, 1948, 1964) and is fundamentally wrong. Na+ binding and dissociation are the key molecular events that control substrate recognition by thrombin.
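The comparison of P3 and P3' residues made above follows mechanically from the Schechter and Berger nomenclature once the position of the scissile bond is known. The Python sketch below labels the residues on either side of the cleaved bond for three of the substrates listed in Table II. The function and variable names are hypothetical; the sequences are those given in the table, split at the cleavage site.

```python
def schechter_berger(n_side, c_side):
    """Label substrate residues around the scissile bond.

    n_side: residues on the N-terminal side of the bond (P1 is the last one)
    c_side: residues on the C-terminal side of the bond (P1' is the first one)
    """
    labels = {}
    for position, residue in enumerate(reversed(n_side), start=1):
        labels[f"P{position}"] = residue
    for position, residue in enumerate(c_side, start=1):
        labels[f"P{position}'"] = residue
    return labels


# Sequences from Table II, split at the site of cleavage.
SUBSTRATES = {
    "Fibrinogen (B-beta chain)": ("NEEGFFSAR", "GHRPLDK"),
    "Thrombin receptor 1": ("ATNATLDPR", "SFLLRNPND"),
    "Protein C": ("NQGDQVDPR", "LIDGKMTRR"),
}

for name, (n_side, c_side) in SUBSTRATES.items():
    labels = schechter_berger(n_side, c_side)
    print(name, labels["P3"], labels["P2"], labels["P1"],
          labels["P1'"], labels["P2'"], labels["P3'"])
```

For protein C the labels place Asp at both P3 and P3', whereas the Bβ chain of fibrinogen has Ser at P3 and Arg at P3'.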
Thrombin is composed of two polypeptide chains of 36 (A chain) and 259 (B chain) residues that are covalently linked through a disulfide bond (Bode et al., 1992). The B chain carries the functional epitopes of the enzyme and has an overall architecture similar to that of pancreatic serine proteases (Fig. 1). The extraordinary specificity of thrombin toward fibrinogen arises not only from contacts made in the interior of the active site (see below) but also from interactions with exosite I, located about 20 Å away from the active site (Martin et al., 1992; Stubbs et al., 1992). This region is homologous to the Ca2+-binding loop of trypsin and chymotrypsin, is rich in positively charged residues, and is stabilized by the side chain of K70, which permanently substitutes for the Ca2+. Exosite I serves as an extended primed recognition site. Binding of hirudin derivatives or thrombomodulin to this site also allosterically enhances Na+ binding and switches the enzyme to the fast form, thereby changing activity and specificity (Ayala and Di Cera, 1994; Guinto and Di Cera,
ENRICO DI CERA
FIG. 1. Ribbon rendering of the B chain of human thrombin in the fast form derived from the thrombin-hirudin complex (Rydel et al., 1991) with the inhibitor removed. The side chains of the catalytic triad are shown and occupy the interface between the two β-barrel domains. The catalytic triad is located about 20 Å away from the bound Na+ (circle). Important regions of the enzyme are noted.
1997; Vindigni et al., 1997b). Another factor that influences thrombin specificity is the W60d insertion loop, which is unique to thrombin and shapes the apolar specificity site S2. This loop significantly narrows access to the active site by protruding into the solvent. Replacement of W60d with the less bulky Ala or Ser profoundly affects the interaction of thrombin with the natural inhibitor antithrombin III (Rezaie, 1996) or fibrinogen (Guinto et al., 1995; Guinto and Di Cera, 1997). A similar function has been hypothesized for the autolysis loop shaping the lower rim of the access to the active site, but proteolytic cleavage of this loop produces no significant functional changes (Hofsteenge et al., 1988; Brezniak et al., 1990). Deletion of the entire loop, however, results in a selective loss of fibrinogen binding (Dang et al., 1997b). The Na+ binding site (Fig. 2) displays octahedral coordination involving the carbonyl O atoms of R221a and K224 and four buried water
FIG. 2. Molecular environment of the Na+ binding site of thrombin (1hah.pdb) and the conspicuous network of water molecules embedding the region. The hydrogen-bonding network involves the bound Na+, 17 water molecules (gray circles), and several protein atoms (dark circles). Some hydrogen bonds (continuous lines) are conserved topologically with trypsin, which does not bind Na+. Others (dashed lines) are specific to thrombin. The bound Na+ is octahedrally coordinated by the carbonyl O atoms of K224 and R221a and four water molecules (419, 416, 445, and 447). The side chain of D189 is nearby, with water 447 mediating a contact between Oδ of D189 and the bound Na+.
molecules that are tetrahedrally coordinated by protein atoms and other water molecules (Di Cera et al., 1995; Zhang and Tulinsky, 1997) that altogether define a complex hydrogen-bonding network within the catalytic pocket (Krem and Di Cera, 1998). Some of the hydrogen bonds in the network are conserved with trypsin (Bartunik et al., 1989). Others are specific to thrombin and are associated with Na+ and its coordination shell. The bound Na+ is located 15-20 Å away from the catalytic triad and lies within 5 Å of D189 in the specificity site S1, with a water molecule mediating a hydrogen-bonding interaction between the bound Na+ and Oδ of D189. The Na+ site lies within a cylindrical cavity formed by three antiparallel β-strands of the B chain (M180-Y184a, Y225-Y228, V213-C220), diagonally crossed by E188-E192 and shaped by the loop D221-K224 connecting the last two β-strands (Fig. 3). The sequence C220-G226 involving the Na+-binding loop and part of the last β-strand of the B chain is absolutely conserved in thrombin from 11 different species, from hagfish to human (Banfield and MacGillivray, 1992), supporting the importance of Na+ binding in thrombin function. A crucial residue controlling Na+ specificity in thrombin and all serine proteases of the chymotrypsin family (Rawlings and Barrett, 1993, 1994) is Y225, whose mutation to Pro abolishes Na+ binding (Dang and Di Cera, 1996) and produces a thrombin stabilized in the anticoagulant slow form that has enhanced specificity toward protein C (Dang et al., 1997a). The Na+ site also appears to be stabilized by three ion pairs: R221a is ion paired to E146 of the autolysis loop; K224 is ion paired to E217; D221 and D222 form a bidentate ion pair with R187 (Fig. 3). The effects of altering the bidentate ion pair are revealed by the double mutant D221A/D222K made to mimic the sequence found in factor Xa in the same region (Di Cera et al., 1995).
This mutant is stabilized in a conformation that is intermediate between the slow and fast forms, with reduced activity toward fibrinogen but enhanced activity toward protein C. There is no evidence of bound Na+ in the crystal structure of this mutant, and disruption of the ion pair renders portions of the 184 loop completely disordered (Zhang, Tulinsky, Guinto, and Di Cera, unpublished results). The effects of disrupting the R221a-E146 ion pair are revealed by the properties of the natural mutant thrombin Salakta (Miyata et al., 1992), E146A, which has a reduced clotting activity. The R221aA mutant displays similar properties and has reduced Na+ binding (Dang et al., 1997a). Disruption of the K224-E217 ion pair in the E217A mutant produces a drastic loss of clotting activity with a modest reduction of protein C activation (Gibbs et al., 1995). The K224A replacement produces similar effects and reduced Na+ binding (Dang et al., 1997a).
FIG. 3. Contacts between the irreversible inhibitor H-D-Phe-Pro-Arg-CH2Cl and the active site of thrombin (Bode et al., 1992). Shown are the thrombin residues D189, P60c, W60d, L99, and W215 that interact with the inhibitor. The guanidyl group of the Arg at P1 makes an ion pair with the carboxyl group of D189 at S1 at the bottom of the active site. Pro at P2 packs optimally in the S2 apolar cavity provided by the W60d loop. H-D-Phe at P3 makes favorable hydrophobic contacts in the cleft with L99 and especially a perpendicular aryl-aryl edge-on interaction with W215 at S3. The D enantiomer of Phe at P3 enables it to interact favorably with the aromatic moiety at S3 with minimum strain in the backbone of the inhibitor, fixed by Pro at P2. Also shown are the residues of the catalytic triad and the backbone of the 215-224 segment comprising the Na+ (circle) binding loop. R221a is ion paired to E146 in the neighboring autolysis loop, whereas K224 is ion paired to E217.
C. Site-Specific Probes
The molecular strategy used by thrombin to achieve specificity toward fibrinogen and protein C is deeply rooted in the mechanism through which Na+ binding affects the environment of the active site of the
enzyme. Recent attempts to crystallize the slow form of thrombin (Zhang and Tulinsky, 1997) have documented extraordinarily small changes compared to the fast form. Based on these structural results, the slow and fast forms of thrombin should be functionally equivalent. Since there is overwhelming experimental evidence to the contrary (Wells and Di Cera, 1992; Ayala and Di Cera, 1994; Dang et al., 1995, 1997a; Vindigni and Di Cera, 1996; Di Cera et al., 1997; Vindigni et al., 1997a,b), a more effective approach to understanding the molecular origin of thrombin specificity should emphasize the functional energetics using the strategy already outlined in Section IV. The main question is how the Na+-induced slow→fast conversion enhances specificity toward fibrinogen and small chromogenic substrates. A related question is which allosteric form should be targeted with active-site inhibitors to guarantee optimal specificity. In both cases, much information can be gained from a dissection of the energetic contribution of the specificity sites. Current studies of enzyme specificity employ libraries generated from combinatorial chemistry or phage display to identify consensus sequences for binding. Substrates generated in this manner, however, can also be used as powerful probes of the molecular environment of the specificity sites of the enzyme to elucidate how they contribute to recognition in the transition state. The theoretical developments outlined in Section IV make it possible to unravel site-specific contributions to binding from the effects of small ad hoc perturbations introduced in the system. If the perturbations are made in the sequence of a substrate to generate a library containing all species required for a site-specific analysis, much information can be derived on the energetics of the specificity sites that is difficult to obtain from mutagenesis of the enzyme.
In order to understand the molecular origin of the higher specificity of the fast form toward fibrinogen, the chromogenic tripeptide substrate FPR (Table III) was synthesized to mimic the interaction of the natural substrate with the active site of the enzyme (Stubbs et al., 1992; Martin et al., 1992). Like fibrinogen, this synthetic substrate is cleaved by the fast form with a specificity 30-fold higher than that of the slow form (Table IV). The crystal structure of thrombin inhibited with H-D-Phe-Pro-Arg-CH2Cl (Bode et al., 1992) provides information on the interactions of the P1-P3 groups of FPR with the enzyme: Arg at P1 makes an ion pair with D189 at S1 at the bottom of the catalytic pocket; Pro at P2 interacts with the apolar moiety of S2 lined by P60b, P60c, and W60d; Phe at P3 forms a favorable edge-to-face interaction with the aromatic ring of W215 at S3 (Fig. 3). The D enantiomer at P3 is necessary to mimic the interaction of F8 at P9 of fibrinogen (Table II) with W215 of thrombin
TABLE III
Substrate Library for the Analysis of Thrombin Specificity

Abbreviation   Substrate                          Site(s) perturbed
FPR            H-D-Phe-Pro-Arg-p-nitroanilide     None
FPK            H-D-Phe-Pro-Lys-p-nitroanilide     P1
FGR            H-D-Phe-Gly-Arg-p-nitroanilide     P2
VPR            H-D-Val-Pro-Arg-p-nitroanilide     P3
FGK            H-D-Phe-Gly-Lys-p-nitroanilide     P1 and P2
VPK            H-D-Val-Pro-Lys-p-nitroanilide     P1 and P3
VGR            H-D-Val-Gly-Arg-p-nitroanilide     P2 and P3
VGK            H-D-Val-Gly-Lys-p-nitroanilide     P1, P2, and P3
after the β-turn made by the Aα chain following the P3-P5 Gly-Gly-Gly flexible region to reenter the catalytic pocket in a parallel configuration to the β-strand hosting W215 (Ni et al., 1992). The chromogenic group p-nitroanilide attached to the C-terminus enables quantitative spectroscopic measurements of the released p-nitroaniline upon cleavage by thrombin at the P1-p-nitroanilide scissile bond. Starting from FPR, seven substitutions were made to generate the library in Table III (Vindigni et al., 1997a). The main idea was to introduce enough perturbation at P1, P2, and P3 while retaining sufficient specificity for accurate experimental measurements. The perturbation would then act as the source of information on the environment of the specificity sites S1, S2, and S3 of the enzyme. H-D-Phe was replaced with H-D-Val in VPR, VPK, VGR, and VGK to alter minimally the size of the side chain while replacing the aromatic moiety with a hydrophobic one. Pro was replaced with Gly in FGR, FGK, VGR, and VGK to avoid steric hindrance with S2 and relieve the rigidity of the P2-P3 bond. Arg was replaced with Lys in FPK, FGK, VPK, and VGK to preserve the positive charge at P1 needed to contact D189 at S1. These substitutions generate all possible intermediates from the parent substrate FPR: the three singly substituted substrates FPK, FGR, and VPR; the three doubly substituted substrates FGK, VPK, and VGR; and the triply substituted substrate VGK. The library therefore maps the intermediates in a three-site system where each site can exist in two states. To obtain the relevant free energy changes associated with the perturbations, the specificity constant $s = k_{cat}/K_m$ for substrate hydrolysis was measured in all cases (Table IV) to estimate the free energy of stability of the transition state. The value for FPR was used to scale energetically all others to obtain the relevant free energy changes in the transition state (Table V). Similar measurements were carried out with the three
TABLE IV
Specificity Constants $k_{cat}/K_m$ (in µM⁻¹ s⁻¹) for the Hydrolysis of Synthetic Substrates by Wild-Type (wt) and Mutant Thrombins in the Slow and Fast Forms^a

                 FPR    FPK     FGR      VPR     FGK       VPK      VGR       VGK
Fast form
  wt             90     7.9     2.0      100     0.021     2.1      0.34      0.0047
  R221aA         80     4.6     0.75     36      0.011     0.96     0.14      0.0024
  K224A          44     7.7     0.93     24      0.027     1.4      0.17      0.0044
  R221aA/K224A   26     3.2     0.33     13      0.011     0.70     0.049     0.0017
Slow form
  wt             3.0    0.35    0.86     6.7     0.0026    0.11     0.17      0.00079
  R221aA         1.6    0.040   0.042    1.0     0.00038   0.0097   0.0086    0.00013
  K224A          0.47   0.034   0.012    0.28    0.00039   0.0063   0.0020    0.00013
  R221aA/K224A   0.34   0.010   0.0025   0.077   0.00021   0.0018   0.00063   0.000063

^a Experimental conditions: 5 mM Tris, I = 200 mM, 0.1% PEG, pH 8.0 at 25°C. The slow form was studied in the presence of 200 mM choline chloride. The properties of the fast form refer to the limit [Na+] → ∞, at constant I = 200 mM. Errors are typically ±2%.
TABLE V
Free Energy Values (in kcal/mol) due to Perturbation of the P1-P3 Sites^a

                 $\Delta G_1$  $\Delta G_2$  $\Delta G_3$  $\Delta G_{12}$  $\Delta G_{13}$  $\Delta G_{23}$  $\Delta G_{123}$
Fast form
  wt             1.4           2.3           -0.1          1.3              0.8              1.1              2.2
  R221aA         1.7           2.8           0.5           0.8              0.5              0.5              1.2
  K224A          1.0           2.3           0.4           1.1              0.7              0.6              1.8
  R221aA/K224A   1.2           2.6           0.4           0.8              0.5              0.7              1.5
Slow form
  wt             1.3           0.7           -0.5          2.2              1.2              1.4              3.3
  R221aA         2.2           2.2           0.3           0.6              0.6              0.7              1.0
  K224A          1.6           2.2           0.3           0.5              0.7              0.8              0.8
  R221aA/K224A   2.1           2.9           0.9           -0.6             0.1              -0.1             -0.8

^a Values were obtained from the specificity constants $s = k_{cat}/K_m$ in Table IV as follows (the suffixes refer to the sequence of the substrate): $\Delta G_1 = -RT\ln(s_{FPK}/s_{FPR})$; $\Delta G_2 = -RT\ln(s_{FGR}/s_{FPR})$; $\Delta G_3 = -RT\ln(s_{VPR}/s_{FPR})$; $\Delta G_{12} = -RT\ln(s_{FGK}s_{FPR}/s_{FPK}s_{FGR})$; $\Delta G_{13} = -RT\ln(s_{VPK}s_{FPR}/s_{FPK}s_{VPR})$; $\Delta G_{23} = -RT\ln(s_{VGR}s_{FPR}/s_{FGR}s_{VPR})$; $\Delta G_{123} = -RT\ln(s_{VGK}s_{FPR}^2/s_{FPK}s_{FGR}s_{VPR})$. Errors are typically ±0.1 kcal/mol.
thrombin mutants R221aA, K224A, and R221aA/K224A to assess the role of the two ion pairs that seem to stabilize the Na+ binding environment (Fig. 3). This resulted in the complete dissection of a five-dimensional manifold of species in both the slow and fast forms of the enzyme, from which detailed information can be derived on how perturbations of the substrate are coupled to each other and to perturbations in the enzyme. The five sites perturbed are P1, P2, and P3 in the substrate and R221a and K224 in the enzyme. The specificity constants relative to the 32 possible intermediates in the manifold are listed in Table IV for each thrombin form.
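The size of the manifold follows from simple counting: five sites with two states each give 2^5 = 32 enzyme-substrate pairs for each allosteric form. The enumeration can be sketched in a few lines of Python. The function and variable names below are illustrative assumptions, not part of the original analysis; the three-letter substrate codes follow Table III.

```python
from itertools import product

# Residue at each substrate site in the unperturbed (0) and perturbed (1) state.
# The codes follow Table III; the dictionary itself is an illustrative construct.
SUBSTRATE_STATES = {"P1": ("R", "K"), "P2": ("P", "G"), "P3": ("F", "V")}


def substrate_code(p1, p2, p3):
    """Three-letter code written in the order P3-P2-P1, e.g. (0, 0, 0) -> 'FPR'."""
    return (SUBSTRATE_STATES["P3"][p3]
            + SUBSTRATE_STATES["P2"][p2]
            + SUBSTRATE_STATES["P1"][p1])


def enzyme_name(r221a, k224):
    """Name of the thrombin species for a given state of the two enzyme sites."""
    mutations = [m for m, on in (("R221aA", r221a), ("K224A", k224)) if on]
    return "/".join(mutations) if mutations else "wt"


# All combinations of the five two-state sites: P1, P2, P3, R221a, K224.
species = [(enzyme_name(r, k), substrate_code(p1, p2, p3))
           for p1, p2, p3, r, k in product((0, 1), repeat=5)]

print(len(species))  # number of enzyme-substrate pairs per allosteric form
print(species[0])    # the reference pair: wild type with FPR
```

Each pair in the list corresponds to one entry of Table IV for a given allosteric form.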
D. Perturbation at the P1-P3 Sites
There is a large and significant nonadditivity in the effects induced by perturbations of the P1-P3 sites that emerges from inspection of the second-order and third-order coupling free energies listed in Table V. The extent of nonadditivity changes for each pair of substitutions and is also affected by the allosteric state of the enzyme and mutations made around the Na+-binding environment. The presence of interactions among the P1-P3 sites generates complexity in the response of the enzyme to specific perturbations of the substrate sequence. This demands an analysis of the perturbations based on the principles illustrated earlier in Section IV.
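The first-, second-, and third-order terms of Table V can be recomputed from the specificity constants of Table IV. The Python sketch below does this for wild-type thrombin in the fast form, assuming T = 298.15 K and R = 1.987 cal/(mol K); the dictionary layout and the function names are assumptions made here for illustration, and the third-order term is taken as the deviation of the triply substituted substrate from the sum of the three single substitutions.

```python
import math

# Thermal energy at 25 °C in kcal/mol (gas constant times absolute temperature).
RT = 1.987e-3 * 298.15

# Specificity constants (µM^-1 s^-1) for wild-type thrombin in the fast form,
# as read from Table IV.
s_fast_wt = {
    "FPR": 90, "FPK": 7.9, "FGR": 2.0, "VPR": 100,
    "FGK": 0.021, "VPK": 2.1, "VGR": 0.34, "VGK": 0.0047,
}


def free_energy(ratio):
    """Free energy change (kcal/mol) associated with a ratio of specificity constants."""
    return -RT * math.log(ratio)


def perturbation_terms(s):
    """First-, second-, and third-order terms relative to the parent substrate FPR."""
    ref = s["FPR"]
    return {
        "dG1": free_energy(s["FPK"] / ref),
        "dG2": free_energy(s["FGR"] / ref),
        "dG3": free_energy(s["VPR"] / ref),
        "dG12": free_energy(s["FGK"] * ref / (s["FPK"] * s["FGR"])),
        "dG13": free_energy(s["VPK"] * ref / (s["FPK"] * s["VPR"])),
        "dG23": free_energy(s["VGR"] * ref / (s["FGR"] * s["VPR"])),
        "dG123": free_energy(s["VGK"] * ref**2 / (s["FPK"] * s["FGR"] * s["VPR"])),
    }


terms = perturbation_terms(s_fast_wt)
for name, value in terms.items():
    print(f"{name}: {value:.1f}")
```

A second-order term of zero would mean that the two substitutions act additively; the positive values obtained here are the nonadditivity discussed in the text.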
The free energy change due to replacing Arg with Lys at P1 in all possible combinations of the state of P2 and P3 is summarized in Table VI. The values are all positive in both the slow and fast forms, for wild-type and mutant thrombins, indicating that the Arg → Lys replacement at P1 always causes a loss of specificity. A similar finding was reported for trypsin (Craik et al., 1985; Perona et al., 1993; Vindigni et al., 1997a) and is due to the change in interaction with D189 in the specificity pocket S1. The side chain of Arg at P1 is long enough for the guanidinium group to form an ion pair with D189 (Fig. 3), but Lys at P1 can interact with D189 only through a water-mediated contact. The cost of this replacement is about 1 kcal/mol in both the slow and fast forms when no replacement is made at P2 and P3, which suggests that the same mechanism may cause the loss of specificity in both allosteric forms. In trypsin, perturbation of the environment of D189 with the mutations G216A and G222A alters the catalytic register of the substrate, leading to a decrease in $k_{cat}$ and an increase in $K_m$ (Perona and Craik, 1995). These changes in catalytic parameters echo those observed in the fast→slow conversion of thrombin for both synthetic substrates (Wells and Di Cera, 1992) and fibrinogen (Vindigni and Di Cera, 1996). This would suggest that binding of Na+ orients the side chain of D189 for optimal coordination of the guanidinium group of Arg at P1, perhaps using water 447 that bridges the bound Na+ and the Oδ atom of D189 (Fig. 2). However, if this were the case, the loss of specificity with the Arg → Lys substitution at P1 would be more pronounced in the fast form, contrary to what is seen experimentally. The similarity of effects between the two forms argues against a direct influence of the allosteric switch on the position of the side chain of D189.
This conclusion is supported by the fact that water 447 is also present in trypsin (Bartunik et al., 1989; Krem and Di Cera, 1997), which does not bind Na+ (Dang and Di Cera, 1996), where it bridges the Oδ atom of D189 to the carbonyl O atom of K224. The origin of the increased specificity of the fast form must therefore reside at other specificity sites. Due to the strong interactions among the P1-P3 sites, the cost of replacing Arg with Lys at P1 depends on the residue at P2 and P3 and reveals the importance of going beyond single-site substitutions in the analysis of ligand recognition. With Gly at P2, the cost of the Arg → Lys replacement at P1 increases significantly by 1.3 kcal/mol in the fast form and 2.1 kcal/mol in the slow form, introducing a significant difference of -0.7 kcal/mol between the two forms (Table VI). This difference measures the coupling between the replacement at P1 and the slow→fast transition. A negative value indicates that the replacement promotes the slow→fast conversion in the transition state, or that the replaced residue
binds preferentially to the slow form. A positive value signals a stabilization of the slow form, or that the replaced residue binds preferentially to the fast form. The absence of coupling seen when no perturbation is present at P2 and P3 cannot be taken as an absolute measure of the molecular events that accompany the substitution at P1. The environment around D189 in the transition state may be different in the slow and fast forms, but this difference becomes appreciable only when Gly is present at P2. Similar considerations apply when a substitution is made at P3, although the energetic penalty for the P1 substitution increases by nearly 1 kcal/mol in both thrombin forms. The extent of interaction of P2 and P3 with P1 is significant. When Gly is present at P2, the interaction with P1 actually exceeds the cost of the replacement at P1 itself in the slow form. The free energy change due to replacing Pro with Gly at P2 in all possible combinations of the state of P1 and P3 is summarized in Table VI. As for the substitution at P1, the values are significantly positive. In this case, the effects tend to be more pronounced in the fast form, underscoring an obvious change in the environment of the S2 site in the slow→fast transition. The significant difference is conducive to stabilization of the slow form in the transition state when Pro is replaced by Gly. The apolar site S2 of thrombin is formed by residues in the W60d loop, which has no counterpart in other serine proteases. Residues in the apolar site are perhaps oriented differently in the slow and fast forms, causing a better discrimination of the residue at P2 in the fast form. W60d may play a key role in this respect because replacement of the bulky side chain with Ser in W60dS abolishes the differences between the slow and fast forms in recognizing substrates with Pro or Gly at P2 (Guinto and Di Cera, 1997). The indole ring of W60d may produce steric hindrance in the slow form but not in the fast form.
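The context dependence of the P1 replacement and its coupling to the slow→fast transition can be checked directly against the wild-type specificity constants of Table IV. In the Python sketch below, the cost of the Arg → Lys replacement is computed for each combination of residues at P3 and P2, and the coupling is taken as the fast-form cost minus the slow-form cost, following the footnote of Table VI. The data layout and function names are illustrative assumptions, as are T = 298.15 K and R = 1.987 cal/(mol K).

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at 25 °C

# Wild-type specificity constants (µM^-1 s^-1) as read from Table IV.
s = {
    "fast": {"FPR": 90, "FPK": 7.9, "FGR": 2.0, "FGK": 0.021,
             "VPR": 100, "VPK": 2.1, "VGR": 0.34, "VGK": 0.0047},
    "slow": {"FPR": 3.0, "FPK": 0.35, "FGR": 0.86, "FGK": 0.0026,
             "VPR": 6.7, "VPK": 0.11, "VGR": 0.17, "VGK": 0.00079},
}

# Context of the P1 replacement: residues at P3 and P2 (P1 is appended below).
CONTEXTS = ("FP", "FG", "VP", "VG")


def p1_cost(form, context):
    """Cost (kcal/mol) of the Arg -> Lys replacement at P1 in a given P3-P2 context."""
    return -RT * math.log(s[form][context + "K"] / s[form][context + "R"])


def allosteric_coupling(context):
    """Fast-form cost minus slow-form cost of the P1 replacement."""
    return p1_cost("fast", context) - p1_cost("slow", context)


for context in CONTEXTS:
    print(f"{context}X  fast {p1_cost('fast', context):.1f}  "
          f"slow {p1_cost('slow', context):.1f}  "
          f"coupling {allosteric_coupling(context):+.1f}")
```

With Pro at P2 and Phe at P3 the coupling is close to zero, whereas with Gly at P2 it is about -0.7 kcal/mol, which is the difference quoted in the text.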
The differences between FPR and FGR are not due to effects on the free substrates arising from differences in the rate of diffusion into the active site. In fact, the substitution at P2 affects not only the specificity constant, which depends on the rate of diffusion, but also $k_{cat}$ (Vindigni et al., 1997a), which depends exclusively on the acylation and deacylation rates pertaining to the transition state (Wells and Di Cera, 1992). The perturbation at P2 depends strongly on the residue present at P1 and P3. The cost of the Pro → Gly replacement increases by 1.2 kcal/mol in the fast form and 2.2 kcal/mol in the slow form as a result of the substitution at P1. This effect is exactly (taking into account round-off error) the same as that seen for the perturbation at P1 when P2 is perturbed, as a consequence of the reciprocity of the linkage between the perturbations at P1 and P2. Again, an approach based on
TABLE VI
Free Energy Change (in kcal/mol) in Specificity due to Perturbation of the P1-P3 Sites^a

            Fast form                           Slow form                           Coupling
            wt    R221aA  K224A  R221aA/K224A   wt    R221aA  K224A  R221aA/K224A   wt    R221aA  K224A  R221aA/K224A
Replacement at P1 (Arg → Lys):
FPX (00)    1.4   1.7     1.0    1.2            1.3   2.2     1.6    2.1            0.2   -0.5    -0.5   -0.8
FGX (10)    2.7   2.5     2.1    2.0            3.4   2.8     2.0    1.5            -0.7  -0.3    0.1    0.5
VPX (01)    2.3   2.1     1.7    1.7            2.4   2.7     2.2    2.2            -0.1  -0.6    -0.6   -0.5
VGX (11)    2.5   2.4     2.2    2.0            3.2   2.5     1.6    1.4            -0.6  -0.1    0.5    0.6
Replacement at P2 (Pro → Gly):
FXR (00)    2.3   2.8     2.3    2.6            0.7   2.2     2.2    2.9            1.6   0.6     0.1    -0.3
FXK (10)    3.5   3.6     3.3    3.4            2.9   2.8     2.6    2.3            0.6   0.8     0.7    1.1
VXR (01)    3.4   3.3     2.9    3.3            2.2   2.8     2.9    2.8            1.2   0.5     0.0    0.5
VXK (11)    3.6   3.5     3.4    3.6            2.9   2.6     2.3    2.0            0.7   1.0     1.1    1.5
Replacement at P3 (Phe → Val):
XPR (00)    -0.1  0.5     0.4    0.4            -0.5  0.3     0.3    0.9            0.4   0.2     0.1    -0.5
XPK (10)    0.8   0.9     1.0    0.9            0.7   0.8     1.0    1.0            0.1   0.1     0.0    -0.1
XGR (01)    1.0   1.0     1.0    1.1            1.0   0.9     1.1    0.8            0.1   0.1     -0.1   0.3
XGK (11)    0.9   0.9     1.1    1.1            0.7   0.6     0.7    0.7            0.2   0.3     0.4    0.4

^a Listed are all possible configurations of the other two P sites (0 = unperturbed; 1 = perturbed) in the order P1, P2, P3, along with the corresponding substrate. Errors are ±0.1 kcal/mol or less. Values were obtained from the data in Tables IV and V. The difference between the values for the fast and slow forms gives the coupling between the substitution and the slow→fast transition. Positive values are indicative of stabilization of the slow form in the transition state, whereas negative values signal stabilization of the fast form. Values of the coupling in excess of ±RT (0.6 kcal/mol) are set in boldface.
single-site substitutions would underestimate the cost of the Pro → Gly replacement at P2 because it would neglect the negative coupling between P1 and P2. Similar arguments apply when the substitution is made at P3. The free energy change due to replacing Phe with Val at P3 in all possible combinations of the state of P1 and P2 is summarized in Table VI. The unexpected finding is that VPR is actually a better substrate for thrombin than FPR itself, contrary to the predictions drawn from the crystal structure (Bode et al., 1992) that emphasize the virtues of an aromatic residue at P3. Furthermore, previous data obtained on bovine thrombin have documented a higher specificity for the sequence FPR compared to VPR (Lottenberg et al., 1983). Introduction of a hydrophobic group at P3 may bring about a more favorable interaction with the hydrophobic moiety of L99 (Fig. 3), which is close to the apolar site S2. Interestingly, residue Y3 of hirudin contacts W215 of thrombin in a manner similar to that of Phe at P9 of the fibrinogen Aα chain (Rydel et al., 1991), but replacement of Y3 with the more hydrophobic Trp brings about a fivefold enhancement in binding affinity (De Filippis et al., 1995), consistent with the enhanced specificity of VPR compared to FPR. The energetic effect linked to replacement of the residue at P3 is of the same magnitude in both forms and excludes a direct involvement of the S3 site in the slow⇌fast equilibrium. The perturbation at P3 depends strongly on the state of P1 and P2. The cost of the Phe → Val replacement increases by 0.9 kcal/mol in the fast form and 1.2 kcal/mol in the slow form as a result of the substitution at P1 and is the reciprocal of the effect seen for the perturbation at P1 when P3 is perturbed. The data in Tables IV and V reveal the presence of coupling among perturbations at P1, P2, and P3.
The coupling is the result of constraints imposed by the enzyme on the bound substrate in the transition state and is therefore revealing of the molecular environment underlying the recognition process (Vindigni et al., 1997a). The coupling free energies for the three possible pairs of P sites in the two possible states of the third site are listed in Table VII. The values are constructed from the specificity constants pertaining to the four species in the double-mutant cycle in Eq. (4), where the mutations are replaced by the substitutions at the P sites. For example, the coupling between P1 and P2 is ${}^{0}\Delta G_{12} = -RT\ln(s_{FGK}s_{FPR}/s_{FPK}s_{FGR})$ in the absence of perturbation at P3 and ${}^{1}\Delta G_{12} = -RT\ln(s_{VGK}s_{VPR}/s_{VPK}s_{VGR})$ when P3 is perturbed. The value of ${}^{0}\Delta G_{12}$ is the same as $\Delta G_{12}$ in Table V. The coupling free energies in the case of wild-type thrombin are mostly positive and quite significant, demonstrating that perturbations at the P1, P2, and P3 sites are negatively
TABLE VII
Coupling Free Energies (in kcal/mol) among the P1-P3 Sites^a

                              Fast form                           Slow form
                              wt    R221aA  K224A  R221aA/K224A   wt    R221aA  K224A  R221aA/K224A
${}^{0}\Delta G_{12}$         1.3   0.8     1.1    0.8            2.2   0.6     0.5    -0.6
${}^{1}\Delta G_{12}$         0.2   0.3     0.6    0.3            0.7   -0.3    -0.6   -0.9
${}^{0}\Delta G_{13}$         0.8   0.5     0.6    0.5            1.2   0.6     0.7    0.1
${}^{1}\Delta G_{13}$         -0.2  -0.1    0.1    -0.0           -0.2  -0.3    -0.4   -0.1
${}^{0}\Delta G_{23}$         1.1   0.5     0.6    0.7            1.4   0.7     0.7    -0.1
${}^{1}\Delta G_{23}$         0.1   -0.0    0.1    0.2            0.0   -0.2    -0.3   -0.3

^a Listed are the two possible configurations of the third P site (0 = unperturbed; 1 = perturbed) in the order P1, P2, P3. Errors are ±0.1 kcal/mol or less.
coupled in enhancing specificity. When a site is perturbed, perturbation at a second site reduces specificity beyond simple additivity. Furthermore, the coupling between any two sites is enhanced by more than 1 kcal/mol when the third site is perturbed, underscoring an even stronger cooperative effect in reducing specificity that progresses with the extent of perturbation in the substrate. There are six possible coupling free energy values for the three pairs, but only four are independent. Hence, the difference between any two values for each pair is exactly the same for all pairs. From the property of the coupling free energy (Section IV,C) we conclude that the sites are coupled indirectly through interactions higher than second order. A simple nearest neighbor mechanism of interaction is inadequate to describe these interactions, and so is a concerted mechanism because of the negative nature of the coupling. A more elaborate mechanism needs to be developed to account for these findings. The nature of the coupling among the P1-P3 sites is such that interactions are significantly weakened in the fast form when any site is perturbed. A similar result is seen for the slow form, with the notable exception of the P1-P2 coupling. The presence of coupling among the P1-P3 sites and the way interactions change upon the allosteric transition reveal the site-specific origin of thrombin specificity.
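The statement that only four of the six coupling free energies are independent can be verified numerically. For any pair of sites, the difference between the coupling measured with the third site unperturbed and perturbed reduces to the same combination of all eight specificity constants, $-RT\ln(s_{FGK}s_{VPK}s_{VGR}s_{FPR}/s_{FPK}s_{FGR}s_{VPR}s_{VGK})$, which is symmetric in the three sites. The Python sketch below checks this with the wild-type fast-form constants of Table IV; the helper names and the indexing of sites are assumptions introduced here for illustration, as are T = 298.15 K and R = 1.987 cal/(mol K).

```python
import math

RT = 1.987e-3 * 298.15  # kcal/mol at 25 °C

# Wild-type, fast-form specificity constants (µM^-1 s^-1) as read from Table IV.
s = {"FPR": 90, "FPK": 7.9, "FGR": 2.0, "VPR": 100,
     "FGK": 0.021, "VPK": 2.1, "VGR": 0.34, "VGK": 0.0047}

# Residues for the unperturbed (0) and perturbed (1) state of P1, P2, P3.
RESIDUES = (("R", "K"), ("P", "G"), ("F", "V"))


def code(state):
    """Substrate code written P3-P2-P1 for a state tuple ordered (P1, P2, P3)."""
    p1, p2, p3 = state
    return RESIDUES[2][p3] + RESIDUES[1][p2] + RESIDUES[0][p1]


def coupling(i, j, third_state):
    """Coupling (kcal/mol) between sites i and j (0 = P1, 1 = P2, 2 = P3)
    from the double-substitution cycle with the remaining site held fixed."""
    k = ({0, 1, 2} - {i, j}).pop()

    def spec(a, b):
        state = [0, 0, 0]
        state[i], state[j], state[k] = a, b, third_state
        return s[code(state)]

    return -RT * math.log(spec(1, 1) * spec(0, 0) / (spec(1, 0) * spec(0, 1)))


pairs = {"12": (0, 1), "13": (0, 2), "23": (1, 2)}
differences = {}
for label, (i, j) in pairs.items():
    unperturbed, perturbed = coupling(i, j, 0), coupling(i, j, 1)
    differences[label] = unperturbed - perturbed
    print(f"pair {label}: {unperturbed:.1f} (0)  {perturbed:.1f} (1)  "
          f"difference {differences[label]:.2f}")
```

The three differences coincide, so fixing the three couplings at one state of the third site plus a single common difference determines all six values.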
E. Why Is the Fast Form More Specific?
The Arg → Lys replacement at P1 slightly promotes the slow→fast transition when Gly is present at P2. On the other hand, the Pro → Gly replacement at P2 strongly stabilizes the slow form. The replacement at P3 is inconsequential on the allosteric equilibrium. Hence, the slow→fast
transition affects mostly the environment of the S2 site, with modest effects on the S1 site and no effect on the S3 site. Constraints at the S2 site account for the lower specificity of the slow form compared to the fast form and become inconsequential if the substrate acquires flexibility with a Gly at P2 and can readjust in the active site to compensate for the increased steric hindrance of the S2 site in the slow form. These findings explain why the thrombin mutant W60dS cleaves FPR with the same specificity in the slow and fast forms (Guinto and Di Cera, 1997) and suggest the bulky side chain of W60d as the likely origin of the constraints at S2. Two factors contribute to specificity: the rigidity of the P2-P3 bond and the strength of the P1-S1 interaction. When Pro is present at P2, the P2-P3 bond is rigid and the substrate finds a more favorable S2 environment in the fast form. Replacement of Pro with Gly causes specificity to drop more in the fast form and reduces the differences between the allosteric forms of thrombin. The acquired flexibility of the P2-P3 bond is almost inconsequential in the slow form because of the more constrained environment of this form around the W60d loop lining the S2 site. The substrate FGR must assume essentially the same conformation as FPR to optimize its interaction with the slow form. In the fast form, however, the increased flexibility of the P2-P3 bond may relax the optimal interaction of Arg at P1 with D189 at S1, this effect being favored by a more open environment of the active site in this form. Consistent with this effect, the second replacement of Arg at P1 with Lys results in a larger energetic penalty in the slow form, because the optimal P1-S1 interaction has already been partially disrupted in the fast form by the Pro → Gly replacement.
The coupling between substitutions at P1 and P2 comes partially from an intrinsic effect on the substrate, the loss of rigidity of the P2-P3 bond, and partially from the different environment of the enzyme in the slow and fast forms. The less constrained environment of the specificity sites in the fast form also acts to reduce the extent of negative coupling among the various perturbations in the substrate, causing the interactions to essentially disappear as more substitutions are made at the P sites. The two ion pairs R221a-E146 and K224-E217 stabilizing the Na+ binding environment (Fig. 3) provide other important constraints in the slow form. The R221aA mutant has a reduced Na+ affinity (Dang et al., 1997a), suggesting that the R221a-E146 ion pair may be broken in the slow form. This conclusion, however, is not supported by the data obtained with the substrate library because disruption of the R221a-E146 ion pair affects specificity more in the slow than the fast form (Table IV). The parameters pertaining to the fast form are practically
unchanged relative to the wild type, whereas those in the slow form show enhanced sensitivity to perturbation at P1 and P2. This perturbation is also less dependent on the state of other groups, indicating a reduction in the coupling among substitutions at the P1-P3 sites. Inspection of the coupling free energy values in Tables V and VII illustrates this point directly. Disruption of the R221a-E146 ion pair has a direct influence on the specificity sites S1 and S2 of the enzyme in the slow form and affects the way these sites discriminate between Arg and Lys at P1 or Pro and Gly at P2. The molecular basis of this effect may be due to enhanced mobility of the autolysis loop on the Glu side of the ion pair upon disruption of the contact. The enhanced mobility may interfere with substrate recognition in the slow form. On the Arg side of the ion pair, the mobility of the Na+-binding loop is hindered upstream by the C220-C191 disulfide bond and downstream by the bidentate ion pair involving D221 and D222 with R187 in the β-strand distal to the 184 loop that defines the Na+ environment from behind (Fig. 1). The increased mobility of the 184 loop subsequent to disruption of the D221,D222-R187 bidentate ion pair leads to the loss of Na+ and several water molecules in the channel (Zhang et al., 1997). In addition, the replacement of the side chain of R221a with Ala may alter the orientation of the carbonyl O atom and reduce the Na+ affinity. This may in turn alter the architecture of the loop and cause a rearrangement of the water molecules in the channel embedding the specificity site S1 (Krem and Di Cera, 1998). The R221a-E146 ion pair contributes to the integrity of the S1 environment in the slow form, but not in the fast form because the perturbation is practically abolished by Na+ binding. The ion pair is energetically stronger in the slow form and may play an important role when Na+ is released from its site.
Perhaps this ion pair tightens up in the fast→slow conversion, bringing the autolysis loop closer to the Na+-binding loop and triggering the release of water molecules from the water channel embedding the specificity pocket S1 and the Na+ site (Zhang and Tulinsky, 1997; Krem and Di Cera, 1998). As for the R221aA mutant, mutation of K224 to Ala reduces the Na+ affinity (Dang et al., 1997a), suggesting that the K224-E217 ion pair may be broken in the slow form, but again this proposal is contradicted by the experimental data that document a larger effect in the slow form (Table IV). Disruption of the K224-E217 ion pair produces effects very similar to those seen for the R221aA mutant, with a reduction of the coupling among the P1-P3 sites especially in the slow form. The ion pair between K224 and E217 bridges two residues on the last two β-strands of the B chain (Fig. 3) and seems to contribute to the integrity of the S1 and S2 environments in the slow form. The region in immediate proximity to K224 and E217 plays a key role in substrate selectivity and is absolutely conserved in thrombin from different species (Banfield and MacGillivray, 1992). The state of this ion pair can therefore control the access of substrates into the bottom of the catalytic pocket where the specificity site S1 is located. Strengthening the ion pair in the slow form may trigger the release of some of the water molecules in the channel embedding the specificity site S1, leading to a reorganization of the network. There is evidence of weak synergism between the ion pairs in the slow form but not in the fast form, as demonstrated by the results on the double mutant R221aA/K224A. The perturbation induced by the double mutation is more drastic and almost abolishes Na+ binding (Dang et al., 1997a). The mutation affects the response to perturbations at the P1-P3 sites, with an effect more pronounced in the slow form. The site-specific parameters are profoundly altered in the slow form and, interestingly, the pairwise coupling pattern shows the disappearance of indirect coupling in both the slow and fast forms, with the onset of positive second-order direct coupling between P1 and P2 (Table VII). This effect is peculiar to the double substitution, though it is somewhat anticipated by the single substitutions. The molecular basis for the synergism between the R221a-E146 and K224-E217 ion pairs in the slow form is in the participation of residues R221a and K224 in Na+ and water coordination. In the fast form, the carbonyl O atoms of R221a and K224 directly ligate the Na+. Mutation of these residues reduces the Na+ affinity, but high concentrations of Na+ oppose the structural perturbation induced by the mutation, restoring a molecular environment for the specificity sites that is essentially that of the fast form of the wild type.
When Na+ is released, the carbonyl O atom of K224 may reorient as seen in the structure of trypsin and may hydrogen bond to water 447 in concert with the carbonyl O atom of R221a (Krem and Di Cera, 1997). Water 447 hydrogen bonds to the side chain of D189 in the specificity pocket S1, and through the switching mechanism any perturbation of R221a and K224 changing the orientation of the carbonyl O atoms will not be compensated as in the case of the fast form and therefore may lead to more drastic structural changes. These changes may eventually affect the orientation of the side chain of D189, bringing about changes in $K_m$ and $k_{cat}$. From the foregoing analysis we conclude that the more constrained environment in the slow form of thrombin is partially due to stronger ion pairs formed by R221a and K224 in the Na+-binding loop with E146
in the autolysis loop and E217 in the penultimate β-strand of the B chain. The integrity of these ion pairs is essential for maintaining the correct architecture of the S1-S3 sites through the effect on the water molecules in the channel that embeds the specificity site S1. The role of the ion pairs in the fast form appears to be less critical, and their disruption can be compensated by the binding of Na+. The origin of the reduced Na+ affinity in these mutants should be seen in a perturbation of the slow form leading to an impaired ability to switch to the fast form. The coupling free energy for allosteric switching from the slow to the fast form in the transition state becomes more negative upon mutation of R221a and K224 (Fig. 4), indicating an increased preference for binding to the fast form.
FIG. 4. Coupling free energy for allosteric switching from the slow to the fast form in the transition state, calculated from the specificity values $s = k_{cat}/K_m$ (Table IV) of the substrates listed in the abscissa (FPR, FPK, FGR, VPR, FGK, VPK, VGR, VGK) as $\Delta G_c = -RT \ln(s_{fast}/s_{slow})$. Separate symbols denote wild-type, R221aA, K224A, and R221aA/K224A. The difference between the two forms increases upon mutation of R221a and K224 due to perturbation of the slow form.
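The switching free energy used in Fig. 4 can be sketched numerically as follows. This is a minimal illustration, assuming a temperature of 25 °C (not stated in this passage) and using hypothetical specificity constants rather than the Table IV values; the function name `switching_coupling` is likewise an illustrative choice.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def switching_coupling(s_fast, s_slow, temperature=298.15):
    """Return dGc = -RT ln(s_fast / s_slow) in kcal/mol.

    temperature defaults to 25 C, which is an assumption of this sketch.
    """
    if s_fast <= 0 or s_slow <= 0:
        raise ValueError("specificity constants must be positive")
    return -R_KCAL * temperature * math.log(s_fast / s_slow)


# Hypothetical specificity constants in the same arbitrary units.
# These are NOT the Table IV values; they only illustrate the arithmetic.
example = switching_coupling(s_fast=5.0e7, s_slow=5.0e6)
print(f"dGc = {example:.2f} kcal/mol")  # a 10-fold preference gives about -1.36
```

A more negative value corresponds to a larger specificity ratio in favor of the fast form, which is the sense in which the text reads the mutant data.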
The foregoing analysis is invaluable to structure-function studies and to practical issues revolving around the design of better active-site inhibitors. Much improvement in the potency of these molecules can be obtained by reducing the negative coupling among the specificity sites. This effect is obtained by keeping a rigid backbone around the P2-P3 position that facilitates the coordination with D189 at S1 and by breaking the ion pairs R221a-E146 and K224-E217. The analysis also facilitates the identification of optimal pathways for enhancing or reducing specificity. This is illustrated directly in Fig. 5, where the free energy loss in specificity for any given substrate relative to FPR is plotted vs. the number
FIG. 5. Free energy penalty, $\Delta G$ (kcal/mol), due to substitutions in the P1-P3 sites of substrate relative to FPR, [000], plotted vs. the number of perturbed P sites (0 to 3), for wild-type thrombin in the slow and fast forms. Values were computed from the free energies listed in Table IV. Vector labels refer to the P1, P2, and P3 sites in lexicographic order, with 0 = unperturbed and 1 = perturbed. Optimal pathways for enhancing specificity can be identified directly from the plot (see the text).
of substitutions made at the P sites. Starting from FPR, the optimal pathway to reduce specificity in the fast form is by replacing Pro with Gly at P2 and then Arg with Lys at P1, followed by the Phe → Val substitution at P3. In the slow form, however, the first two steps must be inverted because the Arg → Lys replacement at P1 is more deleterious. The optimal pathway to increase specificity in the fast form starting from VGK is to replace Gly with Pro at P2, followed by the Lys → Arg substitution at P1. The first step by far dominates the gain in specificity and contributes 3.7 kcal/mol to stabilization of the transition state. In the slow form, on the other hand, the first step should be the Lys → Arg substitution at P1 resulting in 3.2 kcal/mol gain in specificity, followed by the Gly → Pro replacement at P2. The Val → Phe substitution at P3 results in a small loss of specificity in both forms. These results set three basic guidelines for the improvement of active-site inhibitors of thrombin (Vindigni et al., 1997a): the P2-P3 bond must be rigid (Pro at P2), especially when targeting the fast form; coupling with S1 must be strong (Arg at P1), especially when targeting the slow form; and P3 must carry hydrophobic residues.
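The pathway argument can be reproduced with a short sketch. It assumes that an "optimal pathway" means taking, at each step, the single substitution with the largest free energy change in the desired direction (a greedy reading of Fig. 5, not a procedure stated by the author). The free energy levels are those read from the wild-type entries of Fig. 7 with R221a and K224 unperturbed, keyed by the state of P1, P2, and P3; the names `LEVELS` and `greedy_path` are illustrative.

```python
SITES = ("P1", "P2", "P3")

# Free energy levels (kcal/mol) relative to FPR = "000" for wild-type thrombin,
# keyed by the (P1, P2, P3) state. Read from the wild-type entries of Fig. 7;
# treat them as a transcription, not as authoritative data.
LEVELS = {
    "fast": {"000": 0.0, "100": 1.4, "010": 2.3, "001": -0.1,
             "110": 4.9, "101": 2.2, "011": 3.3, "111": 5.8},
    "slow": {"000": 0.0, "100": 1.3, "010": 0.7, "001": -0.5,
             "110": 4.2, "101": 2.0, "011": 1.7, "111": 4.9},
}


def greedy_path(levels, start, goal_bit):
    """Flip one site at a time toward goal_bit, largest favorable step first.

    goal_bit "1" perturbs sites (reducing specificity, levels go up);
    goal_bit "0" restores sites (increasing specificity, levels go down).
    """
    sign = 1.0 if goal_bit == "1" else -1.0
    state, path = start, []
    while any(bit != goal_bit for bit in state):
        best = None
        for i, bit in enumerate(state):
            if bit == goal_bit:
                continue
            nxt = state[:i] + goal_bit + state[i + 1:]
            change = sign * (levels[nxt] - levels[state])
            if best is None or change > best[0]:
                best = (change, SITES[i], nxt)
        path.append(best[1])
        state = best[2]
    return path


for form in ("fast", "slow"):
    print(form, "reduce from FPR:", greedy_path(LEVELS[form], "000", "1"))
    print(form, "increase from VGK:", greedy_path(LEVELS[form], "111", "0"))
```

Under these assumptions the fast form gives the order P2, P1, P3 in both directions, whereas the slow form gives P1, P2, P3, in line with the inversion of the first two steps described in the text.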
F. Allosteric Mechanism for High-Order Coupling

The coupling pattern that emerged from the analysis of the substrate library is conducive to negatively cooperative interactions higher than second order. The development of novel allosteric schemes that merge the basic tenets of nearest neighbor and concerted allosteric models becomes necessary to account for these interactions. When this is done, the number of independent parameters to be resolved experimentally increases rapidly and may exceed the information provided by the data. A possible model would require the assumption that the slow and fast forms of wild-type thrombin in the transition state exist in two states in equilibrium, A and B, each containing second-order nearest neighbor interactions among the P1-P3 sites that change in a concerted fashion upon the allosteric transition from A to B. This model has seven independent parameters, three second-order nearest neighbor interaction constants for each state plus a constant describing the equilibrium between the states. However, the number of independent coupling free energies derived experimentally is four (Table VII), thereby making the task of resolving the seven independent parameters for this hybrid model impossible. One way to overcome this problem is to include other variables to increase the dimensionality of the system and generate enough constraints from experimental data. This is done by modeling the slow and
fast forms each as a five-dimensional manifold containing the sites P1, P2, P3, R221a, and K224 in order. Each configuration in the manifold is mapped into a five-dimensional vector of binary digits, 0 (unperturbed) and 1 (perturbed), to which a given value of the specificity constant is associated from Table IV. All relevant free energies are calculated by operating on these values. The results are summarized in Tables VIII and IX. Inclusion of R221a and K224 in the set of sites increases the number of configurations to be analyzed when a perturbation is applied to a given site, say, P1. The same applies to the number of configurations of other sites when the coupling between any two sites is considered. This analysis yields complete information on the coupling between perturbations in the substrate and the enzyme, within the enzyme, and within the substrate. A graphical illustration of the site-specific transition modes in Table VIII linked to perturbations in the five-dimensional system defined by P1, P2, P3, R221a, and K224 is given in Fig. 6a-e. The plots encapsulate the essential features of the results given above in Section V,D and especially the presence of strong coupling revealed by the significant standard deviation of the distributions. The distributions also point out the serious limitation of an approach where the energetic balance of the perturbation at any site is estimated from a single value obtained in the absence of perturbations at other sites (configuration 1 in the abscissas of Fig. 6a-e). The discrepancy is particularly serious in the case of perturbations made in the substrate. In the case of P2 in the slow form, there is a difference of 1.8 kcal/mol between the value estimated from the specificity of FPR and FGR compared to the mean of the entire distribution. Broader distributions for perturbations in the slow form underscore the stronger coupling in this allosteric state.
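The gap between a single-background estimate and the ensemble average can be checked with a few lines. The sketch below uses the 16 slow-form values for the perturbation at P2 as they appear in the P2 column of Table VIII; the list name `P2_SLOW` and the helper `summarize` are illustrative, and the sample standard deviation is an assumed choice since the text does not say which estimator was used.

```python
from statistics import mean, stdev

# Slow-form free energy changes (kcal/mol) for the perturbation at P2 in the
# 16 configurations of the other sites (P2 column of Table VIII, slow form).
# The first entry is the value with all other sites unperturbed (FPR vs FGR).
P2_SLOW = [0.7, 2.9, 2.2, 2.2, 2.2, 2.9, 2.8, 2.6,
           2.8, 2.9, 2.9, 2.5, 2.3, 2.3, 2.8, 2.0]


def summarize(values):
    """Compare the unperturbed-background value with the ensemble average."""
    reference = values[0]
    average = mean(values)
    return {
        "reference": reference,
        "mean": average,
        "sd": stdev(values),  # sample standard deviation (assumed estimator)
        "gap": average - reference,
    }


summary = summarize(P2_SLOW)
print({key: round(value, 2) for key, value in summary.items()})
```

With the tabulated one-decimal values the mean is close to 2.4 kcal/mol and the gap close to 1.7 kcal/mol, near the 2.5 and 1.8 kcal/mol quoted in the text; the small difference plausibly reflects rounding of the tabulated entries.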
Changes in the profiles between the slow and fast forms are particularly evident for P2 and K224, and to a lesser extent for P1 and R221a, documenting the different susceptibility to perturbation of specific structural domains in the two forms. Analysis of the coupling pattern involving all possible pairs (Table IX) shows how interactions change with the state of other sites. When only differences of at least ±RT (0.6 kcal/mol) in the coupling free energy are considered, the patterns can be analyzed to identify the nature of the interaction. In the fast form only the P1-P3 sites are significantly coupled and in an indirect way. Perturbation of any P site influences the coupling at other sites. In the slow form all sites are strongly coupled. Each coupling can be dissected to identify the element perturbing the interaction. A direct way to illustrate the effect of a third site on the coupling between two sites is to calculate the difference in
TABLE VIII
Free Energy Change (in kcal/mol) in Specificity due to Perturbation of the P1-P3 Sites of the Substrate or Residues R221a and K224 of Thrombinᵃ

Fast form
Other sites   P1    P2    P3    R221a   K224
0000          1.4   2.3   -0.1  0.1     0.4
1000          2.7   3.5   0.8   0.3     0.0
0100          2.3   3.4   1.0   0.6     0.4
0010          1.7   2.8   0.6   0.8
0001          1.0   2.3   0.4   0.3     0.7
1100          2.5   3.6   0.9   0.4     -0.1
1010          2.5   3.6   0.9   0.5     0.2
1001          2.1   3.3   1.0   0.5     0.2
0110          2.1   3.3   1.0   0.5     0.4
0101          1.7   2.9   1.0   0.6     0.5
0011          1.2   2.6   0.4   0.4     0.6
1110          2.4   3.5   0.9   0.4     0.0
1101          2.2   3.4   1.1   0.5     0.0
1011          2.0   3.4   0.9   0.4     0.2
0111          1.7   3.3   1.1   0.7     0.6
1111          2.0   3.6   1.1   0.6     0.2

Slow form
Other sites   P1    P2    P3    R221a   K224
0000          1.3   0.7   -0.5  0.4     1.1
1000          3.4   2.9   0.7   1.3     1.4
0100          2.4   2.2   1.0   1.8     2.5
0010          2.2   2.2   0.3   1.1     1.9
0001          1.6   2.2   0.3   0.2     0.9
1100          3.2   2.9   0.7   1.1     1.1
1010          2.8   2.8   0.8   1.4     1.7
1001          2.0   2.6   1.0   0.7     0.8
0110          2.7   2.8   0.9   1.8     2.6
0101          2.2   2.9   1.1   0.9     1.7
0011          2.1   2.9   0.9   0.8     1.5
1110          2.5   2.5   0.6   1.1     1.1
1101          1.6   2.3   0.6   0.4     0.3
1011          1.5   2.3   1.0
0111          2.2   2.8   0.8   0.7     1.5
1111          1.4   2.0   0.7   0.4     0.4
0.5
0.7 1.0

ᵃListed are all possible configurations of the other sites (0 = unperturbed; 1 = perturbed) in the order P1, P2, P3, R221a, K224. Errors are ±0.1 kcal/mol or less.
TABLE IX
Coupling Free Energy Values (in kcal/mol) for Perturbation of the P1-P3 Sites of the Substrate and Residues R221a and K224 of Thrombinᵃ
000
Fast form
P1-P2 P1-P3 P1-R221a P1-K224 P2-P3 P2-R221a P2-K224 P3-R221a P3-K224 R221a-K224
Slow form
1.3 0.8
0.2 -0.4 1.1 0.5
0.0 0.2 0.4 0.2
100
010
0.2 -0.2 -0.2 -0.6 0.1 0.1 -0.2 0.1 0.2 0.2
0.8 0.5
-0.1 -0.6 0.5
-0.1 -0.4 -0.1 -0.0 0.0
00 1
110
101
01 1
1.1 0.6 0.2 -0.4 0.6 0.3 -0.2 0.0 -0.1 -0.2
0.3 -0.1 -0.1 -0.4
0.5
0.8
0.1 -0.1
0.5
-0.0
-0.1 -0.2 0.0 0.2 0.1
-0.5
0.1 0.0 -0.2 -0.1 -0.0 -0.0
111
Couplingb
0.3
0.0 -0.4 0.7 0.4 0.0 0.1 0.1 0.2
-0.2 -0.4 0.2 0.1 0.0 0.0 0.2 0.2
Indirect Indirect None None Indirect None None None None None
-0.6 0.1 -0.0
-0.9 -0.1 -0.3 -1.1 -0.3 -0.3 -0.6 0.1 0.1 -0.6
Indirect Indirect Indirect Indirect Indirect Indirect Indirect Indirect Indirect Indirect
-0.0
Mediated by
P3 P2
P3, R221a, K224 P2 P2 P2 P1, R221a, K224 P1, P3, K224 P1, P3, R221a P2 P2 P2
P1
i
P1-P2 P1-P3 P1-R221a P1-K224 P2-P3 P2-R221a P2-K224 P3-R221a P3-K224 R221a-K224
2.2 1.2 0.9 0.3 1.4 1.4 1.4
0.7 0.8 -0.2
0.7 -0.2 -0.6 -1.4 0.0 -0.1 -0.3 0.1 0.3 -0.6
0.6 0.6 0.3 -0.2 0.7 0.6 0.7 -0.0 0.1 -0.9
0.5
0.7 0.5 -0.1 0.7 0.7 0.7 0.6 0.6 -0.4
-0.3 -0.3 -0.7 -1.6 -0.2 -0.4 -0.6 -0.1 -0.0 -0.8
-0.6 -0.4
-0.6 -1.3 -0.3 -0.4 -0.5
0.0
0.2 -0.7
-0.5
-0.1 -0.1 0.0 -0.2 -0.1 -1.1
ᵃListed are all possible configurations of the other sites (0 = unperturbed; 1 = perturbed) in the order P1, P2, P3, R221a, K224. Errors are ±0.1 kcal/mol or less. ᵇIndirect coupling requires values in the distribution that differ by at least ±RT (0.6 kcal/mol). Direct coupling of less than ±RT on the average is considered zero.
FIG. 6. Site-specific perturbation of (a) P1, (b) P2, (c) P3, (d) R221a, and (e) K224, as indicated, in all possible configurations of the other sites (abscissa: configuration, 1-16). The 16 configurations refer sequentially to those listed in Table VIII. Data are for the slow (●) and fast (○) forms. The average value of the perturbation is depicted by a continuous (fast form) or broken (slow form) line. Note how the value obtained in the absence of perturbations at other sites (configuration 1 in the abscissa) usually does not coincide with the average value derived from the ensemble of states of the other sites. This reflects the intrinsic coupling among the sites. The mean and standard deviations of each distribution are (in kcal/mol) as follows: (P1) 2.0 ± 0.5 (○), 2.2 ± 0.6 (●); (P2) 3.2 ± 0.4 (○), 2.5 ± 0.5 (●); (P3) 0.8 ± 0.3 (○), 0.7 ± 0.4 (●); (R221a) 0.5 ± 0.2 (○), 0.9 ± 0.5 (●); (K224) 0.3 ± 0.3 (○), 1.4 ± 0.6 (●).
coupling free energy of a pair due to the 0 → 1 transition of a third site, in all possible configurations of the remaining sites. These calculations are summarized in Table X, where differences of at least ±RT are highlighted. The P2 site emerges as a major node of interaction. In the slow form, the state of P2 influences all interactions (Table IX). The state of R221a and K224 influences the P1-P2 and P2-P3 interactions but has no effect on the P1-P3 coupling that is influenced by P2. Finally, the coupling between R221a and K224 is influenced by P2 only. As a result, the Ala replacements at these thrombin residues produce additive effects on specificity when Pro is at P2 but are positively linked when Pro is replaced by Gly.
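The calculation described above can be sketched for the three substrate sites of wild-type thrombin. The example assumes the double-mutant cycle convention in which the coupling free energy of sites i and j is the level with both perturbed, minus the two singly perturbed levels, plus the reference level, and it uses the free energy levels read from the wild-type entries of Fig. 7 (R221a and K224 unperturbed). The function names are illustrative.

```python
# Free energy levels (kcal/mol) relative to FPR, keyed by the (P1, P2, P3)
# state, as read from the wild-type entries of Fig. 7.
LEVELS = {
    "fast": {"000": 0.0, "100": 1.4, "010": 2.3, "001": -0.1,
             "110": 4.9, "101": 2.2, "011": 3.3, "111": 5.8},
    "slow": {"000": 0.0, "100": 1.3, "010": 0.7, "001": -0.5,
             "110": 4.2, "101": 2.0, "011": 1.7, "111": 4.9},
}


def coupling(levels, i, j, third_bit):
    """Double-mutant cycle for sites i and j (0-based) at a fixed third site."""
    k = 3 - i - j  # index of the remaining site among 0, 1, 2

    def level(bit_i, bit_j):
        bits = [0, 0, 0]
        bits[i], bits[j], bits[k] = bit_i, bit_j, third_bit
        return levels["".join(str(b) for b in bits)]

    return level(1, 1) - level(1, 0) - level(0, 1) + level(0, 0)


def third_site_effect(levels, i, j):
    """Coupling with the third site unperturbed minus coupling with it perturbed."""
    return coupling(levels, i, j, 0) - coupling(levels, i, j, 1)


for form in ("fast", "slow"):
    c0 = coupling(LEVELS[form], 0, 1, 0)
    c1 = coupling(LEVELS[form], 0, 1, 1)
    print(f"{form}: P1-P2 coupling {c0:.1f} (P3 = 0), {c1:.1f} (P3 = 1), "
          f"effect of P3 {third_site_effect(LEVELS[form], 0, 1):.1f}")
```

For the P1-P2 pair this gives 1.2 and 0.2 kcal/mol in the fast form and 2.2 and 0.7 kcal/mol in the slow form, so that perturbing P3 changes the coupling by about 1.0 and 1.5 kcal/mol, respectively. These agree with the corresponding entries of Tables IX and X to within the rounding of the tabulated levels.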
FIG. 6 (continued). Panels: (b) Perturbation at P2; (c) Perturbation at P3; (d) Perturbation at R221a; (e) Perturbation at K224. Abscissa: configuration (1-16).
TABLE X
Effect of Other Sites on the Coupling Free Energy Values (in kcal/mol) for Perturbation of the P1-P3 Sites of the Substrate and Residues R221a and K224 of Thrombinᵃ
000-100
Fast form: P1-P2 P1-P3 P1-R221a P1-K224 P2-P3 P2-R221a P2-K224 P3-R221a P3-K224 R221a-K224
1.1 1.0 0.4 0.2 1.0 0.4 0.2 0.1 0.2 0.0
Slow form: P1-P2 P1-P3 P1-R221a P1-K224 P2-P3 P2-R221a P2-K224 P3-R221a P3-K224 R221a-K224
1.5 1.4 1.5 1.7 1.4 1.5 1.7 0.6 0.5 0.4
010-110 0.5 0.6
0.0 -0.2 0.5
0.0 -0.2 -0.1 -0.2 -0.1
0.9 0.9 1.0 1.4 0.9 1.0 1.3 0.1 0.1 -0.1
001-101 0.6 0.5 0.3 0.1 0.5 0.3 0.0 0.1 -0.1 -0.2
1.1 1.1 1.1
1.4 1.0 1.1 1.4 0.6 0.4 0.3
011-111
0.5 0.5
0.2 0.0 0.5 0.3 0.0 0.1 -0.1 0.0
0.3
0.2 0.3 0.6 0.2 0.2 0.6 -0.3 -0.2 -0.5
000-010 100-110 001-011 101-111 000-001 100-101 010-011 110-111
0.5 0.3 0.3 0.2 0.6 0.6 0.4 0.3 0.4 0.2
-0.1 -0.1 -0.1 -0.2 0.1 0.2 0.0 0.1 0.0 0.1
0.3 0.1 0.2 0.0 -0.1 -0.1 -0.2 -0.1 -0.2 -0.4
0.2 0.1 0.1 -0.1 -0.1 -0.1 -0.2 -0.1 -0.2 -0.2
0.2 0.2 0.0 0.0 0.5 0.2 0.2 0.2 0.5 0.4
-0.3 -0.3 -0.1 -0.1 0.0 0.1 0.0 0.2 0.2 0.2
0.0 0.0 -0.1 -0.2 -0.2 -0.5 -0.4 -0.2 -0.1 -0.2
-0.1 0.1 0.0 -0.2 -0.2 -0.2 0.0 0.0 -0.1
1.6 0.6 0.6 0.5 0.7
1.0 0.1 0.1 0.2 0.2 0.3 0.3 0.2 0.3 0.2
1.1 0.6 0.5 0.4 0.8 0.8 0.7 0.8 0.7 0.7
0.3 -0.3 -0.3 -0.2 0.0 -0.1 0.1 -0.1 0.1 -0.1
1.7 0.5 0.4 0.4 0.7 0.7 0.7 0.1 0.2 0.2
1.3 0.2 0.0 -0.1 0.3 0.3 0.2 0.1 0.1 0.1
0.8
0.7 0.7 0.7 0.7
1.2
0.5 0.3 0.3 0.8 0.7 0.7 0.2 0.2 0.2
0.0
0.6 -0.2 -0.4 -0.5 0.1 -0.1 0.0 -0.2 -0.1
-0.2
ᵃListed is the difference in coupling free energy due to a change in the state of a third site, in the four possible configurations of the other two sites (0 = unperturbed; 1 = perturbed) in the order P1, P2, P3, R221a, K224. For example, the value of 1.1 kcal/mol for the P1-P2 coupling in the fast form is the difference between the coupling free energy with P3, R221a, and K224 unperturbed (000) and that with P3 perturbed (100). Errors are ±0.1 kcal/mol or less. Values exceeding ±RT (0.6 kcal/mol) are set in boldface.
The intricacy of the coupling patterns for the 10 possible pairs of sites can be resolved by using an allosteric scheme belonging to a class of general models detailed elsewhere (Di Cera, 1995). The minimal model enabling us to describe the data in Tables VIII-X is to assume that the enzyme-substrate complex in the transition state can exist in two states, A and B, with state A being more stable in the absence of perturbations (i.e., in the [00000] configuration of the P1, P2, P3, R221a, and K224 sites) and state B becoming more stable as perturbations are introduced in the system. These states may reflect alternative binding modes of the substrate and/or alternative conformations of the enzyme. They are not to be confused with the slow and fast forms. In such a model, the perturbations introduced in the system are coupled through a concerted allosteric equilibrium between two states with the sites experiencing nearest neighbor interactions in each state. Because A and B have different coupling interactions for the possible site pairs, the apparent site-site coupling will change as more perturbations are introduced in the system and the transition from A to B takes place. A unique solution can be found by casting the coupling free energies measured experimentally in terms of the model parameters. For a system existing in two states, each of which has 10 possible site pairs, there is a total of 31 independent parameters that describe the perturbations. For each state there are 5 site-specific perturbation free energies (one for each site) and 10 possible pairwise interaction constants, leading to a total of 30 parameters. The free energy $\Delta G_L$ defining the stability of the A state relative to B in the absence of perturbations completes the set. The independent constraints provided by experimental measurements for a five-dimensional system are 31, and a unique solution of the problem is therefore possible.
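The bookkeeping behind this argument can be made explicit. The sketch below counts parameters and constraints for a two-state model with pairwise interactions over N binary sites, assuming that the experimental constraints are the $2^N - 1$ free energy levels relative to the reference configuration, of which N are the site-specific terms themselves. The function names are illustrative and do not come from the source.

```python
from math import comb


def full_parameters(n_sites):
    """Site terms and pairwise terms in each of two states, plus dG_L."""
    return 2 * (n_sites + comb(n_sites, 2)) + 1


def shared_site_parameters(n_sites):
    """Same model when the site-specific terms are identical in A and B."""
    return n_sites + 2 * comb(n_sites, 2) + 1


def coupling_parameters(n_sites):
    """Pairwise constants in A and B plus dG_L (site terms taken as known)."""
    return 2 * comb(n_sites, 2) + 1


def independent_levels(n_sites):
    """Free energy levels relative to the all-unperturbed reference."""
    return 2 ** n_sites - 1


def coupling_constraints(n_sites):
    """Levels left after the single-site perturbations are accounted for."""
    return independent_levels(n_sites) - n_sites


for n in (3, 5):
    print(f"N = {n}: {coupling_parameters(n)} coupling parameters vs "
          f"{coupling_constraints(n)} coupling constraints; "
          f"{full_parameters(n)} total parameters vs "
          f"{independent_levels(n)} levels")
```

For N = 3 the count is seven coupling parameters against four constraints, which is the underdetermined case discussed for the substrate sites alone. For N = 5 the 31 parameters are matched by 31 levels, and sharing the site-specific terms between the two states lowers the parameter count to 26.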
Analysis of the data in terms of this allosteric picture can be simplified by noting that the site-specific perturbation free energies need not be different in the two states and can be set equal to the values determined experimentally. This reduces the number of independent parameters to 26. To evaluate all other terms, the free energy of any intermediate in the manifold must be evaluated. This is done by expressing the specificity of any intermediate relative to that of the reference species, [00000], which is FPR interacting with wild-type thrombin, and converting this ratio into kcal/mol using the expression $\Delta G_{\alpha\beta\gamma\delta\epsilon} = -RT \ln(s_{\alpha\beta\gamma\delta\epsilon}/s_{00000})$, where $\alpha, \beta, \gamma, \delta, \epsilon = 0, 1$ label the state of the five sites in the system. The values of these free energy levels are given in Fig. 7. For each case, the relevant expressions in terms of the model parameters are evaluated. The calculated free energy is given by the sum of the site-specific perturbation free energies, which are the same in states A and
[00000]
0.0, 0.0 (0.0, 0.0)
[10000] [01000] [00100] [00010] [00001]
1.4, 1.3 (1.4, 1.3); 2.3, 0.7 (2.3, 0.7); -0.1, -0.5 (-0.1, -0.5); 0.1, 0.4 (0.1, 0.4); 0.4, 1.1 (0.4, 1.1)
[11000] [10100] [10010] [10001] [01100] [01010] [01001] [00110] [00101] [00011]
4.9, 4.2 (4.9, 4.2); 2.2, 2.0 (2.2, 2.0); 1.5, 2.6 (1.5, 2.6); 3.3, 1.7 (3.3, 1.7); 2.8, 2.5 (2.8, 2.5); 2.7, 3.3 (2.7, 3.3); 0.5, 0.6 (0.5, 0.6); 0.8, 1.4 (0.8, 1.4); 0.7, 1.3 (0.7, 1.3)
[11100] [11010] [11001] [10110] [10101] [10011] [01110] [01101] [01011] [00111]
5.8, 4.9 (6.1, 4.8); 5.3, 5.5 (5.3, 5.5); 2.7, 3.6 (2.8, 3.5); 3.8, 3.5 (3.8, 3.5); 3.7, 4.3 (3.4, 4.2); 3.3, 4.2 (3.e4.1); 1.1, 2.3 (1.2, 2.3); 1.8, 2.6 (1.8, 2.6); 4.8, 5.5 (4.8, 5.5); 2.5, 3.7 (2.4, 3.6); 2.4, 3.3 (2.4, 3.3)
[11110] [11101] [11011] [10111] [01111]
6.2, 5.8 (6.5, 5.7); 5.4, 5.5 (5.4, 5.4); 2.9, 4.5 (3.0, 4.4); 4.4, 5.3 (4.2, 5.3); 5.9, 5.9 (5.8, 5.7)
[11111]
6.4, 6.5 (6.4, 6.3)
FIG. 7. Manifold of intermediates for the five-dimensional system composed of sites P1, P2, P3, R221a, and K224 in order. Listed are the free energy levels associated with each configuration relative to the [00000] species for the fast (upper left value) and slow (upper right value) forms. The values predicted by the allosteric model (Fig. 8) are given in parentheses for the fast (lower left value) and slow (lower right value) forms. These values were calculated from the free energy terms of the model given in Fig. 8. For example, the calculated free energy level where sites 1 and 4 are perturbed, $\Delta G_{10010}$ (i.e., FPK interacting with the R221aA mutant), depends on the site-specific perturbations of sites 1 and 4, $\Delta G_1 = \Delta G_{10000}$ and $\Delta G_4 = \Delta G_{00010}$, and on their pairwise interaction free energy in the two states, so that

$$\Delta G_{10010} = \Delta G_1 + \Delta G_4 - RT \ln \left\{ \frac{\exp\left[-\left(\Delta G_L + {}^{A}\Delta G_{14}\right)/RT\right] + \exp\left(-{}^{B}\Delta G_{14}/RT\right)}{\exp\left(-\Delta G_L/RT\right) + 1} \right\}$$
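The best fit values of the model parameters in the fast form are (in kcal/mol) as follows: $\Delta G_L = -0.5$; ${}^{A}\Delta G_1 = {}^{B}\Delta G_1 = 1.4$; ${}^{A}\Delta G_2 = {}^{B}\Delta G_2 = 2.3$; ${}^{A}\Delta G_3 = {}^{B}\Delta G_3 = -0.1$; ${}^{A}\Delta G_4 = {}^{B}\Delta G_4 = 0.1$; ${}^{A}\Delta G_5 = {}^{B}\Delta G_5 = 0.4$; ${}^{A}\Delta G_{12} = 3.8$; ${}^{B}\Delta G_{12} = 0.5$; ${}^{A}\Delta G_{13} = 1.2$; ${}^{B}\Delta G_{13} = 0.4$; ${}^{A}\Delta G_{14} = {}^{B}\Delta G_{14} = 0.3$; ${}^{A}\Delta G_{15} = {}^{B}\Delta G_{15} = -0.4$; ${}^{A}\Delta G_{23} = 1.3$; ${}^{B}\Delta G_{23} = 0.8$; ${}^{A}\Delta G_{24} = 0.9$; ${}^{B}\Delta G_{24} = 0.1$; ${}^{A}\Delta G_{25} = 0.1$; ${}^{B}\Delta G_{25} = -0.2$; ${}^{A}\Delta G_{34} = 1.1$; ${}^{B}\Delta G_{34} = 0.0$; ${}^{A}\Delta G_{35} = 1.0$; ${}^{B}\Delta G_{35} = 0.1$; ${}^{A}\Delta G_{45} = {}^{B}\Delta G_{45} = 0.2$. The best fit values of the model parameters in the slow form are (in kcal/mol) as follows: $\Delta G_L = -2.2$; ${}^{A}\Delta G_1 = {}^{B}\Delta G_1 = 1.3$; ${}^{A}\Delta G_2 = {}^{B}\Delta G_2 = 0.7$; ${}^{A}\Delta G_3 = {}^{B}\Delta G_3 = -0.5$; ${}^{A}\Delta G_4 = {}^{B}\Delta G_4 = 0.4$; ${}^{A}\Delta G_5 = {}^{B}\Delta G_5 = 1.1$; ${}^{A}\Delta G_{12} = 2.2$; ${}^{B}\Delta G_{12} = 1.2$; ${}^{A}\Delta G_{13} = 1.3$; ${}^{B}\Delta G_{13} = 0.0$; ${}^{A}\Delta G_{14} = 1.0$; ${}^{B}\Delta G_{14} = -0.3$; ${}^{A}\Delta G_{15} = 0.4$; ${}^{B}\Delta G_{15} = -1.0$; ${}^{A}\Delta G_{23} = 1.6$; ${}^{B}\Delta G_{23} = 0.0$; ${}^{A}\Delta G_{24} = 1.6$; ${}^{B}\Delta G_{24} = 0.0$; ${}^{A}\Delta G_{25} = 1.6$; ${}^{B}\Delta G_{25} = 0.0$; ${}^{A}\Delta G_{34} = {}^{B}\Delta G_{34} = 0.8$; ${}^{A}\Delta G_{35} = {}^{B}\Delta G_{35} = 0.8$; ${}^{A}\Delta G_{45} = {}^{B}\Delta G_{45} = -0.2$. All errors are ±0.1 kcal/mol or less.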
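The two-state expression in the legend can be evaluated numerically as a consistency check. The sketch below assumes that state A carries the statistical weight exp(-ΔG_L/RT) relative to state B, that pairwise terms add within each state when more than two sites are perturbed (an extrapolation of the two-site example), and that RT = 0.6 kcal/mol as rounded in the text. Only the P1 and P2 parameters quoted above are used; the function name `model_level` is illustrative.

```python
import math
from itertools import combinations

RT = 0.6  # kcal/mol, the rounded value quoted in the text


def model_level(config, site_dg, pair_a, pair_b, dg_l, rt=RT):
    """Free energy level of a configuration in the two-state (A/B) model.

    config  : string of 0/1 digits for sites 1..N (1 = perturbed)
    site_dg : {site: dG_i}, taken to be identical in states A and B
    pair_a  : {(i, j): dG_ij in state A}, with i < j
    pair_b  : {(i, j): dG_ij in state B}, with i < j
    dg_l    : stability of A relative to B without perturbations
    """
    perturbed = [i for i, bit in enumerate(config, start=1) if bit == "1"]
    additive = sum(site_dg[i] for i in perturbed)
    # Assumption: pairwise terms are summed over all perturbed pairs.
    sum_a = sum(pair_a.get(p, 0.0) for p in combinations(perturbed, 2))
    sum_b = sum(pair_b.get(p, 0.0) for p in combinations(perturbed, 2))
    weight_a = math.exp(-dg_l / rt)
    numerator = weight_a * math.exp(-sum_a / rt) + math.exp(-sum_b / rt)
    return additive - rt * math.log(numerator / (weight_a + 1.0))


# P1 and P2 parameters from the best fit values listed in the legend.
fast = model_level("11000", {1: 1.4, 2: 2.3}, {(1, 2): 3.8}, {(1, 2): 0.5}, -0.5)
slow = model_level("11000", {1: 1.3, 2: 0.7}, {(1, 2): 2.2}, {(1, 2): 1.2}, -2.2)
print(f"[11000] fast: {fast:.2f} kcal/mol, slow: {slow:.2f} kcal/mol")
```

Under these assumptions the [11000] level comes out near 4.9 kcal/mol in the fast form and near 4.1 kcal/mol in the slow form, against the experimental values of 4.9 and 4.2 kcal/mol, which is within the rounding of the one-decimal parameters.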
B, plus a term that contains the contribution of the allosteric switching and the pairwise interaction free energies involving the sites in each state. The model reproduces the experimentally determined values within ±0.3 kcal/mol. A diagram depicting the relevant nearest neighbor interactions in states A and B for the slow and fast forms of thrombin is sketched in Fig. 8. The properties of the A and B states recapitulate the information gathered from the model-independent analysis already presented in Section V,D. The strength of nearest neighbor interactions decreases in state B and practically vanishes for a number of pairs. Overall, interactions are more pronounced in the slow form with strong coupling involving the P sites. Coupling is also strong between these sites and the
FIG. 8. Schematic representation of the nearest neighbor interactions in states A and B for the slow and fast forms of thrombin. Interacting sites are connected and the pairwise interaction energy is given. The values of the model parameters are listed in the legend to Fig. 7.
thrombin perturbations at R221a and K224, whereas no significant coupling is observed between R221a and K224. Upon switching to the B state in the slow form there is a significant reduction of coupling and the main interactions involve P1 with P2 and K224, and P3 with R221a and K224. This state is considerably less populated than the A state in the absence of perturbations. In the fast form, the properties of the A state echo those seen for the slow form, except for weaker coupling between P1 and R221a and K224. However, the B state is almost as stable as the A state in this form, and the main interactions involve the P2-P3 pair. The A and B states may represent different binding modes of the substrate in the transition state or different substates available to the enzyme-substrate complex in the slow and fast forms. Since most of the interactions are routed through the P sites, it may be expected that A and B refer to distinct modes for substrate binding. One binding mode, the A state, appears to be more constrained, with stronger negative coupling involving residues at the P sites. This mode of binding also couples strongly with the state of the ion pairs involving R221a and K224. The coupling is particularly evident in the slow form. In the fast form, there is a peculiar weakening of the coupling of P1 with the thrombin residues R221a and K224, indicating that the strength of the ion pairs R221a-E146 and K224-E217 influences the environment of the S1 site. The higher flexibility of the fast form is also indicated by the lower energy barrier (0.5 kcal/mol) to switching from the A to the B state where most of the interactions disappear. The B state may well reflect a less constrained conformation of the substrate bound in the transition state, consistent with the stabilization observed upon the Pro → Gly substitution at P2.
VI. CONCLUDING REMARKS

Wyman's theory of linked functions (Wyman, 1948, 1964) provides the conceptual framework needed for a rigorous, model-independent analysis of mutational effects in proteins. Once this analysis is formulated in terms of the principles of site-specific thermodynamics (Di Cera, 1995), a hierarchy of effects can be unraveled from experimental data. Each site subject to structural perturbation can be treated as a unit switching between two states, 0 = wild type and 1 = perturbed, and the energetics of the system can be mapped into a manifold of N sites coupled through high-order interactions. Coupling among the sites leading to nonadditivity is the dominant feature emerging from mutational studies of protein stability and ligand recognition in a variety of systems. Basic
properties of the coupling free energy of a double-mutant cycle enable the identification of the precise mechanism of coupling from a model-independent analysis of the data. This in turn guides the development of ad hoc allosteric models that encapsulate the relevant features of the system. The applicability of the approach presented here is only limited by the availability of high-order perturbations in the system. When one is dealing with a protein, triple and quadruple mutations are necessary to fully exploit the virtues of this analysis, which may represent a serious drawback in many systems of interest. However, at least in the case of ligand recognition, the approach is feasible insofar as the high-dimensional manifold can be constructed with perturbations in the ligand. Short peptides can be synthesized easily to construct substrate libraries carrying substitutions such as those used to dissect thrombin specificity. These libraries can then be combined with perturbations in the enzyme to enhance the dimensionality of the system. Higher dimensions in the system can be introduced by studying the effect of linked ligands on the coupling among site-directed mutations. The energetics of the library of substrates used for wild-type thrombin has recently been characterized in the presence of thrombomodulin (Vindigni et al., 1997b). Each variable introduced in the system, subject to the requirement of the double-mutant cycle in Eq. (4) that the site be in two possible states, increases the dimensionality of the manifold of configurations by one. Hence, significantly complex systems can be constructed by suitable combinations of ligand binding and mutational events involving the protein and/or the ligand. The key advantage of the approach based on site-specific thermodynamics is that effects of different nature can be treated within the same conceptual framework and analytical formalism.
Another possibility for extending the dimensionality of the system is to explore the effect of different substitutions at the same site, which can test the context dependence of the interaction patterns obtained experimentally. When ligands are used as probes of site-specific properties of the protein, much information can be obtained on the nature of interactions stabilizing the complex and determining recognition. The combination of structural perturbations and the analysis presented here provides a rational strategy for the dissection of protein stability, ligand binding, and enzyme specificity. This approach brings Wyman's original idea of linked functions into the mainstream of current studies of structure-function correlations.
ACKNOWLEDGMENTS

I am grateful to all members of my laboratory who have contributed at various levels to the experimental and theoretical aspects of this work, and especially to Quoc Dang, Enriqueta Guinto, Alessandro Vindigni, and Luyu Wang. I am also indebted to Frederic Richards and James Wells for valuable comments and suggestions. This work was supported by NIH Research Grants HL49413 and HL58141, and was carried out under the tenure of an Established Investigator Award in Thrombosis from the American Heart Association and Genentech.
REFERENCES

Ackers, G. K., and Smith, F. R. (1985). Annu. Rev. Biochem. 54, 597-629.
Ayala, Y. M., and Di Cera, E. (1994). J. Mol. Biol. 235, 733-746.
Banfield, D. K., and MacGillivray, R. T. A. (1992). Proc. Natl. Acad. Sci. U.S.A. 89, 2779-2783.
Bartunik, H. D., Summers, L. J., and Bartsch, H. H. (1989). J. Mol. Biol. 210, 813-828.
Bashford, D., and Karplus, M. (1991). J. Phys. Chem. 95, 9556-9561.
Berg, D. T., Wiley, M. R., and Grinnell, B. W. (1996). Science 273, 1389-1391.
Berliner, L. J. (1992). “Thrombin: Structure and Function.” Plenum, New York.
Betz, A. J., Hofsteenge, J., and Stone, S. R. (1991). Biochem. J. 275, 801-803.
Blanquet, S., Fayet, G., Poiret, M., and Waller, J.-P. (1975). Eur. J. Biochem. 51, 567-571.
Blombäck, B., Blombäck, M., Olsson, P., Svendsen, L., and Aberg, G. (1969). Scand. J. Clin. Lab. Invest. 24, 9-66.
Blombäck, G. (1969). Br. J. Haematol. 17, 145-151.
Bode, W., Turk, D., and Karshikov, A. (1992). Protein Sci. 1, 426-471.
Brezniak, D. V., Brower, M. S., Witting, J. I., Walz, D. A., and Fenton, J. W., II (1990). Biochemistry 29, 3536-3542.
Carter, P., and Wells, J. A. (1988). Nature (London) 332, 564-568.
Carter, P. J., Winter, G., Wilkinson, A. J., and Fersht, A. R. (1984). Cell (Cambridge, Mass.) 38, 835-840.
Castro, M. J. M., and Anderson, S. (1996). Biochemistry 35, 11435-11446.
Clackson, T., and Wells, J. A. (1995). Science 267, 383-386.
Claeson, G. (1994). Blood Coagulation Fibrinol. 5, 411-436.
Cornish, V. W., and Schultz, P. G. (1994). Curr. Opin. Struct. Biol. 4, 601-607.
Craik, C. S., Largman, C., Fletcher, T., Roczniak, S., Barr, P. J., Fletterick, R. J., and Rutter, W. J. (1985). Science 228, 291-297.
Creighton, T. E. (1990). Biochem. J. 270, 1-16.
Cunningham, B. C., and Wells, J. A. (1989). Science 244, 1081-1085.
Cwirla, S. E., Peters, E. A., Barrett, R. W., and Dower, W. J. (1990). Proc. Natl. Acad. Sci. U.S.A. 87, 6378-6382.
Dang, Q. D., and Di Cera, E. (1996). Proc. Natl. Acad. Sci. U.S.A. 93, 10253-10256.
Dang, Q. D., and Di Cera, E. (1997). Blood 89, 2220-2222.
Dang, Q. D., Vindigni, A., and Di Cera, E. (1995). Proc. Natl. Acad. Sci. U.S.A. 92, 5977-5981.
Dang, Q. D., Guinto, E. R., and Di Cera, E. (1997a). Nat. Biotechnol. 15, 146-149.
Dang, Q. D., Sabetta, M., and Di Cera, E. (1997b). J. Biol. Chem. 272, 19649-19651.
Davie, E. W., Fujikawa, K., and Kisiel, W. (1991). Biochemistry 30, 10363-10370.
De Filippis, V., Vindigni, A., Altichieri, L., and Fontana, A. (1995). Biochemistry 34, 9552-9564.
Devlin, J. J., Panganiban, L. C., and Devlin, P. E. (1990). Science 249, 404-406.
Di Cera, E. (1995). “Thermodynamic Theory of Site-Specific Binding Processes in Biological Macromolecules.” Cambridge University Press, Cambridge, UK.
Di Cera, E., Guinto, E. R., Vindigni, A., Dang, Q. D., Ayala, Y. M., Wuyi, M., and Tulinsky, A. (1995). J. Biol. Chem. 270, 22089-22092.
Di Cera, E., Dang, Q. D., and Ayala, Y. M. (1997). Cell. Mol. Life Sci. 53, 701-730.
Dickinson, C. D., Kelly, C. R., and Ruf, W. (1996). Proc. Natl. Acad. Sci. U.S.A. 93, 14379-14384.
Dill, K. A. (1990). Biochemistry 29, 7133-7155.
Ding, L., Coombs, G. S., Strandberg, L., Navre, M., Corey, D. R., and Madison, E. L. (1995). Proc. Natl. Acad. Sci. U.S.A. 92, 7627-7631.
Doolittle, R. F., and Feng, D. F. (1987). Cold Spring Harbor Symp. Quant. Biol. 52, 869-874.
Eaton, D., Rodriguez, H., and Vehar, G. A. (1986). Biochemistry 25, 505-514.
Edsall, J. T., and Blanchard, M. H. (1933). J. Am. Chem. Soc. 55, 2337-2353.
Edsall, J. T., and Wyman, J. (1958). “Biophysical Chemistry.” Academic Press, New York.
Evans, M. G., and Polanyi, M. (1936). Trans. Faraday Soc. 32, 1333-1360.
Fenton, J. W., II (1988). Ann. N. Y. Acad. Sci. 370, 468-495.
Fersht, A. R., and Serrano, L. (1993). Curr. Opin. Struct. Biol. 3, 75-83.
Foster, D., and Davie, E. W. (1984). Proc. Natl. Acad. Sci. U.S.A. 81, 4766-4770.
Gailani, D., and Broze, G. J. (1991). Science 253, 909-912.
Garcia-Moreno, E. B., Dwyer, J. J., Gittis, A. G., Lattman, E. E., Spencer, D. S., and Stites, W. E. (1997). Biophys. Chem. 64, 211-224.
Gibbs, C. S., Coutre, S. E., Tsiang, M., Li, W.-X., Jain, A. K., Dunn, K. E., Law, V. S., Mao, C. T., Matsumura, S. Y., Mejza, S. J., Paborsky, L. R., and Leung, L. L. K. (1995). Nature (London) 378, 413-416.
Gibbs, J. W. (1875). Trans. Conn. Acad. 3, 108-248.
Gilson, M. K. (1993). Proteins: Struct., Funct., Genet. 15, 266-282.
Grand, R. J. A., Turnell, A. S., and Grabham, P. W. (1996). Biochem. J. 313, 353-368.
Green, S. M., and Shortle, D. (1993). Biochemistry 32, 10131-10139.
Green, S. M., Meeker, A. K., and Shortle, D. (1992). Biochemistry 31, 5717-5728.
Grinnell, B. W. (1997). Nat. Biotechnol. 15, 124-125.
Guinto, E. R., and Di Cera, E. (1997). Biophys. Chem. 64, 103-109.
Guinto, E. R., Vindigni, A., Ayala, Y., Dang, Q. D., and Di Cera, E. (1995). Proc. Natl. Acad. Sci. U.S.A. 92, 11185-11189.
Hagen, F. S., Gray, C. L., O’Hara, P., Grant, F. J., Saari, G. C., Woodbury, R. G., Hart, C. E., Insley, M., Kisiel, W., Kurachi, K., and Davie, E. W. (1986). Proc. Natl. Acad. Sci. U.S.A. 83, 2412-2416.
Hedstrom, L., Szilagyi, L., and Rutter, W. J. (1992). Science 255, 1249-1253.
Hill, T. L. (1944). J. Chem. Phys. 12, 56-61.
Hill, T. L. (1984). “Cooperativity Theory in Biochemistry.” Springer-Verlag, Berlin.
Hofsteenge, J., Braun, P. J., and Stone, S. R. (1988). Biochemistry 27, 2144-2151.
Holler, E., Rainey, P., Orme, A., Bennett, E. L., and Calvin, M. (1973). Biochemistry 12, 1150-1164.
Horovitz, A., and Fersht, A. R. (1990). J. Mol. Biol. 214, 613-617.
Horovitz, A., and Fersht, A. R. (1992). J. Mol. Biol. 224, 733-740.
Howell, E. E., Booth, C., Farnum, M., Kraut, J., and Warren, M. S. (1990). Biochemistry 29, 8561-8568.
Ishihara, H., Connolly, A. J., Zeng, D., Kahn, M. L., Zheng, Y. W., Timmons, C., Tram, T., and Coughlin, S. R. (1997). Nature (London) 386, 502-506.
Jackson, S. E., and Fersht, A. R. (1993). Biochemistry 32, 13909-13918.
Jencks, W. P. (1981). Proc. Natl. Acad. Sci. U.S.A. 78, 4046-4050.
Judice, J. K., Gamble, T. R., Murphy, E. C., de Vos, A. M., and Schultz, P. G. (1993). Science 261, 1578-1581.
Keating, S., and Di Cera, E. (1993). Biophys. J. 65, 253-269.
Koshland, D. E., Nemethy, G., and Filmer, D. (1966). Biochemistry 5, 365-385.
Krem, M. M., and Di Cera, E. (1998). Proteins: Struct., Funct., Genet. 30, 34-42.
Lau, F. T.-K., and Fersht, A. R. (1987). Nature (London) 326, 811-812.
Lesk, A. M., and Fordham, W. D. (1996). J. Mol. Biol. 258, 501-537.
LiCata, V. J., and Ackers, G. K. (1995). Biochemistry 34, 3133-3159.
LiCata, V. J., Speros, P. C., Rovida, E., and Ackers, G. K. (1990). Biochemistry 29, 9771-9783.
Lin, T.-Y., and Kim, P. S. (1991). Proc. Natl. Acad. Sci. U.S.A. 88, 10573-10577.
Livnah, O., Stura, E. A., Johnson, D. L., Middleton, S. A., Mulcahy, L. S., Wrighton, N. C., Dower, W. J., Jolliffe, L. K., and Wilson, I. A. (1996). Science 273, 464-471.
Lottenberg, R., Hall, J. A., Blinder, M., Binder, E. P., and Jackson, C. M. (1983). Biochim. Biophys. Acta 742, 539-557.
Makhatadze, G. I., and Privalov, P. L. (1995). Adv. Protein Chem. 47, 308-425.
Mann, K. G., Jenny, R. J., and Krishnaswamy, S. (1988). Annu. Rev. Biochem. 57, 915-946.
Mann, K. G., Nesheim, M. E., Church, W. R., Haley, P., and Krishnaswamy, S. (1990). Blood 76, 1-16.
Martin, P. D., Robertson, W., Turk, D., Huber, R., Bode, W., and Edwards, B. F. P. (1992). J. Biol. Chem. 267, 7911-7920.
Mather, T., Oganessyan, V., Hof, P., Huber, R., Foundling, S., Esmon, C., and Bode, W. (1996). EMBO J. 15, 6822-6831.
Matthews, B. W. (1993). Annu. Rev. Biochem. 62, 139-160.
Meeker, A. L., Garcia-Moreno, B., and Shortle, D. (1996). Biochemistry 35, 6443-6449.
Mildvan, A. S., Weber, D. J., and Kuliopulos, A. (1992). Arch. Biochem. Biophys. 294, 327-340.
Milla, M. E., Brown, B. M., and Sauer, R. T. (1994). Nat. Struct. Biol. 1, 518-523.
Miyata, T., Aruga, R., Umeyama, H., Bezeaud, A., Guillin, M. C., and Iwanaga, S. (1992). Biochemistry 31, 7457-7462.
Monod, J., Changeux, J. P., and Jacob, F. (1963). J. Mol. Biol. 6, 306-329.
Monod, J., Wyman, J., and Changeux, J. P. (1965). J. Mol. Biol. 12, 88-118.
Neuberger, A. (1936). Biochem. J. 30, 2085-2094.
Neurath, H. (1984). Science 224, 350-357.
Ni, F., Ripoll, D. R., Martin, P. D., and Edwards, B. F. P. (1992). Biochemistry 31, 11551-11557.
Padmanabhan, K., Padmanabhan, K. P., Tulinsky, A., Park, C. H., Bode, W., Huber, R., Blankenship, D. T., Cardin, A. D., and Kisiel, W. (1993). J. Mol. Biol. 232, 947-966.
Patthy, L. (1990). Blood Coagulation Fibrinol. 1, 153-166.
Perona, J. J., and Craik, C. S. (1995). Protein Sci. 4, 337-360.
Perona, J. J., Tsu, C. A., McGrath, M. E., Craik, C. S., and Fletterick, R. J. (1993). J. Mol. Biol. 230, 934-949.
Perry, K. M., Onuffer, J. J., Gittelman, M. S., Barmat, L., and Matthews, C. R. (1989). Biochemistry 28, 7961-7970.
Privalov, P. L. (1979). Adv. Protein Chem. 33, 167-241.
Rawlings, R. D., and Barrett, A. J. (1993). Biochem. J. 290, 205-218.
Rawlings, R. D., and Barrett, A. J. (1994). In “Methods in Enzymology” (A. J. Barrett, ed.), Vol. 244, pp. 19-61. Academic Press, San Diego, CA.
Rezaie, A. R. (1996). Biochemistry 35, 1918-1924.
Rezaie, A. R., and Olson, S. T. (1997). Biochemistry 36, 1026-1033.
Richieri, G. V., Low, P. J., Ogata, R. T., and Kleinfeld, A. M. (1997). J. Biol. Chem. 272, 16737-16740.
Robinson, C. R., and Sligar, S. G. (1993). Protein Sci. 2, 826-832.
Rydel, T. J., Tulinsky, A., Bode, W., and Huber, R. (1991). J. Mol. Biol. 221, 583-601.
Sandberg, W. S., and Terwilliger, T. C. (1989). Science 245, 54-57.
Schechter, I., and Berger, A. (1967). Biochem. Biophys. Res. Commun. 27, 157-162.
Schellman, J. A. (1990). Biophys. Chem. 37, 121-140.
Scott, J. K., and Smith, G. P. (1990). Science 249, 386-390.
Scrutton, N. S., Berry, A., and Perham, R. N. (1990). Nature (London) 343, 38-43.
Shirley, B. A., Stanssen, P., Steyaert, J., and Pace, C. N. (1989). J. Biol. Chem. 264, 11621-11625.
Shortle, D. (1996). FASEB J. 10, 27-34.
Shortle, D., and Meeker, A. L. (1986). Proteins: Struct., Funct., Genet. 1, 81-89.
Shortle, D., Stites, W. E., and Meeker, A. L. (1990). Biochemistry 29, 8033-8041.
Smith, G. P. (1985). Science 228, 1315-1317.
Smith, G. P., and Petrenko, V. A. (1997). Chem. Rev. 97, 391-410.
Smith, M. (1985). Annu. Rev. Genet. 19, 423-462.
Sonar, S., Lee, C.-P., Coleman, M., Patel, N., Liu, X., Marti, T., Khorana, H. G., Rajbhandary, U. L., and Rothschild, K. J. (1994). Nat. Struct. Biol. 1, 512-517.
Spudich, J. L. (1994). Nat. Struct. Biol. 1, 495-496.
Stirling, Y. (1995). Blood Coagulation Fibrinol. 6, 361-373.
Stubbe, J. A., and Abeles, R. H. (1980). Biochemistry 19, 5505-5512.
Stubbs, M., Oschkinat, H., May, I., Huber, R., Angliker, H., Stone, S. R., and Bode, W. (1992). Eur. J. Biochem. 206, 187-195.
Svendsen, L., Blombäck, B., Blombäck, M., and Olsson, P. (1972). Thromb. Res. 1, 267-278.
Takagi, T., and Doolittle, R. F. (1974). Biochemistry 13, 750-756.
Tsiang, M., Jain, A. K., Dunn, K. E., Rojas, M. E., Leung, L. L. K., and Gibbs, C. S. (1995). J. Biol. Chem. 270, 16854-16863.
Tsiang, M., Paborsky, L. R., Li, W.-X., Jain, A. K., Mao, C. T., Dunn, K. E., Lee, D. W., Matsumura, S. Y., Matteucci, M. D., Coutre, S. E., Leung, L. L. K., and Gibbs, C. S. (1996). Biochemistry 35, 16449-16457.
Vindigni, A., and Di Cera, E. (1996). Biochemistry 35, 4417-4426.
Vindigni, A., Dang, Q. D., and Di Cera, E. (1997a). Nat. Biotechnol. 15, 891-895.
Vindigni, A., White, C. E., Komives, E. A., and Di Cera, E. (1997b). Biochemistry 36, 6674-6681.
Vu, T. K. H., Hung, D. T., Wheaton, V. I., and Coughlin, S. R. (1991). Cell (Cambridge, Mass.) 64, 1057-1068.
Wallace, A., Dennis, S., Hofsteenge, J., and Stone, S. R. (1989). Biochemistry 28, 10079-10084.
Wang, L., and Di Cera, E. (1996). Proc. Natl. Acad. Sci. U.S.A. 93, 12953-12958.
Warshel, A., Naray-Szabo, G., Sussman, F., and Hwang, J. K. (1989). Biochemistry 28, 3629-3637.
Weber, G. (1975). Adv. Protein Chem. 29, 1-83.
Wegscheider, R. (1895). Monatsh. Chem. 16, 153-158.
Wells, C. M., and Di Cera, E. (1992). Biochemistry 31, 11721-11730.
Wells, J. A. (1990). Biochemistry 29, 8509-8517.
Wells, J. A. (1996). Science 273, 449-450.
Wrighton, N. C., Farrell, F. X., Chang, R., Kashyap, A. K., Barbone, F. P., Mulcahy, L. S., Johnson, D. L., Barrett, R. W., Jolliffe, L. K., and Dower, W. J. (1996). Science 273, 458-463.
Wyman, J. (1948). Adv. Protein Chem. 4, 407-531.
Wyman, J. (1964). Adv. Protein Chem. 19, 223-286.
Wyman, J., and Gill, S. J. (1990). “Binding and Linkage.” University Science Books, Mill Valley, CA.
Yang, A.-S., Gunner, M. R., Sampogna, R., Sharp, K., and Honig, B. (1993). Proteins: Struct., Funct., Genet. 15, 252-265.
Young, D. C., Zhan, H., Cheng, Q.-L., Hou, J., and Matthews, D. J. (1997). Protein Sci. 6, 1228-1236.
Yu, M.-H., Weissman, J. S., and Kim, P. S. (1995). J. Mol. Biol. 249, 388-397.
Zhang, E., and Tulinsky, A. (1997). Biophys. Chem. 63, 185-200.
ALLOSTERIC TRANSITIONS OF THE ACETYLCHOLINE RECEPTOR
By STUART J. EDELSTEIN*,† and JEAN-PIERRE CHANGEUX*

*Neurobiologie Moléculaire, Institut Pasteur, 75734 Paris Cedex 15, France, and †Département de Biochimie, Université de Genève, CH-1211 Genève 4, Switzerland
I. Introduction
   A. The Acetylcholine Receptor: Similarities and Differences with Respect to Other Allosteric Proteins
   B. Consequences of Pseudosymmetric Oligomeric Structure
   C. Role of Mutational Studies
II. Mechanistic Models
   A. The Allosteric-Type Model
   B. Linear Free Energy Relations
   C. Alternative Models
III. Recovery from Desensitization
IV. Kinetic Basis of Dose-Response Curves
   A. Dependence on the Desensitization Rate
   B. Desensitization by Low-Concentration Prepulses
V. Multiple Phenotypes
   A. A Generalized Allosteric Network
   B. The K Phenotype
   C. The L Phenotype
   D. The γ Phenotype
   E. Limiting Properties at Extremes of L
VI. Deductions from Single-Channel Measurements
   A. Kinetic Consequences of Mutant Phenotypes
   B. Single Ionic Events vs Single Ligand-Binding Events in Relation to Binding-Site Nonequivalence
VII. Allosteric Effectors and Coincidence Detection
VIII. General Considerations
   A. Evaluation of Mechanistic Models
   B. Implications for Synaptic Plasticity
   C. Diseases and Nicotine Dependency
References
I. INTRODUCTION

A. The Acetylcholine Receptor: Similarities and Differences with Respect to Other Allosteric Proteins
The nicotinic acetylcholine receptor (nAChR) and other members of the superfamily of ligand-gated channels are responsible for rapid chemo-electrical transduction in the nervous system. The chemical relay between electric impulses at the synapses of neuromuscular junctions or between neurons occurs via the quantal release into the synaptic cleft of neurotransmitter molecules in “pulses” of millimolar concentration and millisecond duration (Kuffler and Yoshikami, 1975; Clements et al., 1992). The postsynaptic membrane contains a high density of these ionotropic receptors, present mainly in closed states prior to neurotransmitter release, but capable of interconverting rapidly on binding of neurotransmitter to an open state with a permeable ion channel. However, the open state is transient and closure occurs either by returning to the initial state (following a brief pulse of neurotransmitter, as commonly occurs under physiological conditions) or by converting to desensitized states (when neurotransmitters or other modulators are present for longer times). Presynaptic effects of nAChR may also contribute to synaptic function by potentiating the response of other ligand-gated channels (Brussard et al., 1995; Gray et al., 1996; Role and Berg, 1996; Wonnacott, 1997; Lena and Changeux, 1997b).

ADVANCES IN PROTEIN CHEMISTRY, Vol. 51. Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved. 0065-3233/98 $25.00

Mechanistic models capable of representing these various properties include principles of the Monod-Wyman-Changeux (MWC) theory of concerted transitions between conformational states (Monod et al., 1965). This theory was initially developed to account for the kinetic properties of bacterial and mammalian regulatory enzymes on the basis of symmetry features of their quaternary structures. It had its origins in Wyman’s pioneering developments of linkage relationships for hemoglobin (Wyman, 1948, 1964) and has been applied to various aspects of ligand gating (Changeux et al., 1967, 1984; Karlin, 1967; Edelstein, 1972; Colquhoun, 1973; Heidmann and Changeux, 1979; Changeux, 1990; Jackson, 1989; Galzi et al., 1996b), including an extended form that generates values for all of the relevant kinetic constants through the application of linear free energy relations (Edelstein et al., 1996).
More restricted sequential-type models (Del Castillo and Katz, 1957; Colquhoun and Sakmann, 1985; Lingle et al., 1992; Edmonds et al., 1995; Wang et al., 1997) along the lines of the induced-fit mechanism (Koshland et al., 1966) have also been widely employed. The sequential models assume that the ion channel opens only on binding of two ligand molecules (or possibly with one molecule in the case of brief openings) and thus do not account for the occurrence of spontaneous openings that have been observed in a number of cases (Jackson, 1984; Jackson et al., 1990; Auerbach et al., 1996). One of the goals of this chapter is to formulate experimental approaches that can distinguish between the two types of models.

The nAChR and other ionotropic receptors constitute a special class of allosteric proteins (Galzi and Changeux, 1995). They possess a number of features in common with allosteric enzymes and hemoglobin (Monod et al., 1965; Edelstein, 1975; Perutz, 1989), including (1) an oligomeric structure; (2) topologically distinct sites responsible for homotropic and heterotropic interactions that can be related to the interaction of pharmacological agonists, antagonists, and effectors; and (3) concerted transitions between discrete conformational states as revealed by all-or-nothing opening of the ion channel. In addition, the nAChR possesses distinct features (Galzi and Changeux, 1995; Galzi et al., 1996b), including (1) pseudosymmetry among the subunits related by a fivefold rotational axis perpendicular to the plane of the membrane; (2) homotropic interactions between partially equivalent sites; (3) a set of conformational states (activatable, active, and desensitized) with interaction times that operate in ranges varying from milliseconds to minutes; and (4) pleiotropic phenotypes in which point mutations result in concomitant modifications of apparent agonist affinity, channel conductance, and agonist vs. antagonist specificity.

The nAChR also possesses properties permitting observation of single ion channels (Sakmann et al., 1980), a powerful experimental approach for the determination of kinetic properties and conductance levels of the channel. In this respect, conformational changes dependent on the ligand concentration are more readily measured than direct binding interactions. In contrast, for many other allosteric proteins ligand binding is more readily monitored than independent indices of conformational change (Edelstein and Changeux, 1996). Hence, for certain approaches, particularly analysis of stochastic processes, the nAChR appears in advance of other allosteric proteins and may lead to novel experimental approaches that could be applied to other allosteric systems. In addition, the monitoring of single ligand-binding events (see Section VI,B) may become possible with anticipated technical advances in fluorescence correlation spectroscopy (Eigen and Rigler, 1994; Schwille et al., 1997).
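The distinction drawn in this section between allosteric and sequential schemes can be made concrete with a minimal numerical sketch. It assumes a reduced two-state (closed/open) allosteric scheme with two equivalent sites, rather than the full set of receptor states, and a sequential scheme in which only the doubly liganded receptor can open. The function names and all parameter values are hypothetical and chosen only for illustration.

```python
def p_open_allosteric(x, L0, K_closed, K_open, n_sites=2):
    """Open probability for a two-state MWC-type scheme.

    x        : agonist concentration
    L0       : [closed]/[open] in the absence of agonist
    K_closed : agonist dissociation constant of the closed state
    K_open   : agonist dissociation constant of the open state
    """
    open_weight = (1.0 + x / K_open) ** n_sites
    closed_weight = L0 * (1.0 + x / K_closed) ** n_sites
    return open_weight / (open_weight + closed_weight)


def p_open_sequential(x, K, E):
    """Open probability when only the doubly liganded receptor opens.

    K : agonist dissociation constant of each of two equivalent sites
    E : [open]/[closed] for the doubly liganded receptor
    """
    s = x / K
    closed_weight = 1.0 + 2.0 * s + s * s   # 0, 1, or 2 ligands bound
    open_weight = E * s * s                 # opening only with 2 ligands
    return open_weight / (closed_weight + open_weight)


# Hypothetical parameter values (x and K share the same concentration units).
L0, K_closed, K_open = 1.0e4, 100.0, 1.0
K, E = 100.0, 20.0

for x in (0.0, 10.0, 1000.0):
    print(x, p_open_allosteric(x, L0, K_closed, K_open), p_open_sequential(x, K, E))
```

With no agonist the allosteric expression reduces to 1/(1 + L0), which is small but nonzero and corresponds to spontaneous openings, whereas the sequential expression is exactly zero.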
B. Consequences of Pseudosymmetric Oligomeric Structure

Historically, concepts concerning the nAChR were developed initially from studies on receptors from fish electric organs and vertebrate muscle (Changeux, 1990). Biochemical analysis, cloning, and sequencing of these receptors’ subunits established their heteropentameric [2α:1β:1γ/ε:1δ] structure and led to the identification of related neuronal forms (see Fig. 1), as well as to more distant invertebrate forms (Le Novère and Changeux, 1995). The neuronal subunits α2-α5 require interactions with β subunits in order to form functional receptors, with a putative [2α_i:3β_j] stoichiometry (Cooper et al., 1991; Lindstrom, 1996) or more complicated combinations in certain cases (Conroy et al., 1992; Vernallis et al., 1993; Ramirez-Latorre et al., 1996; Le Novère et al., 1996), whereas α7-α9 subunits may form functional homopentamers (Couturier et al., 1990; Sargent, 1993; Elgoyhen et al., 1994; Palma et al., 1996).

FIG. 1. Evolutionary relationships between the subunits of the nAChR based on the analysis of Le Novère and Changeux (1995), updated for sequence information published subsequently (as kindly supplied by N. Le Novère). [Figure not reproduced; time axis in MYA with ticks at 200, 400, 600, and 800.]

Several experimental approaches have led to the identification of functional domains, particularly chemical labeling and site-directed mutagenesis (Changeux, 1990). In this respect, studies on α7 using site-directed mutagenesis have been particularly fruitful and have contributed to the current structural model (Devillers-Thiéry et al., 1993; Unwin, 1993a; Galzi and Changeux, 1994; Bertrand and Changeux, 1995; Karlin and Akabas, 1995), with the agonist binding site in the N-terminal domain (Karlin and Akabas, 1995; Galzi and Changeux, 1994) and the ion channel constituted by residues of the M2 transmembrane domain (Giraudat et al., 1986; Hucho et al., 1986; Imoto et al., 1988; Devillers-Thiéry et al., 1993), as presented in Fig. 2a. However, other transmembrane regions may also contribute directly or indirectly to channel properties (Lo et al., 1991; Li et al., 1994; Akabas and Karlin, 1995).

FIG. 2. Structural models of the nAChR. (a) Schematic representation of functional domains. (b) Longitudinal outline of a receptor molecule with respect to the cytoplasmic membrane (Unwin, 1996). Putative α-helices of the M2 domain lining the ion channel are indicated by the two angular bars. (c) Schematic cross-sectional view of the receptor showing binding sites at α-γ and α-δ interfaces with subunits in the arrangement deduced for the receptor from Torpedo (Machold et al., 1995). In adult mammalian muscle receptors, the embryonic subunit γ is replaced by ε. (d) Sequence of the M2 channel domain for chick α7, with mutations indicated (discussed in the text) and the corresponding residues for other receptor subunits, including mutations that produce a genetic disease. [Figure not reproduced; panel (d) aligns the M2 sequences of chick α7, human GlyR α1, and human AChR α1, β1, ε, and α4.]

Features of the three-dimensional structure of the Torpedo nAChR have been obtained by electron microscopic studies of ordered arrays at 9 Å resolution (Unwin, 1993b, 1996), as represented in outline in Fig. 2b. Numerous studies indicate that the binding sites for nicotinic ligands are located at the α-γ and α-δ interfaces, or at equivalent positions for nonmuscle receptors (Oswald and Changeux, 1982; Pedersen and Cohen, 1990; Chatrenet et al., 1990; Galzi et al., 1991b; Czajkowski et al., 1993; Fu and Sine, 1994; Corringer et al., 1995), with the subunits arranged as shown in Fig. 2c (Machold et al., 1995). However, alternative interpretations have been presented that place the nicotinic binding sites closer to the center of each α subunit and the β subunit between the two α subunits (Unwin, 1996). For nicotinic agonists, it has been suggested that higher affinity binding takes place at the α-δ interface and that lower affinity binding occurs at the α-γ interface (Blount and Merlie, 1989; Sine and Claudio, 1991; Prince and Sine, 1996). In contrast, higher affinity has been assigned to the α-γ site for the competitive antagonist d-tubocurarine (Pedersen and Cohen, 1990). Hence, the site that binds more strongly may vary for different agonists or antagonists and, for a particular ligand, the degree of nonequivalence may vary from one conformational state to another.
C. Role of Mutational Studies

In this chapter, the generalized MWC-type allosteric model (Edelstein et al., 1996) will be described and contrasted with the sequential-type model (Del Castillo and Katz, 1957; Colquhoun and Sakmann, 1985; Lingle et al., 1992; Edmonds et al., 1995; Wang et al., 1997) that has also been used to analyze the kinetic properties of the nAChR. Attention will also be directed to the degree to which differences in the affinities of the two binding sites for ACh are responsible for characteristic properties of muscle receptors. Applications of the models to experimental data for both wild-type receptors and for several mutant forms will be evaluated for dose-response experiments and for kinetic experiments (including single-channel recordings). The analysis of both site-directed and spontaneous mutations has been critical to the current understanding of the functional mechanism. For example, the channel mutant L247T (see Fig. 2d), first studied by site-directed mutagenesis in α7 (Revah et al., 1991; Bertrand et al., 1992), was subsequently incorporated into muscle nAChR (Filatov and White, 1995; Labarca et al., 1995) and recently identified in a congenital myasthenic syndrome (Gomez et al., 1996). Neighboring sites have also been implicated in receptor function, by site-directed mutagenesis in α7 (Devillers-Thiéry et al., 1992; Galzi et al., 1992), or as naturally occurring myasthenic mutants (Lena and Changeux, 1997a). The phenotypes proposed for the α7 channel mutants can also be related to the natural mutations identified in patients with human startle disease, as discussed in Section V,C. The myasthenic and startle mutations in the M2 region are presented in Fig. 2d. Among the myasthenic mutants characterized to date, the data published for the myasthenic mutant εT264P (Ohno et al., 1995) are particularly valuable for the discrimination between functional models and are described more fully in Section VI,A. Overall, various experimental approaches involving both wild-type and mutant nAChR will be considered in order to delineate the differences in the predictions of the allosteric- and sequential-type models.

II. MECHANISTIC MODELS

A. The Allosteric-Type Model
In order to account for in vitro fast kinetic observations on ligand binding and ion channel opening with Torpedo receptors, four conformational states, B, A, I, and D, were postulated (Heidmann and Changeux, 1980; Neubig and Cohen, 1980; Changeux, 1990), where B is the basal (resting) activatable state, A is the active (open) state, I is the initial desensitized state, and D is the final desensitized state. The four states are in equilibrium, with an interconversion pattern that corresponds to the vertices of a tetrahedron (Fig. 3a), and three degrees of ligation (i = 0, 1, or 2) are possible for each state. The affinity of each state for agonist increases in the order B-A-I-D. The B state was originally designated as “R,” but “B” (for “basal”) was proposed (Edelstein et al., 1996) to replace “R” in order to avoid confusion with the high-affinity state in the original MWC formulation (Monod et al., 1965).

FIG. 3. The interconversion of conformational states for the nAChR. (a) The full set of reactions. (b) The linear progression of states imposed by the relative magnitudes of the rate constants (Edelstein et al., 1996).

According to the scheme in Fig. 3a, all interaction pathways are possible in principle. Hence with 6 pairs of interactions and 3 degrees of ligation, a total of 36 conformational interconversion rates (one for each direction of each pair at each degree of ligation) would be required to characterize the system fully (in addition to the ligand on and off rates for each state). The 36 interconversion rates are clearly too numerous to be evaluated, but their number can be limited by structural or kinetic constraints of the receptor that render obligatory a certain order of passage between states. Indeed, the different time regimes observed for the various transitions (~1 ms for B → A; ~100 ms for A → I; ~10 s for I → D) lead to selection of a predominant kinetic pathway, as indicated in Fig. 3b, that corresponds to the passage between states over the lowest transition state barriers. Since the secondary pathways (indicated by dashed arrows) can be assumed to contribute less than 1% (Edelstein et al., 1996), the tetrahedral arrangement reduces to the linear cascade: $B_i \rightleftharpoons A_i \rightleftharpoons I_i \rightleftharpoons D_i$.

FIG. 4. The complete network of rate constants for conformational interconversions and two agonist binding sites, based on the linear progression of conformational states. Each column corresponds to a series of ligand binding events at two identical sites per receptor, with the rates specified along the vertical arrows by the state-specific intrinsic “on” and “off” constants, with statistical factors included. Each row corresponds to a series of transitions between states governed by rate constants that vary with the number of ligand molecules bound (i = 0, 1, or 2), with the initial and final states in the superscript and the number of ligands in the subscript (Edelstein et al., 1996).

The linear progression permits the full description of all ligand-binding and conformational transition rates in the two-dimensional kinetic network depicted in Fig. 4, with ligand (agonist) reactions represented vertically and conformational interconversion reactions represented horizontally. The ratios of the various ligand binding and interconversion rate constants define the equilibrium parameters with respect to the key affinity ratio, c, as summarized in Fig. 5 for the B-A pair of states. The linear progression reduces the number of independent interconversion rate constants that must be evaluated to describe the system from 36 to 18. Nevertheless, 18 interconversion rate constants remain too large a number of independent parameters to be determined, but a substantial additional reduction can be achieved through the application of linear free energy relations.
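The counting argument above (36 interconversion rates for the tetrahedral scheme, 18 for the linear cascade) can be checked with a short enumeration. This is only an illustrative sketch; the variable and function names are hypothetical and not taken from the source.

```python
from itertools import combinations

STATES = ["B", "A", "I", "D"]   # listed in the order of the linear cascade
LIGATION = [0, 1, 2]            # number of agonist molecules bound


def count_rates(pairs):
    """Two directions per pair of states, one rate per degree of ligation."""
    return 2 * len(pairs) * len(LIGATION)


all_pairs = list(combinations(STATES, 2))       # every edge of the tetrahedron
cascade_pairs = list(zip(STATES, STATES[1:]))   # B-A, A-I, I-D only

n_full = count_rates(all_pairs)
n_linear = count_rates(cascade_pairs)

print(len(all_pairs), n_full)        # 6 36
print(len(cascade_pairs), n_linear)  # 3 18
```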
B. Linear Free Energy Relations Linear free energy relationships have been widely used to relate the kinetic features of related reactions to properties of their respective
FIG.5. The reaction cycle for the B and A pair of states as a function of binding one molecule of agonist. Linkage relations require that each step of ligand binding produces a decrease in the allosteric equilibrium constants "L,, as follows from "C = "LJ"&-,. Since for each value of i the "L, value is set by the ratio of the appropriate rate constants, BAI_3 = ABk,/"k,, the decrease in "L, with each zth ligand binding step must correspond to changes in the interconversion rates such that % = ("k,/"k,) (%-,/''%-,). According to this equation, the stabilization of the higher affinity member of each pair of states resulting from the binding of one molecule of agonist must be reflected by a decreasing interconversion rate constant toward the lower affinitystate and/or an increasing interconversion rate constant toward the higher affinity state. Thus, for the progression from BAZ+I to BAL,,since B"L,_I> BALp, the decrease upon agonist binding must correspond to ABk,/"k,-l < 1 and/or "k,-l/BAk, < 1. Hence, ligand binding drives the B, a A, equilibrium toward A by systematically increasing the B -+ A rates and/or decreasing the A + B rates. Equivalent relations apply to the A-I and I-D pairs of states (Edelstein et al., 1996).
130
STUART J. EDELSTEIN AND JEAN-PIERRE CHANGEUX
\ \
I I
\
I
\ \ \
B j
\
2,
\ \
... I
+
... D
transition states (Leffler, 1953; Szabo, 1978; Jencks, 1985; Fersht et al., 1986). They have been particularly successful in describing the variations as a function of ligand binding for the interconversion rate constants of the two principal conformational states of hemoglobin over a wide range of rates (Sawicki and Gibson, 1976; Eaton et al., 1991; Henry et al., 1997). Similar principles have been assumed to apply to the conformational interconversions of the nAChR and have been used to characterize the transition state for the interconversion between each pair of conformations in terms of its position on a hypothetical linear reaction coordinate (Edelstein et al., 1996). The position determines the effect of ligand binding on the interconversion rates, thereby limiting the degrees of freedom in the assignment of values to the rate constants. Specifically, the application of linear free energy relations is based on the difference in affinity for agonists between the partners of each pair of states, as indicated by the affinity ratio ${}^{BA}c = K_A/K_B$ defined in Fig. 5 (or the equivalent ratios ${}^{AI}c$ and ${}^{ID}c$ for the A-I and I-D pairs of states, respectively). The dependence of conformational interconversion rates on ligand binding is assumed to follow from the stabilization of the transition state for each interconversion by the ligand. The extent of this stabilization is assumed to be intermediate with respect to the effects of ligand binding on each of the two participating allosteric states and weighted toward the properties of the allosteric state that the transition state more closely resembles. This assumption may be expressed quantitatively in terms of the position of the transition state on a hypothetical reaction coordinate. In this case, for each pair of states a positional parameter is defined (${}^{BA}p$, ${}^{AI}p$, or ${}^{ID}p$), as presented in Fig. 6 for ${}^{BA}p$, such that this parameter characterizes the transition state on a linear scale between 0 and 1 with respect to its proximity to the lower affinity state of the pair. On the basis of these relations the full series of rate constants for the $B_i \rightleftharpoons A_i$ interconversions can be generated from the ligand binding parameters (${}^{BA}c$) with three additional values: ${}^{BA}p$, one B → A rate constant, and one A → B rate constant (see legend to Fig. 6). In this way the six rate constants for each pair of states in the linear scheme are reduced to two rate constants and one positional parameter. With homopentameric receptors, the same number of parameters provides estimates for all 12 interconversion rates for each pair of states (Edelstein and Changeux, 1996). For detailed kinetic studies on muscle nAChR, because of the distinct time ranges over which the three pairs of states interconvert, sufficient data are available to permit reasonable estimates for all values (Edelstein et al., 1996), as listed in Table I. The value of ${}^{BA}p = 0.2$ indicates that with each ligand-binding step the increase in the B → A rate is larger than the decrease in the A → B rate.

FIG. 6. Linear free energy relations and the scaling of transition state barriers. For each conformational state (A or B) or the transition states (TS), the energy of stabilization resulting from ligand binding is depicted by a “ladder” of equally spaced steps. The vertical positions of the ladders are set by $\Delta G = 0$ for $B_0$, and by $\Delta G = -RT \ln([A_0]/[B_0])$ for $A_0$. The transition states are placed according to the free energy of activation $\Delta G^{\ddagger}$ derived from transition state theory as expressed by the equation $k = \kappa (k_B T/h)\, e^{-\Delta G^{\ddagger}/RT}$, where $\kappa$ is the transmission constant (which can be assumed to equal 1.0 if there are no barrier recrossings), $k_B$ is Boltzmann's constant, $h$ is Planck's constant, $R$ is the gas constant, and $T$ is the absolute temperature (Steinfeld et al., 1989). For the interconversions between states, the rate constants for the doubly liganded forms are specified on the basis of the experimental data, as summarized in Table I. The changes in the rates for the unliganded and singly liganded forms are determined by the transition state positional parameters. For the series of interconversion reactions, the differences in the activation energies for the successive transition states reflect the differences in the energy of stabilization of the B and A states with each successive ligand binding, weighted by ${}^{BA}p$. Hence, the successive interconversion rate constants scale with the corresponding affinity ratio, with the positional parameter in the exponent: ${}^{AB}k_i/{}^{AB}k_{i-1} = ({}^{BA}c)^{{}^{BA}p}$ and ${}^{BA}k_{i-1}/{}^{BA}k_i = ({}^{BA}c)^{1-{}^{BA}p}$. Since the product of these two equations is equivalent to ${}^{BA}c = ({}^{AB}k_i/{}^{BA}k_i)({}^{BA}k_{i-1}/{}^{AB}k_{i-1})$ (see the legend to Fig. 5), these relations permit the positional parameter to define the dependence of the interconversion rate constants on ligand binding (Edelstein et al., 1996). For example, with ${}^{BA}p = 0.2$ (Table I), it can be seen that the vertical spacing of the transition states for the $B_i \rightleftharpoons A_i$ interconversions at different degrees of ligand binding more closely resembles the spacing between the $A_i$ forms than the spacing between the $B_i$ forms. Equivalent relations apply to the A-I and I-D states.
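The scaling of the interconversion rates with ligand binding can be illustrated numerically. The sketch below is not taken from the original work: it assumes that the rate ratios per ligand-binding step are the affinity ratio raised to the powers p (A → B direction) and 1 - p (B → A direction), and it uses the B-A values of Table I as inputs. The function and variable names are hypothetical.

```python
# Sketch: derive unliganded and singly liganded B <-> A rates from the
# doubly liganded rates, the affinity ratio c, and the positional parameter p.
# Assumed relations (per ligand-binding step i-1 -> i):
#   k_back[i] / k_back[i-1] = c**p        (A -> B slows down)
#   k_fwd[i-1] / k_fwd[i]   = c**(1 - p)  (B -> A speeds up)

# Table I values for the B <-> A pair (muscle nAChR, two agonist sites).
C_BA = 1.08e-3   # affinity ratio K_A / K_B
P_BA = 0.2       # transition state positional parameter
K_BA_2 = 3.0e4   # B2 -> A2 rate (s^-1)
K_AB_2 = 7.0e2   # A2 -> B2 rate (s^-1)


def scale_rates(k_fwd_top, k_back_top, c, p, n_sites=2):
    """Return (k_fwd, k_back) lists indexed by the number of ligands bound."""
    k_fwd = [0.0] * (n_sites + 1)
    k_back = [0.0] * (n_sites + 1)
    k_fwd[n_sites] = k_fwd_top
    k_back[n_sites] = k_back_top
    # Step down from the fully liganded form to the unliganded form.
    for i in range(n_sites, 0, -1):
        k_fwd[i - 1] = k_fwd[i] * c ** (1.0 - p)
        k_back[i - 1] = k_back[i] / c ** p
    return k_fwd, k_back


def allosteric_constants(k_fwd, k_back):
    """L_i = [B_i]/[A_i] = (A -> B rate) / (B -> A rate)."""
    return [back / fwd for fwd, back in zip(k_fwd, k_back)]


k_ba, k_ab = scale_rates(K_BA_2, K_AB_2, C_BA, P_BA)
l_ba = allosteric_constants(k_ba, k_ab)

for i in range(3):
    print(f"i={i}: B->A {k_ba[i]:.3g} s^-1, A->B {k_ab[i]:.3g} s^-1, L_{i} {l_ba[i]:.3g}")
```

Under these assumptions the computed values fall close to the deduced B-A entries of Table I, which is a useful consistency check on the reading of the two scaling relations.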
C. Alternative Models

For many of the experimental measurements on nAChR reported in the literature (Lingle et al., 1992; Edmonds et al., 1995), the data for activation have been interpreted in terms of a “sequential” model, in which channel opening for muscle-type receptors occurs only upon binding of the second molecule of agonist, as represented in Fig. 7 by the step $B_2 \rightleftharpoons A_2$. In this case, channel opening and closing are characterized by only two rate constants, β and α, respectively. The reasons for having invoked this model include the difficulties of implementing a full MWC-type model prior to the introduction of the linear free energy relations (Edelstein et al., 1996) and the fact that under certain experimental conditions (such as relatively high agonist concentrations), the assumption that channel opening occurs only for biliganded molecules provides an adequate description of the system. In some cases, singly liganded openings have been incorporated into the sequential model to account for brief openings (Colquhoun and Sakmann, 1985; Wang et al., 1997). However, in other cases that include channel opening of unliganded receptors, the sequential-type model is not adequate; such cases are considered in Sections V and VI.

Additional aspects concern desensitization (as represented in Fig. 7 by the step $A_2 \rightleftharpoons D_2$) and recovery. Ever since the benchmark studies of Katz and Thesleff (1957) it has been generally noted that following desensitization, recovery occurs spontaneously upon removal of the agonist, but “silently,” i.e., with no channel opening during the recovery period. Since return to the resting state via $A_2$ would imply channel-opening events, it has been argued that a distinct “recovery” pathway must exist, as represented in Fig. 7 by the series $D_2 \to D_1 \to D_0 \to B_0$. The overall model presented in Fig. 7, with activation restricted to biliganded molecules and a distinct recovery sequence, represents the “cyclic” scheme that has been used to interpret experimental data (Franke et al., 1993). However, an explanation of how silent recovery can also be accommodated by an allosteric-type model is presented in the following section.

FIG. 7. The sequential-type model, with ligand binding to the B and D states and formation of the open state ($A_2$) from $B_2$ as defined by the equilibrium constant $[B_2]/[A_2] = \alpha/\beta$. In this scheme, only one desensitized state is included; it is designated “D,” but its properties would correspond to “I” in Fig. 4.

III. RECOVERY FROM DESENSITIZATION

When the relationships linking ligand binding, allosteric equilibria, and transition state barriers were evaluated for all three pairs of states for muscle receptors studied by single-channel measurements or rapid agonist application (Colquhoun and Sakmann, 1985; Franke et al., 1993), the values for the various parameters presented in Table I were deduced (Edelstein et al., 1996). The overall properties of the system may then be represented by the free energy profile presented in Fig. 8. The vertical
TABLE I
Parameter Values for the Four-State Allosteric Kinetic Mechanism^a

State parameters                      B state        A state        I state        D state
Independent parameters
  Ligand on rate (M^-1 s^-1)          1.5 × 10^8     1.5 × 10^8     1.5 × 10^8     1.5 × 10^8
  Ligand off rate (s^-1)              8000           8.64           4.0            4.0
Deduced parameters
  Equilibrium constant K (M)          5.3 × 10^-5    5.7 × 10^-8    2.7 × 10^-8    2.7 × 10^-8
  Affinity ratios                     BAc = 1.08 × 10^-3     AIc = 0.46     IDc = 1.0

Interconversion parameters            B ⇌ A          A ⇌ I          I ⇌ D
Independent parameters
  TS positional parameter p           0.2            0.99           0.99
  Interconversion rates (s^-1)
    k_2, forward                      3.0 × 10^4     20.0           5.0 × 10^[…]
    k_2, backward                     7.0 × 10^2     0.81           1.2 × 10^-3
Deduced parameters
  Interconversion rates (s^-1)
    k_0, forward                      0.54           19.7           5.0 × 10^[…]
    k_0, backward                     1.08 × 10^4    3.74           1.2 × 10^-3
    k_1, forward                      1.3 × 10^2     19.85          5.0 × 10^[…]
    k_1, backward                     2.74 × 10^3    1.74           1.2 × 10^-3
  Allosteric constants
    L_0                               2 × 10^4       0.19           2.5 × 10^[…]
    L_1                               21.6           8.7 × 10^-2    2.5 × 10^[…]
    L_2                               2.3 × 10^-2    4.0 × 10^-2    2.5 × 10^[…]

Forward rates refer to B → A, A → I, and I → D; backward rates refer to the reverse steps. Subscripts give the number of agonist molecules bound.

^a For the 14 independent rate constants (4 on and 4 off rates for the ligand, 3 forward and 3 back allosteric transition rates for the doubly liganded forms, and 3 transition state positional parameters) necessary to define the $B_i \rightleftharpoons A_i$, $A_i \rightleftharpoons I_i$, and $I_i \rightleftharpoons D_i$ allosteric transitions with two equivalent agonist binding sites per molecule, parameter values were deduced for the nAChR on the basis of results with rapid agonist application techniques using outside-out patches containing embryonic-like nAChR from mouse muscle (Franke et al., 1993) and earlier single-channel measurements (Colquhoun and Sakmann, 1985), as reported previously (Edelstein et al., 1996).
FIG. 8. Free energy diagram for the four-state allosteric model, including all liganded and unliganded allosteric states, as well as their respective transition states. The B, A, I, and D states are each represented by a free energy ladder, with details as described for the B and A states in Fig. 6 and values of the relevant parameters as described in Table I (Edelstein et al., 1996).
ladders for each state and the intervening transition states correspond to the change in free energy for each molecule of agonist bound. Hence the step sizes for the B, A, I, and D states increase with affinity according to the series of dissociation constants $K_B > K_A > K_I > K_D$. The vertical alignment of the ladder for each state is set by the values assigned to the relative concentrations of $B_0$, $A_0$, $I_0$, and $D_0$. The transition state heights are determined by the positional parameters (see Fig. 6). The progression of doubly liganded states $B_2 \to A_2 \to I_2 \to D_2$ represents the allosteric cascade for the conformational changes elicited by application of a strong and prolonged pulse of agonist, with the time of passage through $A_2$ corresponding to the average open time in single-channel measurements. The kinetic properties of the system can be represented in simulations, as shown in Fig. 9a and b, with the time axis presented on a logarithmic scale to permit visualization over several time regimes (Edelstein et al., 1996). Since the concentration of agonist is high, this
FIG. 9. Kinetic simulations presenting activation, progression through the states on agonist binding, and recovery following agonist removal. The states are labeled, with the number of ligand molecules bound indicated by the line format (0, dotted line; 1, dashed line; 2, solid line). The ligand concentration is 10^-[…] M. Values of the parameters utilized are presented in Table I. (a) The appearance of the open state (in the form of a current change, 1 - [A states]) on a scale of log time (in seconds), with the inset presenting the same data on a linear scale (the vertical bar = 0.1 fractional amplitude change; the horizontal bar = 0.5 s). (b) The fractional population represented by each of the four states during an agonist pulse, with time on a logarithmic scale. (c) Recovery begins upon removal of free agonist at the point marked by the arrow in (b). (Edelstein et al., 1996.)
simulation follows the path $B_0 \to B_1 \to B_2 \to A_2 \to I_2 \to D_2$. At early times in Fig. 9b the progression from unliganded to liganded forms is apparent for B, followed by interconversion of biliganded B to biliganded A and I. Transient channel opening corresponds to the appearance and disappearance of A. The population of biliganded D increases only at longer times via conversion from biliganded I due to the slow rate of the $I_2 \to D_2$ interconversion. Following termination of the pulse, agonist dissociation drives the system to the unliganded states and the initial distribution is reestablished relatively rapidly, particularly from $I_2$ along the pathway $I_2 \to I_1 \to I_0 \to A_0 \to B_0$ (Fig. 8). These features can be visualized in the kinetic simulation of recovery presented in Fig. 9c. With ${}^{AB}k_0 \gg {}^{IA}k_0$, recovery along the pathway $I_0 \to A_0 \to B_0$ occurs with so rapid a passage through the $A_0$ state that channel opening is negligible (Edelstein et al., 1996). Following agonist removal, the $I_2$ state loses agonist molecules and is transformed to $B_0$ in less than 1 s. The $D_2$ state loses agonist molecules to form $D_0$ within 10 s, but requires longer times ($10^3$ to $10^4$ s) to reequilibrate with $B_0$ and to return to the initial low levels. Such long recovery times could account for certain slow physiological responses, possibly related to nicotine pharmacology (see Section VIII,C). It is clear from this simulation that the four-state model predicts virtually negligible channel opening during the recovery period, but without the necessity of imposing a separate recovery pathway from I (or D) to B that arbitrarily disallows passage through A, as required in the “cyclic” model (Franke et al., 1993).
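The argument for silent recovery can be illustrated with a reduced calculation. The sketch below is a simplification, not the published simulation: it assumes that agonist has already dissociated, so that only the three unliganded states $I_0$, $A_0$, and $B_0$ are followed, and it integrates the scheme with simple explicit Euler steps. The rate values are the unliganded entries of Table I; the function name and step size are hypothetical choices.

```python
# Sketch: recovery through the unliganded pathway I0 -> A0 -> B0.
# Assumption: ligand dissociation is complete, so liganded states are ignored.
# Rates (s^-1) are the unliganded values listed in Table I.

K_IA0 = 3.74     # I0 -> A0
K_AI0 = 19.7     # A0 -> I0
K_AB0 = 1.08e4   # A0 -> B0
K_BA0 = 0.54     # B0 -> A0


def simulate_recovery(t_end=2.0, dt=2e-5):
    """Integrate the three-state scheme with explicit Euler steps.

    Returns ((b, a, i) at t_end, peak open-state population).
    dt is kept well below 1/K_AB0 so that the explicit scheme stays stable.
    """
    b, a, i = 0.0, 0.0, 1.0  # start with every receptor in I0
    peak_a = 0.0
    for _ in range(int(round(t_end / dt))):
        flux_ia = K_IA0 * i - K_AI0 * a  # net flux I0 -> A0
        flux_ab = K_AB0 * a - K_BA0 * b  # net flux A0 -> B0
        i -= flux_ia * dt
        a += (flux_ia - flux_ab) * dt
        b += flux_ab * dt
        peak_a = max(peak_a, a)
    return (b, a, i), peak_a


(final_b, final_a, final_i), peak_open = simulate_recovery()
print(f"B0 after 2 s: {final_b:.4f}")
print(f"peak A0 during recovery: {peak_open:.2e}")
```

Because the exit rate from $A_0$ to $B_0$ is several thousand times larger than the entry rate from $I_0$, the open-state population in this reduced scheme never rises above a small fraction of a percent, while nearly all receptors reach $B_0$ within about 2 s.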
IV. KINETIC BASIS OF DOSE-RESPONSE CURVES

A. Dependence on the Desensitization Rate

The dose-response analysis has been widely used for characterization of the cooperativity and affinity ($EC_{50}$) of ligand-gated channels, but it is important to ascertain under what conditions such an equilibrium-based analysis is appropriate for a transient phenomenon. Therefore, simulations were performed with the complete four-state model to test this issue (Edelstein et al., 1996). The relative rate of the initial phase of desensitization, $A_2 \to I_2$, can result in systematic errors in the apparent values of the Hill coefficient, n, and $EC_{50}$, as described in Fig. 10. The errors are relatively minor for the value of ${}^{AI}k_2 = 20$ s$^{-1}$ (Table I), but for higher values of ${}^{AI}k_2$ significant distortions in the simulated dose-response curves were predicted. For example, the apparent values of
FIG. 10. Dose-response simulations. (a) Kinetic simulations for increasing concentrations of agonist in the concentration range $10^{-6}$ to $10^{-3}$ M, with the value of ${}^{AI}k_2 = 20$ s$^{-1}$ (Table I). Simulations for each concentration (in increments of $10^{0.2} = 1.58$ times the previous concentration) are presented as “Response” (calculated as $1 - \bar{A}_{norm}$) vs. log time (in seconds), and the minimum of each curve, corresponding to the maximal channel opening at that concentration, is marked by a filled circle. (b) Dose-response curves for the simulation in (a). The predicted response curve (presented as the continuous dotted line) is described by $\bar{A}_{norm}$, the theoretical equilibrium for the normalized fraction of A in a system limited to the B and A states: […] where $\alpha_A$ is the concentration of ligand [X] normalized to the affinity of the A state, $\alpha_A = [X]/K_A$, and the values of the relevant parameters are from Table I. The individual points for maximal channel opening at each concentration from (a) are transferred to give the corresponding filled circles. In addition, the filled squares correspond to simulations carried out with ${}^{AI}k_2 = 200$ s$^{-1}$ (Edelstein et al., 1996).
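The equilibrium reference curve for a system limited to the B and A states can be sketched numerically. The block below is an illustration under stated assumptions, not the expression printed in the original legend: it uses the standard two-state MWC state function with two equivalent sites, rescales it between its unliganded and saturating limits, and takes $L_0$, c, and $K_A$ from Table I. The helper names are hypothetical.

```python
import math

# Table I values for the muscle receptor, B and A states only.
L0 = 2.0e4     # [B0]/[A0]
C = 1.08e-3    # affinity ratio K_A / K_B
K_A = 5.7e-8   # dissociation constant of the A state (M)
N_SITES = 2


def a_bar(x):
    """Equilibrium fraction in the A state (assumed two-state MWC form)."""
    alpha = x / K_A
    open_term = (1.0 + alpha) ** N_SITES
    closed_term = L0 * (1.0 + C * alpha) ** N_SITES
    return open_term / (open_term + closed_term)


A_MIN = 1.0 / (1.0 + L0)                 # limit without ligand
A_MAX = 1.0 / (1.0 + L0 * C ** N_SITES)  # limit at saturating ligand


def a_norm(x):
    """State function rescaled to run from 0 to 1 (assumed normalization)."""
    return (a_bar(x) - A_MIN) / (A_MAX - A_MIN)


def find_ec50(lo=1e-10, hi=1e-1):
    """Bisect on log concentration for the half-maximal response."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if a_norm(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def hill_slope(x, rel_step=1e-4):
    """Slope of the Hill plot, d ln[y/(1-y)] / d ln x, by central difference."""
    def logit(conc):
        y = a_norm(conc)
        return math.log(y / (1.0 - y))
    x_hi, x_lo = x * (1.0 + rel_step), x * (1.0 - rel_step)
    return (logit(x_hi) - logit(x_lo)) / (math.log(x_hi) - math.log(x_lo))


ec50 = find_ec50()
n_hill = hill_slope(ec50)
print(f"EC50 = {ec50 * 1e6:.1f} uM, Hill coefficient = {n_hill:.2f}")
```

With these inputs the sketch yields an $EC_{50}$ close to 9 µM and a Hill slope close to 1.7, in line with the desensitization-free values quoted in the text for slow desensitization.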
the Hill coefficient and affinity at ${}^{AI}k_2 = 20$ s$^{-1}$, n = 1.6 and $EC_{50}$ = 10 µM, differ only slightly from the values observed with a desensitization rate sufficiently low (${}^{AI}k_2 = 2$ s$^{-1}$) to avoid any distortions: n = 1.7 and $EC_{50}$ = 9 µM. Hence, for both the n and $EC_{50}$ values, the apparent rate constants for the AChR (Table I) are such that errors are limited to ~10% (Fig. 10b). However, were the ${}^{AI}k_2$ value 10-fold faster (${}^{AI}k_2 = 200$ s$^{-1}$), the relevant values would be n = 1.3 and $EC_{50}$ = 20 µM (Edelstein et al., 1996), with major discrepancies between the apparent and true (desensitization-free) properties. Therefore, as long as desensitization is sufficiently slow so as not to introduce significant errors, dose-response analysis for the nAChR can provide a useful experimental protocol, and a number of investigations relying principally on such measurements have led to important observations concerning pleiotropic mutant phenotypes, as described in Section V. Nevertheless, it is important to emphasize that for a multistate system such as the ligand-gated channels, kinetic effects may prevent the distribution of conformational states from attaining their equilibrium positions.

B. Desensitization by Low-Concentration Prepulses
A prominent feature of ligand-gated channels is the desensitization by low “prepulse” concentrations of agonist that are insufficient to provoke significant channel opening but elicit desensitization when followed by a stronger test pulse (Katz and Thesleff, 1957; Rang and Ritter, 1970). By studying a range of prepulse concentrations and plotting the fraction of residual activity observed with the strong pulse, investigators have obtained desensitization curves with the midpoint defining an apparent inactivation constant, $IC_{50}$. Typically, the prepulse is applied for a duration of ~10 s (Franke et al., 1993). If equilibrium conditions prevailed, the $IC_{50}$ value obtained would be related to the dissociation constant for the high affinity of the desensitized state (Heidmann and Changeux, 1978). Therefore, to test whether the equilibrium assumption is reasonable for such an analysis, simulations were performed at very low prepulse concentrations of ligand. As presented in Fig. 11a, kinetic simulations reveal that for a low concentration such as 0.4 µM, progression through the B-A-I-D states is relatively slow and for a 10 s prepulse some I state appears, but only a small fraction of the D state that would be produced by a pulse sufficiently long (~1000 s) to reach the final equilibrium value. When a series of simulations at different concentrations is performed, the degree of desensitization after 10 s can be measured and compared to the equilibrium value of desensitization. In this way a
FIG. 11. Simulations of prepulse desensitization with low concentrations of agonist. Simulation of the populations of states for an agonist concentration of $4 \times 10^{-7}$ M. (a) Data presented as a function of time in seconds for the predominant molecular species, $B_0$, $I_2$, $D_1$, and $D_2$. (b) Inhibition-response curve to obtain apparent $IC_{50}$ values. The solid line shows values predicted by the allosteric theory for the normalized fraction of molecules in the D state ($\bar{D}_{norm}$) and presented as $1 - \bar{D}_{norm}$, where […] and $\alpha_D$ is the concentration of ligand [X] normalized to the affinity of the D state: $\alpha_D = [X]/K_D$. Using the values in Table I leads to an $IC_{50}$ value of 2.66 × 10^-[…] M. The points in (b) are from a series of simulations as in (a) for the fraction of activatable receptors remaining after a 10 s prepulse at different concentrations of ACh. The dashed curve through the points corresponds to an apparent $IC_{50}$ value of $1.2 \times 10^{-6}$ M (Edelstein et al., 1996).
hypothetical curve for the determination of $IC_{50}$ is produced for the equilibrium properties and compared to the simulated values with 10 s prepulses. These data, presented in Fig. 11b, show the systematic divergence of the two curves. The points obtained from the simulations for 10 s prepulses are considerably to the right of the equilibrium curve
and imply an apparent affinity for the D state (when $IC_{50}$ values are used) that is significantly weaker than the true equilibrium value (Edelstein et al., 1996).
V. MULTIPLE PHENOTYPES

A. A Generalized Allosteric Network

Point mutations within receptor subunit genes often result in “complex” and extremely pleiotropic phenotypes with, for instance, concomitant modifications of the apparent affinity for agonist, channel properties, and agonist vs. antagonist specificity (Revah et al., 1991; Bertrand et al., 1992; Devillers-Thiéry et al., 1992; Yakel et al., 1993; Langosh et al., 1994; Rajendra et al., 1994; Labarca et al., 1995). Following the discovery of these mutations, the interpretation of their complex phenotype in molecular terms became a challenging issue. For example, a single mutation in the M2 channel domain of the α7 nAChR, Leu-247 to Thr (Revah et al., 1991; Bertrand et al., 1992; Devillers-Thiéry et al., 1992), yields a receptor that is insensitive to the channel blocker QX222, has lost desensitization, and displays an apparent affinity for acetylcholine (ACh) up to 200-fold higher than for wild type. In addition, the mutant receptor exhibits two conducting states activated by high (the 40 pS state) vs. low (the 80 pS state) concentrations of ACh. Moreover, a competitive antagonist of the wild-type receptor, dihydro-β-erythroidine (DHβE), behaves on this mutant as a full agonist (with 10-fold higher apparent affinity than ACh) and exclusively activates the high-conductance state. In order to interpret such complex properties it is necessary to take into account the fact that mutations at several different positions along the primary sequence of receptor subunits may produce similar, although not identical, phenotypes.
For instance, shifts in the neurotransmitter dose-response curve are obtained by mutating amino acids contributing to either the ligand-binding domain (Schmieden et al., 1992; Galzi et al., 1991a; Tomaselli et al., 1991; O’Leary and White, 1992; Aylwin and White, 1994) or the ion channel domain (Revah et al., 1991; Bertrand et al., 1992; Devillers-Thiéry et al., 1992; Yakel et al., 1993; Langosh et al., 1994; Rajendra et al., 1994; Labarca et al., 1995), even though they are located 20-40 Å away from each other in nicotinic receptors (J. M. Herz et al., 1989). Moreover, mutations may alter the number and distribution of the multiple conducting states, as noted for the L247T mutation of the α7 nAChR, as well as for the hyperekplexia
mutations of the glycine receptor (Langosh et al., 1994; Rajendra et al., 1995; Lynch et al., 1997). A context for these phenomena has been provided (Galzi et al., 1996b) by noting that the four-state allosteric model can be extended to a generalized allosteric network, as summarized in Fig. 12. Receptor molecules are assumed to exist in several (at least three) discrete conformations, $S_i$, which correspond to thermodynamically stable states with defined tertiary and quaternary structures. These conformations are qualitatively described by a structural parameter $\Sigma_i$, and functionally defined as closed (but activatable), active (channel open), and desensitized (closed but refractory states). Each state is characterized by its affinity for the agonist ($K_i$) or other ligands, and its conductance ($\gamma_i$, in pS). The interconversion between any two conformational states $S_i$ and $S_j$ occurs freely with an allosteric equilibrium constant ${}^{ij}L = [S_i]/[S_j]$, and ligands stabilize the conformations to which they bind with higher affinity. One receptor oligomer, with a given subunit composition, has access to a unique set of conformational states, possibly including more than one conducting (Revah et al., 1991; Bormann et al., 1993) or desensitized (Heidmann and Changeux, 1980; Sakmann et al., 1980) state. Substituting one subunit for another, or mutating amino acids in one or more subunits, may alter the pattern of the conformational network by changing the intrinsic binding properties (“K” phenotype) or the conductance (“γ” phenotype) of one or more conformations, or by changing the
FIG. 12. The allosteric network of receptor molecules in multiple conformational states. Each conformation $S_i$ corresponds to a unique quaternary structure ($\Sigma_i$) with intrinsic binding properties ($K_i$) and conductance ($\gamma_i$). The interconversion between any two conformational states $S_i$ and $S_j$ is described by an allosteric equilibrium constant ${}^{ij}L = [S_i]/[S_j]$ (Galzi et al., 1996b).
equilibrium constants between conformational states (“L” phenotype). In addition, the number of conformational states may vary, i.e., certain conformations may become virtually inaccessible, or conversely, stable. For the sake of simplicity, in all cases considered both the wild-type and mutant receptors were assumed to interconvert to the same finite number of identical quaternary structures ($\Sigma$). Also, as kinetics of receptor activation and desensitization take place over significantly different timescales (desensitization is generally slow compared to activation, as discussed above in relation to Fig. 10), the conformational scheme used to describe receptor activation was, in a first approximation, reduced to only those interconverting states involved in the activation process (resting and active states). Taking into account the intrinsic properties of individual conformational states and their possibilities to isomerize to other conformational states (Fig. 12), three main classes of effects may be expected in such an allosteric system with increasing numbers of interconverting states (Galzi et al., 1996b): (1) the binding or “K” phenotype; (2) the conformational interconversion or “L” phenotype; and (3) the conductance or “γ” phenotype.
B. The K Phenotype

The K phenotype is assumed to result from mutations that selectively alter the intrinsic binding affinities of individual conformational states. In this context two possibilities may be envisioned. First, the affinity of each conformation changes but the affinity ratio (${}^{ij}c = K_j/K_i$) between conformations remains constant. The apparent affinity ($EC_{50}$) for response activation would then change with neither modifications of cooperativity (Hill coefficient) nor response amplitude. In other words, the wild-type and mutant dose-response curves are parallel. Second, the mutation selectively alters the affinity of certain states only, leading to changes in the affinity ratios (${}^{ij}c$). In this case, not only would the apparent affinity be affected, but also cooperativity and possibly response amplitude. Furthermore, as c increases, agonists may progressively become partial agonists or even competitive antagonists. Finally, for none of the K phenotypes would the spontaneous equilibrium between any states $S_i$ and $S_j$ be altered in the absence of ligand. A possible example of this phenotype may be considered on the basis of the amino acids Tyr-93, Trp-148, Tyr-190, and Tyr-198 identified by affinity and photoaffinity labeling of the ACh-binding site from the electric organ nicotinic receptor (Devillers-Thiéry et al., 1993). Substitution of their homologs to phenylalanine on the corresponding chick neuronal α7 residues Tyr-92, Trp-148, and Tyr-187 (Galzi et al., 1991a)
or on mouse muscle α1 subunits (Tomaselli et al., 1991; Aylwin and White, 1994) yields functional receptors, with reduced sensitivity to ACh but unchanged Hill coefficients and maximal current amplitudes. These alterations may be interpreted in terms of a K phenotype, with the intrinsic affinity of the activatable and active conformations being affected to the same extent. Indeed, simulations of α7 nicotinic receptor dose-response curves with changes in solely the K values for Y92F, W148F, and Y187F mutant receptors (Galzi et al., 1996b) fit the experimental data points and yield $EC_{50}$ values and Hill coefficients consistent with the experimentally determined values (Galzi et al., 1991a). Mutations in other parts of the extracellular domain of ligand-gated ion channels alter the pharmacological specificity in a different way. Mutation of Asp200 in muscle α1 (O'Leary and White, 1992) or Gln198 in neuronal nicotinic α3 (Galzi et al., 1992), as well as Ile-111 and Ala-212 in α1 glycine receptor subunits (Schmieden et al., 1992), affects the relative affinity and efficacy of distinct agonists. Mutation of Asp200 to Asn, in particular, converts the partial agonists TMA and PTMA into competitive antagonists (O'Leary and White, 1992), as expected for changes of intrinsic binding properties of only certain states within the network, i.e., altered c values in a K phenotype (Galzi et al., 1996b). Yet, uncertainties persist about this interpretation since the properties of these mutants may also be accounted for by an L phenotype. Additional experimental data are required to reach definitive conclusions.
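The statement that a K phenotype with constant affinity ratio gives parallel dose-response curves can be checked with a short derivation. The block below is a sketch that assumes the standard two-state MWC state function with n equivalent sites, restricted to the activatable (B) and active (A) states. The scaling factor $\lambda$ and the Hill coefficient symbol $n_H$ are introduced here for illustration only.

```latex
\[
\bar{A}(x) = \frac{(1 + x/K_A)^n}{L_0\,(1 + c\,x/K_A)^n + (1 + x/K_A)^n},
\qquad c = \frac{K_A}{K_B}.
\]
% K phenotype with constant c: both affinities change by the same factor.
\[
K_A' = \lambda K_A, \quad K_B' = \lambda K_B
\;\Longrightarrow\; c' = c, \quad L_0' = L_0,
\]
\[
\bar{A}'(x)
= \frac{(1 + x/\lambda K_A)^n}{L_0\,(1 + c\,x/\lambda K_A)^n + (1 + x/\lambda K_A)^n}
= \bar{A}(x/\lambda).
\]
% The whole curve is translated by \log\lambda on a log-concentration axis.
\[
EC_{50}' = \lambda\, EC_{50}, \qquad n_H' = n_H, \qquad \bar{A}'_{\max} = \bar{A}_{\max}.
\]
```

Under this assumption the mutant curve is the wild-type curve shifted along the log concentration axis, which is the behavior described for the Y92F, W148F, and Y187F mutants. When only one of the two affinities changes, c changes as well and the simple shift no longer holds.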
C. The L Phenotype

The L phenotype is assumed to result from mutations that selectively alter the equilibrium constant between two given interconvertible conformations. The intrinsic properties of each conformation, i.e., the microscopic binding constants and the state of channel activity, are further assumed to remain unchanged. If two states are considered, namely, an inactive (channel-closed) B state and an active (channel-open) A state, the fraction of receptor molecules spontaneously existing in the active state in the absence of ligand is described by ${}^{BA}L_0 = [B_0]/[A_0]$. Furthermore, regulation of channel opening by an agonist depends on its affinity for the active state, as compared to its affinity for the inactive state (${}^{BA}c = K_A/K_B$). Agonists are characterized by a small value of ${}^{BA}c$, partial agonists by a larger ${}^{BA}c$ value, and competitive antagonists by an even larger one. For an L phenotype, as ${}^{BA}L_0$ increases, agonists may progressively become partial agonists and competitive antagonists, as shown in Fig. 13. For decreasing ${}^{BA}L_0$ values, the reciprocal progression takes place, and, in addition, competitive antagonists may become partial agonists
FIG. 13. The L phenotype as illustrated with curves of $\bar{A}$ and $\bar{Y}$ as a function of log [X] for four combinations of ${}^{BA}L_0$ and ${}^{BA}c$ (Edelstein and Changeux, 1996): (a) high ${}^{BA}L_0$ and low ${}^{BA}c$; (b) high ${}^{BA}L_0$ and high ${}^{BA}c$; (c) low ${}^{BA}L_0$ and low ${}^{BA}c$; (d) low ${}^{BA}L_0$ and high ${}^{BA}c$. The values of ${}^{BA}L_0$ and ${}^{BA}c$ correspond to the data analyzed with a two-state model (Galzi et al., 1996b) for the nAChR α7, wild type (${}^{BA}L_0$ = 800,000) and the channel mutant V251T (${}^{BA}L_0$ = 20), with respect to the agonist ACh (${}^{BA}c$ = 0.1) and the partial agonist DHβE (${}^{BA}c$ = 0.5) on the basis of published experimental data (Devillers-Thiéry et al., 1993; Galzi et al., 1992). In addition the absolute values of the binding affinities are fixed by $K_A$ = 2.5 × 10^-[…] M for ACh, and $K_A = 3.5 \times 10^{-6}$ M for DHβE.
(intermediate ${}^{BA}c$ values) or remain competitive antagonists (large ${}^{BA}c$ values). Also, in equilibrium binding experiments, apparent affinities will be displaced more for ligands with small ${}^{BA}c$ values than for ligands with large ${}^{BA}c$ values. Furthermore, the model predicts that for very low values of ${}^{BA}L_0$, spontaneous stabilization in the active state may occur, yielding constitutively active mutants (spontaneous channel opening), a phenomenon that cannot be accounted for by the sequential-type model. In addition, at low ${}^{BA}L_0$ positive allosteric effectors of the wild type, which behave as very weak agonists, may become partial agonists of the mutant. Finally, changes in the ${}^{BA}L_0$ value will generally be accompanied by changes in cooperativity and maximal response amplitude.
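The way a change in ${}^{BA}L_0$ alone converts an antagonist into a partial agonist can be illustrated with the saturating limit of the two-state model. The sketch below uses the ${}^{BA}L_0$ and ${}^{BA}c$ values quoted for Fig. 13. The number of binding sites (five, as for a homopentameric receptor) is an assumption made here, and the function name is hypothetical.

```python
# Sketch: maximal fraction of receptors in the active state at saturating
# ligand for a two-state model, A_max = 1 / (1 + L0 * c**n).
# Assumption: n = 5 equivalent sites (homopentameric alpha7 receptor).

L0_WILD_TYPE = 8.0e5   # wild-type value quoted for Fig. 13
L0_MUTANT = 20.0       # V251T value quoted for Fig. 13
C_ACH = 0.1            # affinity ratio for ACh
C_DHBE = 0.5           # affinity ratio for DHbE


def max_open_fraction(l0, c, n_sites=5):
    """Fraction in the active state when every site is occupied."""
    return 1.0 / (1.0 + l0 * c ** n_sites)


results = {
    (receptor, ligand): max_open_fraction(l0, c)
    for receptor, l0 in (("wild type", L0_WILD_TYPE), ("V251T", L0_MUTANT))
    for ligand, c in (("ACh", C_ACH), ("DHbE", C_DHBE))
}

for (receptor, ligand), fraction in results.items():
    print(f"{receptor:9s} {ligand:5s} maximal active fraction = {fraction:.3g}")
```

With five sites assumed, DHβE stabilizes essentially no active state on the wild-type receptor but more than half of the mutant population, while ACh reaches a higher maximal response on the mutant than on the wild type. Both trends agree with the qualitative description of the L phenotype given in the text.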
A possible example of this phenotype arose from studies of chemical labeling of Torpedo nicotinic receptor with noncompetitive blockers, which led to the identification of amino acid rings from the M2 segment of all five subunits that contribute to the channel domain and are conserved in the family of nicotinic receptors (Devillers-Thiéry et al., 1992; Karlin, 1993). In the case of the α7 nicotinic receptor, the available data on the alterations of receptor properties that take place on substitution of the ring of Val-251 to Thr or of Thr-244 to Gln can be interpreted in terms of L phenotypes. Indeed, ACh dose-response curves can be simulated for wild-type and mutant receptors (Galzi et al., 1996b), with the single assumption that L values are high for the wild type (${}^{BA}L_0 = 8 \times 10^5$) and low for the mutants (${}^{BA}L_0 = 20$), corresponding (within the limits of precision of the experimental data) to the simulated curves presented in Fig. 13. Such simulations also account for the higher maximal amplitudes of the ionic response observed for these mutants (Devillers-Thiéry et al., 1992; Luetje et al., 1993). Furthermore, the competitive antagonist of the wild-type receptor, DHβE, with its specific binding K and ${}^{BA}c$ values (see details in the legend to Fig. 13), behaves as a competitive antagonist when the ${}^{BA}L_0$ value corresponds to the wild-type receptor, and as a partial agonist when the ${}^{BA}L_0$ value corresponds to the V251T or T244Q mutant receptor. An interesting comparison with the L phenotype for the nAChR is afforded by certain mutations in the channel domain of glycine receptors. Two mutations identified in M2 from the glycine receptor, R271L and R271Q (see Fig. 2d), cause the neurological disorder hyperekplexia (Shiang et al., 1993) by drastically reducing the apparent affinity of the receptor for the agonist glycine (Langosh et al., 1994; Rajendra et al., 1994, 1995).
These mutations, in addition, decrease the maximal amplitude of agonist-evoked currents, reduce the number of conducting states when present in the homooligomeric α1 receptors from 5 (wild-type) to 3 (R271L) or 1 (R271Q), and convert the partial agonists β-alanine and taurine into competitive antagonists. Accordingly, their phenotype appears as a “mirror image” of the phenotypic changes observed in the nicotinic α7 receptor L247T or V251T: the glycine receptor mutants would then resemble nAChR wild-type (Galzi et al., 1996b).
D. The γ Phenotype

The γ phenotype is assumed to result from changes of the state of activity of the ion channel (e.g., nonconducting to conducting) in one (or possibly more) conformation, with no alterations of the intrinsic binding parameters of each state (i.e., its pharmacological specificity)
nor of the equilibria (and kinetics) of interconversions. For example, it may be assumed that one desensitized conformation, which exhibits high affinity for agonists but has a closed channel, changes to the conducting state after mutation. In such a three-state model (one activatable and two conducting states), the expected changes of the physiological response properties are fourfold, as compared to the wild type: (i) desensitization of the response to agonists is reduced, since isomerization to a desensitized conformation is no longer accompanied by a closing of the ion channel; (ii) the apparent affinity for activation is higher for agonists, since desensitized conformations exhibit higher affinity for agonists; (iii) a new conducting state, in addition to the wild-type conducting state, may be observed; and (iv) the pharmacological profiles of the two conducting states differ. Agonists cause the opening of one conducting state at low concentration (the high-affinity, desensitized but conducting state) and of two conducting states at high concentration, whereas competitive antagonists, if stabilizing the desensitized conformation, will activate only the new conducting state at any concentration used. Analogies exist between the phenotypes of the M2 mutants L247T, on the one hand, and T244Q or V251T, on the other. Yet, if the L247T mutant receptor were to correspond to an L phenotype, a single change in L value would not fully account for the experimentally determined ACh and DHβE dose-response curves (Revah et al., 1991; Bertrand et al., 1992; Devillers-Thiéry et al., 1992). Indeed, with the L value yielding an appropriate EC50 for ACh, DHβE will not behave as a full agonist, but rather as a partial agonist as on the T244Q or V251T mutants; moreover, under no circumstances will the apparent affinity for DHβE be, as observed, higher than for ACh.
Rather, the occurrence, in addition, in L247T of two conducting states with distinct pharmacological profiles (Revah et al., 1991; Bertrand et al., 1992; Devillers-Thiéry et al., 1992), favors an interpretation in terms of the γ phenotype scheme. In such a case, simulated dose-response curves satisfactorily fit the experimental data, as shown in Fig. 14 for both ACh (ligand 2) and DHβE (ligand 1), assuming that one of the conducting states is identical to the wild-type conducting state (not stabilized by DHβE), while the other (assumed to correspond to a desensitized conformation of the wild type) binds DHβE with affinity higher than that for ACh (Galzi et al., 1996b).
E. Limiting Properties at Extremes of L

The range of properties resulting from the L phenotype arises from differences in the binding ($\bar{Y}$) and ionic response ($\bar{A}$) functions. As
[Figure 14: response curves for ligand 1 and ligand 2 plotted against log [ligand], with axis ticks from -8 to -2.]

FIG. 14. Theoretical dose-response relationships describing the γ phenotype. The curves are generated with a three-state model, B ⇌ A ⇌ I, assuming that either only the A conformation or both the A and I conformations contribute to the physiological response, for the curves as indicated. The equation for the case of both A and I as open states is […], where $\alpha = [X]/K_A$. The L values are, for the B ⇌ A transition, $^{BA}L_0 = 8 \times 10^{[\ldots]}$, and for the A ⇌ I transition, $^{AI}L_0 = 1.2 \times$ […]. The affinity values are, for ligand 1, $K_A = 2.5 \times$ […], $^{BA}c = 0.1$, $K_I = 10^{-[\ldots]}$, $^{AI}c = 0.4$; for ligand 2, $K_A = 3.5 \times 10^{-[\ldots]}$, $^{BA}c = 0.5$, $K_I = 3 \times$ […], $^{AI}c = 0.0857$. Intrinsic affinities increase from state B to A to I for both ligands. Ligand 1 is a competitive antagonist when the I state corresponds to a closed-channel state and becomes an agonist when the I state has an open channel. Ligand 2, which is an agonist in both cases, stabilizes one or two conducting states depending on the biological activity of the I conformation (Galzi et al., 1996b).
pointed out some 30 years ago (Rubin and Changeux, 1966; Changeux and Rubin, 1968), where it is possible to monitor $\bar{Y}$ and $\bar{A}$ separately, distinctive differences in the two functions may be observed, thereby constituting a diagnostic test of the two-state model. For very low values of L, a significant fraction of molecules is in the A state in the absence of ligand: the system may be qualified as hyperresponsive (Edelstein and Changeux, 1996). On addition of ligand, the curve for $\bar{A}$ remains above the curve for $\bar{Y}$ as a function of ligand binding, with the curve for $\bar{A}$ approaching saturation at ligand concentrations that give incomplete binding. For very high L values, the curve for $\bar{A}$ remains below the curve for $\bar{Y}$, with a maximal value of $\bar{A}$ < 1, even when all binding sites
are saturated: the system is hyporesponsive. At intermediate values of L, differences between $\bar{Y}$ and $\bar{A}$ may also occur, but they involve more subtle distinctions in the shape of curves. The full extent of possible differences between $\bar{Y}$ and $\bar{A}$ as a function of the L value is presented in Fig. 15 for a protein with five sites (Edelstein and Changeux, 1996), such as the α7 nAChR (Palma et al., 1996). For ligand binding, $\bar{Y}$ always varies from 0 to 1 and occurs, for a protein with A and B states, within the affinity limits of $\bar{Y}_A$ and $\bar{Y}_B$ that define the “ligand-binding range.” In contrast, for the state function, at the extremes of L, $\bar{A}$ does not vary between 0 and 1 with increasing ligand binding, but has a limited allosteric range, Q (Rubin and Changeux, 1966), as summarized in the lower portion of Fig. 16. Moreover, as noted in Fig. 15, the “ligand-response range” can extend significantly beyond the limits of $K_A$ and $K_B$. For a protein with five sites, the apparent affinity, EC50 (as reflected by $[X]_{50}$, the concentration of ligand at the midpoint of the curve), may be as much as 6.7 times lower than $K_A$ (the extreme hyperresponsive pattern) or 6.7 times higher than $K_B$ (the extreme hyporesponsive pattern). This distinction is relevant for α7 receptors, since for the wild type, according to the simulations described in Fig. 13, the EC50 value is ~5 times higher than the value of $K_B$ and thus close to the theoretical limit of 6.7 for a pentamer. Differences in the cooperativity of the binding and state functions also occur, as measured for example by the Hill coefficients, $n_{50}$ at $\bar{Y}$ = 0.5 and $n'_{50}$ at $\bar{A}'$ = 0.5 (Fig. 16), where $\bar{A}'$ is the normalized form of the response (Changeux and Rubin, 1968), as defined in the legend to Fig. 15. The maximum value for $n'_{50}$ is higher than for $n_{50}$, and at the extremes of L the value of $n_{50}$ falls to the limit of 1.0 (Rubin and Changeux, 1966), but the lower limit of $n'_{50}$ for a homopentamer (for $^{BA}c$ = 0.1) is $n'_{50}$ = 1.27 (Edelstein and Bardsley, 1997).
This value > 1 arises from the contributions of higher order reactions to the formation of molecules in the A state, as summarized in Fig. 17. In this context, the cooperativity of the dose-response curves of the α7 nAChR predicted by a two-state model, $n'_{50}$ = 1.27 (Fig. 13), and the value observed experimentally, n = 1.4 (Revah et al., 1991), imply that the system is near the lower theoretical limit for a pentamer.
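The limiting values quoted in this section can be checked numerically. The sketch below evaluates the two-state state function $\bar{A} = 1/[1 + {}^{BA}L_0(1 + {}^{BA}c\,\alpha)^N/(1 + \alpha)^N]$ for a homopentamer, rescales it to run from 0 to 1, and estimates the midpoint and the Hill coefficient at the midpoint by a finite difference. It assumes that the factor 6.7 corresponds to $1/(2^{1/N} - 1)$ with N = 5, and it uses a very large $^{BA}L_0$ (10^12) as a stand-in for the hyporesponsive extreme; the function names are hypothetical.

```python
import math

def state_function(alpha, L0, c, n_sites):
    """Fraction of molecules in the A state at alpha = [X]/K_A (two-state model)."""
    return 1.0 / (1.0 + L0 * ((1.0 + c * alpha) / (1.0 + alpha)) ** n_sites)

def normalized_response(alpha, L0, c, n_sites):
    """State function rescaled to run from 0 (no ligand) to 1 (saturation)."""
    a_min = 1.0 / (1.0 + L0)
    a_max = 1.0 / (1.0 + L0 * c ** n_sites)
    return (state_function(alpha, L0, c, n_sites) - a_min) / (a_max - a_min)

def midpoint_and_hill(L0, c=0.1, n_sites=5):
    """Return alpha at half-maximal normalized response and the Hill slope there."""
    lo, hi = 1e-9, 1e12
    for _ in range(200):                      # bisection on a logarithmic scale
        mid = math.sqrt(lo * hi)
        if normalized_response(mid, L0, c, n_sites) < 0.5:
            lo = mid
        else:
            hi = mid
    a50 = math.sqrt(lo * hi)

    def logit(alpha):
        y = normalized_response(alpha, L0, c, n_sites)
        return math.log(y / (1.0 - y))

    h = 1e-3                                  # step in ln(alpha) for the central difference
    hill = (logit(a50 * math.exp(h)) - logit(a50 * math.exp(-h))) / (2.0 * h)
    return a50, hill

limit_factor = 1.0 / (2.0 ** (1.0 / 5.0) - 1.0)   # limiting shift of the midpoint for N = 5
a50, hill = midpoint_and_hill(L0=1e12)            # assumed stand-in for "very high L"
frac_S5_open = 1.0 / (1.0 + 8e5 * 0.1 ** 5)       # fully liganded species, Fig. 17 values
frac_S4_open = 1.0 / (1.0 + 8e5 * 0.1 ** 4)       # species with four ligands bound
print(f"limit factor = {limit_factor:.2f}")
print(f"[X]50/K_B = {a50 * 0.1:.2f}, Hill coefficient = {hill:.2f}")
print(f"fraction open: S5 = {frac_S5_open:.2f}, S4 = {frac_S4_open:.2f}")
```

With $^{BA}c$ = 0.1 the midpoint at high L falls somewhat below the limiting factor of 6.7, since that limit is approached only as $^{BA}c$ becomes very small.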
VI. DEDUCTIONS FROM SINGLE-CHANNEL MEASUREMENTS

A. Kinetic Consequences of Mutant Phenotypes

Several classes of congenital myasthenic syndromes have been described involving specific components of the neuromuscular junction
FIG. 15. The state and binding functions and their cooperativity for a homopentamer with B and A states. The allosteric functions $\bar{A}$ and $\bar{Y}$ are presented as a function of $[X]/K_A$ for a series of values of the allosteric parameter, $^{BA}L_0$. All curves are calculated with $^{BA}c$ = 0.1; $\bar{Y}$ is determined with the equation

$$\bar{Y} = \frac{\alpha(1 + \alpha)^{N-1} + {}^{BA}L_0\,{}^{BA}c\,\alpha(1 + {}^{BA}c\,\alpha)^{N-1}}{(1 + \alpha)^{N} + {}^{BA}L_0(1 + {}^{BA}c\,\alpha)^{N}}$$

where N is the number of ligand-binding sites and α is the concentration of ligand normalized to the affinity of the A state: $\alpha = [X]/K_A$. For $\bar{Y}$ the apparent affinity, $[X]_{50}$ (defined as the value of [X] at $\bar{Y}$ = 0.5), occurs between the limits $K_A$ and $K_B$ corresponding, respectively, to the pure A state ($\bar{Y}_A$) at the low $^{BA}L_0$ extreme and the pure B state ($\bar{Y}_B$) at the high $^{BA}L_0$ extreme. These limits constitute the "ligand-binding range." The equation for $\bar{Y}$ reduces at very low $^{BA}L_0$ to $\bar{Y}_A = 1/(1 + K_A/[X])$ and at very high $^{BA}L_0$ to $\bar{Y}_B = 1/(1 + K_B/[X])$. The limits for binding are thus independent of N and reflect only the intrinsic binding constants of the B and A states. For the state function, the curves are described by the equation

$$\bar{A} = \frac{1}{1 + {}^{BA}L_0(1 + {}^{BA}c\,\alpha)^{N}/(1 + \alpha)^{N}}$$

which predicts variations between $\bar{A}_{min}$ in the absence of ligand and $\bar{A}_{max}$ at saturating ligand, where $\bar{A}_{min} = 1/(1 + {}^{BA}L_0)$ and $\bar{A}_{max} = 1/(1 + {}^{BA}L_0[{}^{BA}c]^{N})$. The low $^{BA}L_0$ and high $^{BA}L_0$ limits of the midpoints of $\bar{A}$ are set, respectively, by $[X]_{50} = K_A(2^{1/N} - 1)$ and $[X]_{50} = K_B/(2^{1/N} - 1)$, constituting the "ligand-response range" that considerably exceeds the "ligand-binding range." Adapted from Edelstein and Changeux (1996).
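[Figure 16: plots against $[X]_{50}/K$ on a logarithmic axis with ticks at 0.1, 1.0, 10, and 100; one panel is labeled "Allosteric".]

FIG. 16. […] 1 at the extremes, as described in Fig. 17. Adapted from Edelstein and Changeux (1996).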
(Engel, 1993), including a series of point mutations in the nAChR (Léna and Changeux, 1997a), most of which are hyperresponsive. Site-directed mutations of the α7 neuronal nAChR had previously permitted identification of residues in the M2 transmembrane segment where substitution of the wild-type residue produces dramatic increases in the sensitivity to ACh (Revah et al., 1991; Galzi et al., 1992; Devillers-Thiéry et al., 1992), also known as "gain of function" (Treinin and Chalfie, 1995). Pathological myasthenic syndromes have also been reported for two
FIG. 17. Receptors at various degrees of saturation during ligand binding. At very high values of L, virtually all channel opening coincides with formation of the fully liganded species, $S_5$. [For $^{BA}L_0$ = 800,000 and $^{BA}c$ = 0.1, the fraction of $S_5$ in the A state is given by $1/(1 + {}^{BA}L_0\,{}^{BA}c^{5}) = 0.11$, whereas the fraction of $S_4$ in the A state is only 0.01.] At the point corresponding to $\bar{A}'$ = 0.5 (where the dashed horizontal and vertical lines cross), the observed reaction for the formation of receptor species with 5 ligands ($S_5$) can be represented by the linear sum of the reactions $S_4 \to S_5$, $S_3 \to S_5$, $S_2 \to S_5$, etc., each with increasing reaction order. As a result the overall cooperativity at 50% is given by the equation

$$n'_{50} = 2([1]S_4 + [2]S_3 + [3]S_2 + [4]S_1 + [5]S_0)$$

where the numbers in the brackets correspond to the order of the reaction for formation of $S_5$ from $S_i$, and where $S_i$ is the fraction of molecules with i ligands bound (Edelstein and Bardsley, 1997). In this case, the value of $n'_{50}$ = 1.27 arises from 2([1]0.37 + [2]0.11 + [3]0.016), where the values after the integers in brackets correspond to the fractional population of the species $S_4$, $S_3$, and $S_2$ at their values along the dashed vertical line and the terms for $S_1$ and $S_0$ do not contribute significantly. For the case of $^{BA}c$ < 0.1, the lower limit of cooperativity is $n'_{50}$ = 1.29.
mutants that provoke “loss of function” (Ohno et al., 1996, 1997). The myasthenic mutant εT264P (Ohno et al., 1995) is of particular interest, since it displays high affinity that could contribute to excess calcium uptake, with pathological consequences. When expressed in HEK (human embryonic kidney) cells, receptors with this mutation exhibit spontaneous channel opening and a trimodal distribution of open times (Ohno et al., 1995). Since no mechanistic interpretations were provided for this mutant, these data were examined to test whether the allosteric
model could provide an explanation of these properties. Therefore, simulations based on the data reported with εT264P were carried out for an ACh concentration of $3 \times 10^{-7}$ M and compared to wild-type human muscle receptors expressed in HEK cells, as presented in Fig. 18. Although a complete kinetic analysis was not included in the initial description of the εT264P receptors, the simulation presented in Fig. 18 corresponds approximately to the properties described for an ACh concentration of 0.3 μM and is contrasted to the wild-type human muscle nAChR. An adequate fit of the mutant data was achieved by modifying the interconversion rates between states based on lowering the wild-type value of $^{BA}L_0 = 9 \times 10^{[\ldots]}$ to a value for the εT264P mutant of $^{BA}L_0 = 100$, which markedly facilitates the B → A transitions (Edelstein et al., 1997a). The result is a substantial increase in the sensitivity to ligand, such that at $3 \times 10^{-7}$ M ACh the probability of opening increases to $P_{open}$ = 0.22 for the εT264P mutant, compared to the wild-type value of $P_{open}$ = […] at the same concentration (Fig. 18). For εT264P receptors, the assumption of a diminished value of $^{BA}L_0$ but of normal ligand-binding constants leads to the prediction of three distinct peaks in the dwell time profiles of opening events (Fig. 19b), in contrast to a single peak for wild-type receptors (Fig. 19a). For the latter, the predictions of the allosteric model are in excellent agreement with the data points (.) obtained from the kinetic constants reported to describe an extensive data set for wild-type receptors expressed in HEK cells (Milone et al., 1997). The simulations demonstrate that for wild-type receptors the allosteric model can represent patch-clamp data in as satisfactory a manner as can the sequential-type models. Whether critical experiments can be designed to distinguish between the models remains to be ascertained. The principal difference concerns channel opening in the absence of ligand.
While such events are predicted only by the allosteric model, for wild-type receptors they are expected to be rare (~1/15 s) and rapid (~5 μs), and may therefore escape detection. For the mutant, the trimodal pattern predicted by the model is compared to the three peaks reported for recombinant εT264P receptors (Ohno et al., 1995) and represented in Fig. 19b by the individual points (*) calculated by summing the three experimentally observed phases and scaling the total number of events to minimize amplitude differences with the dwell time peaks predicted by the allosteric-type model. This initial fitting gives a reasonably satisfactory agreement between theory and experiment, considering the difficulties in extracting the parameters from the three overlapping experimental curves and the relatively limited quantity of data reported (Ohno et al., 1995). Because the change in $^{BA}L_0$ results in a low value of the $^{AB}k_2$ rate
[Figure 18: simulated traces of molecular species and channel openings (closed/open) for (a) wild type and (b) εT264P; scale bar 0.1 s.]

FIG. 18. Stochastic simulations of conformational transitions between the B and A states at various degrees of ligand binding. Passages among the possible molecular species and the corresponding channel-opening events are shown for human wild-type receptors in (a) and for the myasthenic mutant εT264P in (b). The simulations were conducted for a ligand concentration of $3 \times 10^{-7}$ M (Edelstein et al., 1997a) using the following parameters: all ligand "on" rates were set at $1 \times 10^{[\ldots]}$ M$^{-1}$ s$^{-1}$; ligand "off" rates were, for the wild type, $^{B}k_{off} = 1.65 \times 10^{[\ldots]}$ s$^{-1}$ and $^{A}k_{off}$ = 0.1 s$^{-1}$, and for the mutant εT264P, $^{B}k_{off} = 7.56 \times 10^{5}$ s$^{-1}$ and $^{A}k_{off}$ = 4.4 s$^{-1}$. Interconversion rates were, for the wild type, $^{BA}k_0 = 2.06 \times 10^{-[\ldots]}$ s$^{-1}$, $^{BA}k_1$ = 3.08 s$^{-1}$, $^{BA}k_2 = 4.60 \times 10^{[\ldots]}$ s$^{-1}$, $^{AB}k_0 = 1.86 \times 10^{5}$ s$^{-1}$, $^{AB}k_1 = 1.68 \times 10^{[\ldots]}$ s$^{-1}$, $^{AB}k_2 = 1.52 \times 10^{5}$ s$^{-1}$, and for the mutant εT264P, $^{BA}k_0$ = 65.8 s$^{-1}$, $^{BA}k_1 = 3.94 \times 10^{[\ldots]}$ s$^{-1}$, $^{BA}k_2 = 2.40 \times 10^{5}$ s$^{-1}$, $^{AB}k_0 = 6.58 \times 10^{3}$ s$^{-1}$, $^{AB}k_1 = 2.29 \times 10^{2}$ s$^{-1}$, $^{AB}k_2$ = 8.0 s$^{-1}$. For wild-type nAChR the parameters were based on measurements of human muscle receptors expressed in 293 HEK cells (Milone et al., 1997; Wang et al., 1997). The values for $^{AB}k_2$ and $^{BA}k_2$ were set by the published values of $\alpha_2$ and $\beta_2$, respectively, with the other interconversion rates calculated using linear free energy relations on the basis of the value of the transition state parameter, p = 0.2 (Edelstein et al., 1996). Values for the A state were obtained by principles of microscopic reversibility. The ratio $^{AB}k_0/^{BA}k_0$ yields a value of the primary constant for the allosteric transition, $^{BA}L_0 = 9 \times 10^{[\ldots]}$ for the wild-type nAChR. For the εT264P mutant, $^{AB}k_0$ and $^{AB}k_2$ were calculated using the $\tau_0$ and $\tau_2$ values reported for 0.3 μM ACh (Ohno et al., 1995). For $^{AB}k_1$, the value derived from $\tau_1$ (see the legend to Fig. 19) was corrected to 229 s$^{-1}$, since according to linear free energy relations it should be intermediate between $^{AB}k_0$ and $^{AB}k_2$ on a logarithmic scale (Edelstein et al., 1996). For εT264P, the value of the transition state parameter is p = 0.45. From fitting the dwell time distributions, the value of $^{BA}L_0$ was set at 100 and the rates for $^{BA}k_0$, $^{BA}k_1$, and $^{BA}k_2$ were deduced according to linkage relations. While the change in the value of $^{BA}L_0$ dominated the properties of the εT264P mutant, the fit to the data was improved by additional changes in $^{B}k_{off}$ and $^{A}k_{off}$.
[Figure 19: dwell time histograms; panel (a) is labeled WILD TYPE.]

FIG. 19. Open-channel dwell times for stochastic simulations of single-channel events: (a) wild type; (b) mutant εT264P. The dwell time probability density functions, or pdf (Sigworth and Sine, 1987; Colquhoun and Hawkes, 1995), with the square root of the number of events vs. time on a logarithmic scale, present the predicted distribution of all species (thick line) and the contributions of the individual components (thin lines) for simulations as described in Fig. 18 for an ACh concentration of $3 \times 10^{-7}$ M. At this concentration, $P_{open}$ = […] for the wild-type nAChR and $P_{open}$ = 0.22 for the mutant εT264P, where $P_{open} = 1/(1 + {}^{BA}L_0\{([X]/K_B) + 1\}^{2}/\{([X]/K_A) + 1\}^{2})$, with the ligand concentration indicated by [X] and the equilibrium constants as defined in Fig. 5. For wild-type receptors in (a), the individual points are obtained from the kinetic rate constants presented in the legend to Fig. 8b of the article by Milone et al. (1997) to represent activation over a wide range of ACh concentrations. For εT264P receptors in (b), the individual points (-) are presented for the sum of the three phases of the experimentally observed open dwell times corresponding to the published values (Ohno et al., 1995) for 0.3 μM ACh of $\tau_0$ = 150 μs, $a_0$ = 0.67; $\tau_1$ = 1.8 ms, $a_1$ = 0.16; and $\tau_2$ = 69.5 ms, $a_2$ = 0.17. The contributions to the pdf for εT264P of $A_2$ channel opening involve passage to $B_2$, as well as passage to $A_1$, and the sum of both processes is indicated by $\Sigma A_2$ [in (b)]. The simulations correspond to 10 bins for each integer interval of log t, with peak heights based on the number of events occurring in a total time t of 1 s. Other details concerning the calculation of dwell time probabilities are presented in the legend to Fig. 22. From Edelstein et al. (1997a).
constant, only about twice the value of $^{A}k_{off}$ (see the legend to Fig. 18), a significant fraction of the opening events involving $A_2$ will terminate by dissociation of a ligand to yield $A_1$. The sum of events terminating by both $A_2 \to B_2$ and $A_2 \to A_1 \to B_1$ is indicated by the peak labeled $\Sigma A_2$ (Fig. 19b). On the basis of this analysis, the three peaks of the εT264P mutant receptors may be readily interpreted in terms of the allosteric model as reflecting non-, mono-, and biliganded receptors. The sequential-type models, which do not include an open state for nonliganded receptors, cannot satisfactorily represent such data. The sequential scheme contrasts with the basic postulate of the allosteric model according to which the conformational equilibrium is established prior to ligand binding (Fig. 4). While certain quantitative aspects thus remain to be clarified, the simulations and comparison to experimental data illustrate the key role predicted for the value of $^{BA}L_0$ in determining the relation between binding and ionic events. On the basis of the available data, it may therefore be tentatively concluded that the altered properties of εT264P receptors represent an L phenotype (see Section V,C, above). Altered desensitized states may also contribute significantly to mutant phenotypes, as deduced for the myasthenic mutant identified in the region of the agonist binding site, αG153S (Sine et al., 1995). This position occurs in the loop B region that has been found to play a critical role in the ligand-binding properties of the desensitized states, while the loop C region participates in agonist selectivity (Corringer et al., 1997), where A, B, and C refer to the three regions identified in specific labeling experiments (Devillers-Thiéry et al., 1992).
The distinctions in the fine structure of the agonist-binding sites concerning desensitization and agonist selectivity were deduced by comparing the α7 homooligomer and the α4β2 heterooligomer, which display striking differences in their apparent affinities for ACh and in their pharmacological specificity for ACh vs. nicotine. Sets of residues from the regions initially identified within the agonist-binding site of the α4 subunit were introduced into the homooligomeric α7-V201-5HT3 chimera, which carries the intact α7 agonist-binding site (Eiselé et al., 1993). Introduction of the α4 residues 151-155 of loop B produced an approximately 100-fold increase in the apparent affinity for both ACh and nicotine in equilibrium binding measurements, whereas electrophysiological recordings revealed a much smaller increase (3- to 4-fold) in the apparent affinity for activation. These observations are consistent with the notion that the residues mutated alter the transition to the desensitized state. In contrast, introduction of the α4 residues 183-191 (from loop C) into α7 selectively increased the apparent affinities for binding and activation by ACh, resulting in a receptor that no longer displays differences in the responses
to ACh and nicotine, demonstrating the importance of the C loop in agonist selectivity (Corringer et al., 1997).
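The stochastic simulations described in this section (Figs. 18 and 19) follow passages among the molecular species $B_0$, $B_1$, $B_2$, $A_0$, $A_1$, and $A_2$. The sketch below is a minimal Gillespie-type simulation of such a scheme for two equivalent sites; it is not the program used for the published simulations. All rate constants are hypothetical round numbers, chosen only so that the linkage relation $^{BA}L_i = {}^{BA}L_0\,{}^{BA}c^{\,i}$ holds, and the function names are assumptions. The fraction of time spent in A states is compared with the equilibrium expression $P_{open} = 1/(1 + {}^{BA}L_0\{([X]/K_B) + 1\}^{2}/\{([X]/K_A) + 1\}^{2})$.

```python
import random

def build_rates(x, k_on, koff_B, koff_A, k_BA, k_AB):
    """Transition table for states (conformation, ligands bound), two equivalent sites."""
    rates = {}
    for conf, koff, k_iso, other in (("B", koff_B, k_BA, "A"), ("A", koff_A, k_AB, "B")):
        for i in range(3):
            moves = [((other, i), k_iso[i])]                       # conformational change
            if i < 2:
                moves.append(((conf, i + 1), (2 - i) * k_on * x))  # ligand binding
            if i > 0:
                moves.append(((conf, i - 1), i * koff))            # ligand dissociation
            rates[(conf, i)] = moves
    return rates

def simulate(rates, start, n_events, seed=0):
    """Run n_events transitions; return the fraction of time spent in A (open) states."""
    rng = random.Random(seed)
    state, t_total, t_open = start, 0.0, 0.0
    for _ in range(n_events):
        moves = rates[state]
        total = sum(rate for _, rate in moves)
        dwell = rng.expovariate(total)          # exponentially distributed dwell time
        t_total += dwell
        if state[0] == "A":
            t_open += dwell
        pick = rng.random() * total             # choose the next state in proportion to its rate
        for target, rate in moves:
            pick -= rate
            if pick <= 0.0:
                break
        state = target
    return t_open / t_total

# Hypothetical parameters (not the values of Fig. 18): L0 = 100, c = K_A/K_B = 0.01
x, k_on = 1e-6, 1e8                              # M, M^-1 s^-1
koff_B, koff_A = 1000.0, 10.0                    # s^-1, so K_B = 1e-5 M and K_A = 1e-7 M
k_BA, k_AB = (1.0, 10.0, 100.0), (100.0, 10.0, 1.0)   # s^-1; k_AB[i]/k_BA[i] = L0 * c**i

rates = build_rates(x, k_on, koff_B, koff_A, k_BA, k_AB)
p_sim = simulate(rates, ("B", 0), 200_000)
K_B, K_A, L0 = koff_B / k_on, koff_A / k_on, k_AB[0] / k_BA[0]
p_eq = 1.0 / (1.0 + L0 * ((x / K_B + 1.0) / (x / K_A + 1.0)) ** 2)
print(f"simulated P_open = {p_sim:.3f}, equilibrium P_open = {p_eq:.3f}")
```

Recording the individual dwell times in A states, rather than only their sum, would give the raw material for dwell time histograms of the kind shown in Fig. 19.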
B. Single Ionic Events vs. Single Ligand-Binding Events in Relation to Binding-Site Nonequivalence

Single-channel measurements on muscle nAChR have made a profound contribution to the understanding of these receptors since they provide high temporal resolution and the advantage of observations at the level of individual molecules (Neher and Sakmann, 1976; Sakmann et al., 1980; Colquhoun and Sakmann, 1985). However, to the present time, the linked events of ligand binding have only been inferred, indirectly, from single-channel recordings, since parallel observations on binding steps have not been possible. As a result, considerable ambiguity exists concerning the degree of equivalence of the two ligand-binding sites. Two equivalent sites were used in the sequential-type scheme to model single-channel measurements (for reviews, see Lingle et al., 1992; Edmonds et al., 1995), as well as in the four-state allosteric model (Edelstein et al., 1996). In contrast, in a number of other studies with muscle nAChR, marked apparent differences (up to 2 orders of magnitude) in the affinities of the two ligand-binding sites, such as may result from specific differences for binding sites at the α-γ and α-δ interfaces (see Section I,B), have been used to develop alternative interpretations of experimental data (Jackson, 1988; Sine et al., 1990, 1995; Chen et al., 1995; Zhang et al., 1995; Akk et al., 1996; Ohno et al., 1996; Milone et al., 1997). Moreover, a wide range of magnitudes for the differences in the kinetic parameters of the two sites for wild-type receptors have been assigned in these reports. Species differences and dependence on expression systems may in part explain such variability in the nonequivalence of the two binding sites (Edmonds et al., 1995), but uncertainties remain concerning the intrinsic functional properties of the two sites.
Indeed, widely different rate constants can provide a satisfactory apparent description of the same data, since the properties of human muscle nAChR expressed in HEK cells have been interpreted in terms of a rather wide range of ligand-binding affinities (Sine et al., 1995; Ohno et al., 1996; Milone et al., 1997; Wang et al., 1997), varying from a 350-fold difference for the affinity of the two sites in the B state (Sine et al., 1995) to identical affinities for the two sites (Wang et al., 1997). It is clear that independent binding measurements are needed to overcome these ambiguities, and developments in the field of fluorescence correlation spectroscopy (Eigen and Rigler, 1994; Rauer et al., 1996; Edman et al., 1996; Schwille et al., 1997) now place such measurements in the realm of possibilities
for the near future. Therefore, simulations were performed in order to study what additional insights would be available with measurements that simultaneously follow both single ionic and binding events (Edelstein et al., 1997b). In this respect, theory has preceded experiment but has provided a stimulus for the necessary technical advances. The allosteric model was modified to incorporate two nonequivalent sites as described in Fig. 20. When the model is evaluated in stochastic simulations, trains of molecular forms are generated that vary with respect to the conformational state and/or the degree of binding site occupancy (Fig. 21a). Each change in the number of ligands bound is scored as a binding event (Fig. 21b), and each transition to an A-state molecular species is scored as an ionic event (Fig. 21c). Hence, Figs. 21b and c correspond to measurements that are expected to be produced experimentally in joint single-binding and single-channel recordings, respectively. These stochastic simulations extending over 0.5 s only partially illustrate the behavior of the system. A more complete description is provided by the probabilities of events for each time interval (bin
FIG. 20. Subunit structure and ligand binding sites at the α-γ and α-δ interfaces within the B state. Receptors with the ligand occupying the higher affinity site for agonist (α-δ) are designated by the subscript H and those with the ligand occupying the lower affinity site for agonist (α-γ) by the subscript L. Equilibrium constants are defined by the ratio of the corresponding "on" rates (k) and "off" rates (k'), e.g., $K_B = {}^{B}k'/{}^{B}k$. In the framework of a two-site protein with ligand binding characterized by two apparent ligand-binding dissociation constants, $K_1$ and $K_2$, if $K_{B(H)} \ll K_{B(L)}$, then $K_1 = K_{B(H)}$ and $K_2 = K_{B(L)}$. Where differences between $K_{B(H)}$ and $K_{B(L)}$ are smaller, the exact values are $K_1 = K_{B(H)}K_{B(L)}/(K_{B(H)} + K_{B(L)})$ and $K_2 = K_{B(H)} + K_{B(L)}$. For identical sites, where $K_B = K_{B(H)} = K_{B(L)}$, the values of $K_1$ and $K_2$ are set by $K_1 = K_B/2$ and $K_2 = 2K_B$. The equilibrium constants are defined by the corresponding rates: $K_{B(H)} = {}^{B}k'_H/{}^{B}k_H$ and $K_{B(L)} = {}^{B}k'_L/{}^{B}k_L$. Similar definitions may be applied to the other states to incorporate nonequivalence of their binding sites. (Edelstein et al., 1997b.)
[Figure 21: simulated traces for (a) molecular species, (b) binding events, and (c) ionic events (closed/open); scale bar 0.05 s.]

FIG. 21. Stochastic simulations of ligand-binding and conformational transitions for a receptor with two nonequivalent sites: (a) passages among all possible molecular species; (b) passages scored as binding events; (c) passages scored as ionic events. For the species with one molecule of agonist bound, its presence on the high- or low-affinity site is noted, respectively, by H or L in the subscript, e.g., $B_{1(H)}$ or $B_{1(L)}$ for the B state. The simulations were conducted for a ligand concentration of $2 \times$ […] M for nonequivalent ligand-binding sites using the following parameters derived from the data of Jackson (1988), with corrections incorporated (Jackson, 1993), and small adjustments made to permit agreement with the model based on linear transition state theory: $k = 1 \times 10^{8}$ M$^{-1}$ s$^{-1}$ for all sites except $B_L$, for which $k = 0.05 \times 10^{8}$ M$^{-1}$ s$^{-1}$. The "off" rates (s$^{-1}$) are $^{B}k'_H$ = 500; $^{B}k'_L = 1.8 \times 10^{4}$; $^{A}k'_H$ = 2.5; $^{A}k'_L$ = 30; for the I state identical "off" rates for both sites were set at 4.0. The interconversion rates (s$^{-1}$) were $^{BA}k_0$ = 0.028; $^{BA}k_{1(H)}$ = 1.8; $^{BA}k_{1(L)}$ = 44; $^{BA}k_2$ = 2800; $^{AB}k_0$ = 5013; $^{AB}k_{1(H)}$ = 1604; $^{AB}k_{1(L)}$ = 670; $^{AB}k_2$ = 214. For the A ⇌ I transition, since no specific data are available on site nonequivalence, the values for equivalent sites (Table I) were utilized. (Edelstein et al., 1997b.)
width) sampled, leading to the “probability density function,” or pdf (Sigworth and Sine, 1987; Colquhoun and Hawkes, 1995). However, since ligand binding dwell times will be lengthened by multiple passages between two conformational states at the same degree of ligand saturation, the contributions to the total binding events of all such multiple passages (which for nAChR involve the A state) must be included, as illustrated in Fig. 22. This analysis permits the complexity of binding events to be anticipated and potentially to provide information on the
[Figure 22: dwell time profiles plotted against time on a logarithmic scale.]

FIG. 22. Consequences of multiple transitions between conformational states. The individual dwell time profiles are presented for $A_2$ binding events prolonged by passages to $B_2$. Each pdf is presented as the square root of the number of events vs. time on a logarithmic scale and is defined by $g(x) = \sum_j a_j g_0(z_j)$, where $a_j$ is the fractional amplitude of the jth component and $g_0(z_j) = \exp[z_j - \exp(z_j)]$, with $z_j = x - s_j$, $x = \ln t$, t = time in seconds, and $s_j$ = logarithm of the jth time constant (Sigworth and Sine, 1987). Since the number of events is proportional to t (the length of time examined), $f_j$ (the fractional concentration of the reacting component), and $1/\tau_j$ (the rate of the relevant reaction), the amplitude of the jth component is given by $a_j = (f_j t)/\tau_j$. The peak height for a specific class of events is given by $A_j = (f_j\,t\,dx\,r_j[1 - p_j]e^{-1})/\tau_j$, where dx is the interval of ln t used to set the width of the bins, $r_j$ is the relevant ratio of kinetic rate constants, $[1 - p_j]$ gives the fraction of events remaining after a series of passages to a neighboring state, each with a probability of $p_j$, and $e^{-1}$ corresponds to the maximum value of $g_0$, which occurs at the logarithm of $\tau_j$. For the simulations presented here dx = 0.23, corresponding to 10 bins for each integer interval of log t, with peak heights based on the number of events occurring in a total time t of 1 s. The term $r_j$ is calculated from the appropriate rate constants of alternative pathways. For example, each passage via $A_2$ may be terminated by a transition (to $I_2$ or $B_2$) or by a ligand dissociation; hence, the probability of a transition to $B_2$ will be given (for nonequivalent binding sites) by $r_j = {}^{AB}k_2/({}^{AB}k_2 + {}^{A}k'_H + {}^{A}k'_L + {}^{AI}k_2)$. Successive passages correspond to the series of distinct pdf curves presented here, with reduced probability and progressively longer characteristic values of the average $\tau_j$.
The probability for each successive passage to $A_2$ is diminished by a factor, $p_j = [{}^{BA}k_2/({}^{BA}k_2 + {}^{B}k'_H + {}^{B}k'_L)][{}^{AB}k_2/({}^{AB}k_2 + {}^{A}k'_H + {}^{A}k'_L + {}^{AI}k_2)]$. The sum of all such events is given by the series $\Sigma = 1 + p_j + p_j^2 + \ldots = 1/[1 - p_j]$, and the fraction of primary ligand-binding events without passage to another state is given by $[1 - p_j]$. The contributions of all prolongations are summed, added to the primary A-state binding events, and the totals are indicated by $\Sigma A_2$ in Fig. 23. (Edelstein et al., 1997b.)
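The log-time representation of dwell times used in these legends can be sketched in a few lines. The Python example below evaluates $g_0(z) = \exp[z - \exp(z)]$ and the sum over components, and the closed form of the series $1 + p + p^2 + \ldots$; it is an illustration only, not the published fitting procedure. The three components are the open-time values quoted for the mutant at 0.3 μM ACh in the legend to Fig. 19, used here as fractional amplitudes (an assumption about how they should be weighted), and the function names are hypothetical.

```python
import math

def g0(z):
    """Unit-amplitude dwell time component on a ln(t) axis; it peaks at z = 0."""
    return math.exp(z - math.exp(z))

def log_time_pdf(t, components):
    """g(x) at x = ln t for components given as (amplitude a_j, time constant tau_j in s)."""
    x = math.log(t)
    return sum(a * g0(x - math.log(tau)) for a, tau in components)

def prolongation_factor(p):
    """Closed form of 1 + p + p**2 + ... for a return probability 0 <= p < 1."""
    return 1.0 / (1.0 - p)

# Components (a_j, tau_j) from the legend to Fig. 19
components = [(0.67, 150e-6), (0.16, 1.8e-3), (0.17, 69.5e-3)]
dx = math.log(10.0) / 10.0      # bin width in ln t for 10 bins per decade of time

for a, tau in components:
    ordinate = math.sqrt(log_time_pdf(tau, components))   # square-root display scale
    print(f"tau = {tau:.2e} s: sqrt-scaled ordinate = {ordinate:.3f}")
print(f"dx = {dx:.2f}; prolongation factor for p = 0.5 is {prolongation_factor(0.5):.1f}")
```

Because each component peaks at the logarithm of its own time constant, well-separated time constants appear as distinct peaks on this axis, which is what makes the trimodal open-time distribution of the mutant visible.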
mechanism of signal transduction that would not be available from single-channel recordings alone. For example, it would be possible to resolve the contributions of the two potentially nonequivalent binding sites, as illustrated in Fig. 23. For simulated recordings of muscle AChR at low ligand concentrations, only minor differences are predicted for dwell time profiles of ionic
[Figure 23, panels (a)-(d): dwell time profiles for ionic events and binding events; horizontal axis: time, logarithmic scale.]
FIG. 23. Dwell time probability profiles for stochastic simulations of binding events and ionic events for nAChR at low ligand concentrations: (a) ionic events and (b) binding events for simulations based on data interpreted with equivalent sites; (c) ionic events and (d) binding events for simulations based on data interpreted with nonequivalent sites. The dwell times are presented as the total events, corresponding to simulated experimental measurements (thick lines), along with the underlying contributions of the individual components (thin lines). The simulations are based on the values in Table I and the legend to Fig. 21, with a ligand concentration of [X] = 0.3 μM in (a) and (b) and [X] = 1.7 μM in (c) and (d), corresponding in both cases to a probability of channel opening, $P_{open} = 0.002$, computed with the equation $P_{open} = 1/(1 + {}^{BA}L_2\,\{({}^{B}K_1\,{}^{B}K_2)/[X]^2 + {}^{B}K_2/[X] + 1\}/\{({}^{A}K_1\,{}^{A}K_2)/[X]^2 + {}^{A}K_2/[X] + 1\})$. Other details are as described in Fig. 22. (Edelstein et al., 1997b.)
events with parameters based on analyses with equivalent sites (Fig. 23a) vs. nonequivalent sites (Fig. 23c). At the concentrations of these simulations, ionic events are predicted to be rare, ~1/s (corresponding to a probability of channel opening of $P_{open}$ = 0.002), whereas binding events are predicted to be at least an order of magnitude more abundant (Fig. 23b and d). With respect to the two principal models, for both equivalent (Fig. 23a) and nonequivalent sites (Fig. 23c), more ionic events (at shorter average times) are predicted by the allosteric model (thick lines) compared to the sequential model (thin lines, corresponding to A2, the only molecular species producing ionic events in the simple form of the sequential model). The predicted profiles for normal human receptors expressed in HEK cells (Fig. 19a) are dominated to a great extent by A2, since the data analyzed yielded a higher value of ${}^{BA}L_0$ coupled with a higher affinity of the A state (Edelstein et al., 1997a). With the parameters based on nonequivalent sites, the shoulder at longer times (~$10^{-2}$ s) on the profile of ionic events in Fig. 23c is slightly more pronounced and at all concentrations fewer events are predicted than for equivalent sites, due to a lower estimate for the value of ${}^{AB}k_2$. However, for single-channel experimental data with the usual limits of precision, it would be difficult to distinguish between the equivalent and nonequivalent interpretations. It can thus be concluded that a compensation of parameters leads to similar properties in the two cases. This compensation may explain why experimental single-channel recordings have been interpreted with equivalent sites in some cases and with nonequivalent sites in other cases (Lingle et al., 1992; Edmonds et al., 1995). As a result, meaningful conclusions about site equivalence cannot readily be drawn from single ion channel recordings alone. In contrast to the similarity of ionic events, larger differences in the simulated binding events are predicted for equivalent sites (Fig. 23b) vs. nonequivalent sites (Fig. 23d). In the latter case, binding of the first ligand to a receptor molecule is predicted to occur almost exclusively at the higher affinity site to generate B1(H). Since an off rate 16-fold lower than in the case of equivalent sites was deduced, the peak in the dwell time profile for binding events is predicted to lie at significantly longer times: 2 X s for nonequivalent sites (Fig. 23d) compared to 1.2 X s for equivalent sites (Fig. 23b).
Hence, if single ligand-binding events were measured experimentally, their dwell time profiles could provide a direct test of the extent of binding site nonequivalence. At higher ligand concentrations in the range of a probability of channel opening of $P_{open}$ = 0.5, the simulated ionic events arise mainly from transitions to A2 and are anticipated to be almost as abundant as the binding events. As a consequence, the predictions of the allosteric and sequential models are virtually identical for ionic events, and very similar binding events are also predicted for both equivalent and nonequivalent sites. Therefore, experiments at low ligand concentrations should be favored in order to distinguish between allosteric vs. sequential models and equivalent vs. nonequivalent sites. In comparison to these deductions for muscle-type nAChR, the situation for neuronal nAChR is more complicated because of the variety of neuronal forms and the differences between homopentameric and heteropentameric assemblies. For homopentameric neuronal nAChR composed
of α7, α8, or α9, the identical subunits presumably impose an equivalence of sites at each interface, with each subunit contributing a principal component at one interface and a complementary component at the other interface (Corringer et al., 1995). Other neuronal receptors with the heteropentameric composition α4β2 or α3β4 would presumably possess functionally equivalent ligand-binding sites at identical α4-β2 or α3-β4 interfaces, with the complementary component provided in these cases by the β-type subunit. More complicated combinations may be expected for receptors with subunit compositions α3α5β4 (Vernallis et al., 1993), α4α5β2 (Conroy et al., 1992; Ramirez-Latorre et al., 1996) or α6β2β3 (Le Novère et al., 1996). For these receptors, it has been suggested that α5 and β3 could exert a γ-like role (Le Novère and Changeux, 1995). However, these systems have not as yet been analyzed in sufficient detail to draw conclusions concerning the degree of functional nonequivalence of the ligand-binding sites.

VII. ALLOSTERIC EFFECTORS AND COINCIDENCE DETECTION

Various modifications of the functional properties of AChR are produced by noncovalent interactions with pharmacological agents and other modulators (Lena and Changeux, 1993; Changeux, 1996) and covalently by phosphorylation (Huganir and Greengard, 1990; Levitan, 1994). In general, when differences in current are observed in the presence and absence of a potential allosteric effector, it has not been determined whether affinity changes and/or specific conductance changes are responsible for the differences. In this respect the study by Mulle et al. (1992) is of particular interest since potentiation by calcium was shown to result from an increase (-%fold) in the frequency of channel opening. A modeling of these data shown in Fig. 24 indicates that the potentiation by calcium can be represented simply in terms of a 2.3-fold reduction in the allosteric constant, with all other parameters remaining unchanged from the values in Table I. The lower value of ${}^{BA}L_0$ is sufficient to account for the shift of the curve to the left and the augmented response (Fig. 24a). The effect of calcium is thus a true allosteric modulation, because the change in frequency of openings indicates an altered value of the rate constant ${}^{BA}k_2$ that corresponds closely to the change estimated for ${}^{BA}L_0$, since ${}^{BA}L_0\,{}^{BA}c^2 = {}^{AB}k_2/{}^{BA}k_2$. Calcium modulation of the L value for activation has also been observed for the nAChR-5HT3R chimera (Galzi et al., 1996a), and more general implications of calcium as an intracellular signal may be related to the fact that neuronal nAChRs as a class have a high relative permeability to calcium (Rathouz et al., 1996; Vernino et al., 1992).
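The leftward shift produced by a lower allosteric constant can be illustrated with a deliberately reduced sketch. The code below assumes a two-state (B and A only) Monod-Wyman-Changeux scheme with two equivalent sites, not the four-state model of the text; the two values of the allosteric constant are those quoted for Fig. 24, whereas the dissociation constants `K_B` and `K_A` and the threshold of 0.5 are hypothetical placeholders chosen only to give a readable curve.

```python
import math

L0_CALCIUM = 5.25e5      # allosteric constant with 4 mM calcium (Fig. 24 legend)
L0_NO_CALCIUM = 1.25e6   # allosteric constant without added calcium (Fig. 24 legend)

def p_open(x, L0, K_B=1e-4, K_A=1e-8):
    """Fraction of receptors in the open (A) state at ligand concentration x (M).
    K_B and K_A are hypothetical dissociation constants, not Table I values."""
    ratio = ((1.0 + x / K_B) / (1.0 + x / K_A)) ** 2
    return 1.0 / (1.0 + L0 * ratio)

def ec50(L0, lo=1e-9, hi=1e-2):
    """Concentration giving half of the response at the upper bound,
    found by bisection on a logarithmic concentration scale."""
    target = 0.5 * p_open(hi, L0)
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if p_open(mid, L0) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Coincidence detection in miniature: at one fixed agonist concentration,
# only the calcium-potentiated receptor crosses a hypothetical threshold.
x_test = 1e-5
threshold = 0.5
fires_with_calcium = p_open(x_test, L0_CALCIUM) > threshold
fires_without_calcium = p_open(x_test, L0_NO_CALCIUM) > threshold
```

With these placeholder constants the half-maximal concentration is lower for the smaller allosteric constant, and `fires_with_calcium` is true while `fires_without_calcium` is false, which mirrors the qualitative argument rather than reproducing the published curves.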
[Figure 24, panels (a)-(c). Panel (c) axes: normalized response (0 to 1) vs. frequency (0.1 to 100, logarithmic scale).]
FIG. 24. Data and theoretical curves for the potentiation of rat medial habenular neurons by external calcium. (a) Dose-response curves for the effect of ACh on currents in the presence of 4 mM calcium (squares) and in the absence of added calcium (circles) from the report by Mulle et al. (1992). The dashed lines are obtained with the four-state allosteric model using the parameters of Table I, with the exception of the values of ${}^{BA}L_0 = 5.25 \times 10^5$ for the curve that coincides with the squares (corresponding to the presence of 4 mM calcium) and ${}^{BA}L_0 = 1.25 \times 10^6$ for the curve coinciding with the circles (corresponding to no added calcium). (b) Illustration of coincidence detection. For kinetic simulations (time in seconds) with an agonist concentration of $2 \times 10^{-4}$ M, only the response on the left with ${}^{BA}L_0 = 5.25 \times 10^5$ (corresponding to 4 mM calcium), but not the response on the right with ${}^{BA}L_0 = 1.25 \times 10^6$ (corresponding to no calcium), reaches a hypothetical threshold for neuronal firing (dashed line). (c) A sliding threshold model. The normalized response is presented as a function of the stimulation frequency (Hz). For the system at ${}^{BA}L_0 = 5 \times 10^5$ it is assumed that the B ⇌ A equilibrium may be displaced by phosphorylation away from A and by dephosphorylation toward A. Entry of calcium proportional to the stimulation frequency produces activation of the relevant kinase and phosphatase, but with the former more sensitive and the latter more cooperative. In this case the response is calculated with the standard equation (see the legend to Fig. 15) for ${}^{BA}L_0 = 5 \times 10^5$ and a ligand concentration (0.17 mM) corresponding to 50% response in the absence of stimulation. The value of ${}^{BA}L_0$ is assumed to vary with frequency of stimulation $f$ and subsequent calcium influx according to the equation ${}^{BA}L_0' = {}^{BA}L_0\,(1 + \phi f^{n_\phi})/(1 + \varphi f^{n_\varphi})$, where $\phi$ and $\varphi$ are coupling coefficients for the activation of kinase or phosphatase by calcium and $n_\phi$ and $n_\varphi$ give the apparent Hill coefficients for the interactions, respectively. The biphasic depression-potentiation pattern requires $\phi > \varphi$ and $n_\phi < n_\varphi$. For the simulation presented here, since phosphorylation leads to inhibition, $\phi$ corresponds to the kinase and $\varphi$ corresponds to the phosphatase ($\phi = 1.2$ and $\varphi = 0.6$ in arbitrary units; $n_\phi = 1.2$ and $n_\varphi = 1.9$). Simulation for the mutant kinase with increased activity is obtained by setting $\phi = 3$. The simulation assumes that the receptor is partially phosphorylated in the absence of stimulation. (a) and (b) from Edelstein et al. (1996).

Allosteric effects of the type observed for potentiation by calcium have a number of implications for regulation of activity in the nervous system. At synapses, allosteric modulators could provide signals for coincidence detection (Heidmann and Changeux, 1982; Changeux and Heidmann, 1987). Such an effect for ACh and calcium may be visualized in Fig. 24b, assuming a constant threshold that is attained only in the presence of both ACh and calcium (on the left), but not with ACh alone (on the right). Since calcium may enter neurons via the open ligand-gated cation channel of various neuronal AChR or voltage-gated channels, regulatory effects may also be produced related to intracellular accumulation (such as phosphorylation) or extracellular depletion (such as habituation resulting from repeated stimulation by ACh). In addition to its role in regulating synaptic efficiency, diminished extracellular calcium concentrations might also participate in retrograde signaling. In general, calcium or other allosteric effectors might provide an alternative to the NMDA (N-methyl-D-aspartic acid) Ca2+/Mg2+ coincidence detection system (Wigstrom and Gustafsson, 1985), as discussed more fully in Section VIII,B. Similar effects that could play a role in coincidence detection by shifting the dose-response curve with a change in maximal response amplitude are observed for modulators such as steroids (Valera et al., 1992; Buisson and Bertrand, 1997). Allosteric effects may also involve modulation of desensitized states, for example, as is observed with substance P (Valenta et al., 1993) or following phosphorylation with cAMP-dependent protein kinase (Huganir et al., 1986). Overall, any substance that alters the preexisting equilibria between states can exert an effect on synaptic efficiency or coincidence detection (see Changeux, 1990, 1996). More complex effects involving a sliding threshold (Bienenstock et al., 1982) may also be formulated (Fig. 24c), and the general implications for sliding thresholds are presented in Section VIII,B.
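The sliding threshold of Fig. 24c can be sketched numerically from the relation given in its legend, in which the allosteric constant is multiplied by $(1 + \phi f^{n_\phi})/(1 + \varphi f^{n_\varphi})$. The code below is a simplified illustration, not the published simulation: it assumes a two-state receptor at a fixed ligand concentration giving 50% response at rest, so that the normalized response reduces to 1/(1 + multiplier). The coefficients are those quoted in the legend; the function names are hypothetical.

```python
def l_multiplier(freq, phi, n_phi, varphi, n_varphi):
    """Factor by which the allosteric constant changes at stimulation frequency freq.
    phi, n_phi describe the kinase; varphi, n_varphi describe the phosphatase."""
    return (1.0 + phi * freq ** n_phi) / (1.0 + varphi * freq ** n_varphi)

def normalized_response(freq, phi=1.2, n_phi=1.2, varphi=0.6, n_varphi=1.9):
    """Response of a two-state receptor set to 50% at rest (assumption):
    the closed/open ratio is 1 at freq = 0 and scales with the multiplier."""
    return 1.0 / (1.0 + l_multiplier(freq, phi, n_phi, varphi, n_varphi))

def crossover_frequency(phi, n_phi, varphi, n_varphi):
    """Frequency at which the multiplier returns to 1 (depression gives way
    to potentiation): phi * f**n_phi == varphi * f**n_varphi."""
    return (phi / varphi) ** (1.0 / (n_varphi - n_phi))

wild_type_threshold = crossover_frequency(1.2, 1.2, 0.6, 1.9)
mutant_threshold = crossover_frequency(3.0, 1.2, 0.6, 1.9)  # more active kinase
```

Under these assumptions the response falls below 0.5 at low frequencies and rises above it at high frequencies, and raising $\phi$ from 1.2 to 3 moves the crossover to a higher frequency, as described for the mutant kinase.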
VIII. GENERAL CONSIDERATIONS

A. Evaluation of Mechanistic Models

The simulations presented were used to determine conditions that could lead to experimentally testable differences in the predictions of the allosteric-type and sequential-type models. In addition, the simulations explored what additional distinctions could be furnished by measurements of single ligand-binding events, particularly for establishing the degree to which the binding sites are nonequivalent. The analysis of single binding events remains hypothetical at present compared to the single-channel measurements that have been extensively developed since the early applications of this technique (Neher and Sakmann, 1976; Sakmann et al., 1980). While a single-channel event can be recorded because of the amplification derived from the flux of thousands of ions, no such amplification is produced by a single binding event. However, recordings of single binding events are in principle feasible with fluorescence correlation spectroscopy (Elson and Magde, 1974; Magde et al., 1974; Ehrenberg and Rigler, 1974), in light of recent advances (Eigen and Rigler, 1994; Rauer et al., 1996; Edman et al., 1996; Schwille et al., 1997). As indicated by the simulations described here, such measurements could be utilized to evaluate critically the degree of nonequivalence of the ligand-binding sites. While single-channel data have been interpreted both in terms of equivalent and nonequivalent sites (Edmonds et al., 1995), compensating effects in the values of the parameters result in similar predictions in the two cases (see Fig. 23a vs. Fig. 23c). However, larger differences are predicted for the dwell time profiles of the binding events at low agonist concentration (Fig. 23b vs. Fig. 23d)
that should readily distinguish between the equivalent and nonequivalent schemes if the appropriate experiments could be conducted. Concerning the extent of nonequivalence of the binding sites, an allosteric-type model with substantial nonequivalence was proposed by Jackson (1988) to account for the data of wild-type muscle receptors. The extent of nonequivalence of the ligand-binding sites for the B state was evaluated entirely on the basis of kinetic measurements derived from single-channel recordings. Other measurements, particularly the degree of cooperativity in dose-response curves, can also provide relevant information, where a Hill coefficient ($n_H$) > 1 indicates positive cooperativity. For wild-type receptors, with the parameters deduced by Jackson for nonequivalent sites, and at ligand concentrations near the EC50, virtually all of the openings accompany binding of the second ligand molecule to the low-affinity site. As a result, the predicted dose-response curve is noncooperative (n ≈ 1.0). In contrast, for wild-type receptors with equivalent sites (described by the parameters in Table I), the predicted dose-response curve is strongly cooperative: n = 1.7 (Edelstein et al., 1996). Similarly, for the εT264P mutant receptors, the relevant parameters (Fig. 23) predict highly cooperative dose-response curves, with n = 1.8. Hence, there is a discrepancy between the noncooperative dose-response curve predicted for wild-type receptors with strongly nonequivalent sites and the widely observed cooperativity characterized by n > 1.5 (Changeux and Edelstein, 1994). One possible explanation for this discrepancy in the predicted and observed Hill coefficients is the presence of a channel block that modifies the response in such a way as to generate an “apparent” cooperativity (Sine et al., 1990; Forman and Miller, 1988).
Adding an ACh channel block with a dissociation constant of 1.3 mM (Colquhoun and Ogden, 1988) reduces the maximal amplitude of the response to about 50%. However, if the maximum ligand concentration examined for the dose-response curve is limited to ~1 mM, the reduced amplitude could be considered as a full response and scaled to 100%. In this case, the vertical “stretching” of the dose-response curve would result in an increase in the Hill coefficient to 1.5. Since the dose-response measurement is generally evaluated between arbitrary end points assumed to correspond to 0 and 100%, the presence of channel block could explain the degree of cooperativity generally observed, even if there is strong nonequivalence of the binding sites. Rapid desensitization that also impinges on the maximal amplitude of the response could in principle exert a similar effect (see Fig. 10), but measured desensitization rates following rapid ligand application are too slow to cause a significant attenuation of the amplitude (Franke et al., 1993).
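The way in which channel block and rescaling can raise an apparent Hill coefficient may be illustrated with a toy calculation. The sketch below is not the authors' simulation: it takes a strictly noncooperative response (n = 1) with a hypothetical half-maximal concentration of 0.22 mM, multiplies it by a simple open-channel block term with the 1.3 mM dissociation constant cited above, rescales the curve to its largest value at or below 1 mM, and measures the Hill slope at the half-maximal point.

```python
import math

K_BLOCK_MM = 1.3   # channel-block dissociation constant quoted in the text
EC50_MM = 0.22     # hypothetical value, chosen so that block halves the peak

def response(x_mM, with_block):
    """Noncooperative dose-response, optionally attenuated by channel block."""
    y = x_mM / (x_mM + EC50_MM)
    if with_block:
        y *= K_BLOCK_MM / (K_BLOCK_MM + x_mM)
    return y

def apparent_hill(with_block, x_max_mM):
    """Return (reference maximum, Hill slope at half of that maximum) when the
    curve is rescaled to its largest value at or below x_max_mM."""
    n = 4000
    lo, hi = math.log(1e-4), math.log(x_max_mM)
    grid = [math.exp(lo + (hi - lo) * i / n) for i in range(n + 1)]
    y_ref, x_peak = max((response(x, with_block), x) for x in grid)
    a, b = 1e-4, x_peak                     # bisection on the rising limb
    for _ in range(200):
        mid = math.sqrt(a * b)
        if response(mid, with_block) < 0.5 * y_ref:
            a = mid
        else:
            b = mid
    x_half = math.sqrt(a * b)

    def logit(x):
        r = response(x, with_block) / y_ref
        return math.log(r / (1.0 - r))

    h = 1e-4                                # step in ln(concentration)
    slope = (logit(x_half * math.exp(h)) - logit(x_half * math.exp(-h))) / (2 * h)
    return y_ref, slope

max_plain, n_plain = apparent_hill(False, 1e4)     # wide range, no block
max_blocked, n_blocked = apparent_hill(True, 1.0)  # block, range limited to 1 mM
```

In this toy case the blocked curve peaks near half of the unblocked maximum and its apparent Hill slope exceeds 1 even though the underlying binding is noncooperative; the exact value depends on the hypothetical `EC50_MM` and should not be read as the 1.5 obtained with the full model.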
B. Implications for Synaptic Plasticity

The allosteric model with preexisting equilibria between a minimum of four conformational states, B, A, I, and D, satisfactorily accounts for the kinetic properties of the AChR in vivo and in vitro. Application of the model to single-channel events by conversion of kinetic constants into probabilities of microscopic events clarifies the effects of ligand binding on the patterns of interconversions between the various states. The formulation of the model in terms of single-channel events also opens the possibility for simulations at the level of a synapse, with a finite number (~$10^3$) of receptors. While many issues still need to be clarified in order to model a synapse more accurately, particularly with respect to quantal analysis (Bekkers, 1994), simulations with a full functional model of the type presented here could provide new insights, since previous efforts have not included desensitization (Bartol et al., 1991; Faber et al., 1992). With respect to artificial neural networks, an understanding of these aspects should lead to more realistic modeling. While synaptogenesis has also been considered and incorporated in some models (Adelsberger-Mangan and Levy, 1994; Foldiak, 1990), detailed schemes based on experimental observation have been proposed mainly for the neuromuscular junctions (Changeux et al., 1973; Changeux and Danchin, 1976; Gouzé et al., 1983). Other important features for the development of more biologically realistic modeling concern delays and oscillatory behavior (A. Herz et al., 1989; Kerszberg and Zippelius, 1990; Buonomano and Merzenich, 1995; Hopfield, 1995; Hangartner and Cull, 1996; Kerszberg and Masson, 1995), but these aspects have not as yet been brought to the molecular level with respect to ligand-gated ion channels and metabotropic receptors.
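The conversion of kinetic constants into probabilities of microscopic events, applied to a finite population of receptors, can be sketched as follows. This is a minimal illustration under stated assumptions: only two states (closed and open) are simulated rather than the four states B, A, I, and D, and the rate constants, time step, and function names are hypothetical.

```python
import math
import random

def step_probability(rate_per_s, dt_s):
    """Probability that a first-order transition with the given rate constant
    occurs at least once during a time step of length dt_s."""
    return 1.0 - math.exp(-rate_per_s * dt_s)

def simulate_population(n_receptors=1000, k_open=50.0, k_close=200.0,
                        dt=1e-4, n_steps=1000, seed=0):
    """Follow each receptor independently and return the number open per step."""
    rng = random.Random(seed)
    p_open = step_probability(k_open, dt)
    p_close = step_probability(k_close, dt)
    is_open = [False] * n_receptors
    trace = []
    for _ in range(n_steps):
        for i in range(n_receptors):
            if is_open[i]:
                if rng.random() < p_close:
                    is_open[i] = False
            elif rng.random() < p_open:
                is_open[i] = True
        trace.append(sum(is_open))
    return trace

trace = simulate_population()
late = trace[len(trace) // 2:]                     # discard the initial relaxation
mean_open_fraction = sum(late) / (len(late) * 1000)
```

After the initial relaxation the mean open fraction fluctuates around k_open/(k_open + k_close), here 0.2, with the fluctuations expected for about a thousand independent receptors. Extending the sketch to four states and ligand-dependent rates would require the rate constants of Table I, which are not reproduced here.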
In contrast, for the basic concept of synaptic plasticity (Hebb, 1949), as applied in numerous models (Bienenstock et al., 1982; Amit, 1989; Churchland and Sejnowski, 1992; Edelman et al., 1992; Montague and Sejnowski, 1994), plausible mechanisms based upon allosteric regulation of ligand-gated channels have been formulated, particularly in relation to synaptic triads (Heidmann and Changeux, 1982; Dehaene et al., 1987; Changeux and Dehaene, 1989; Dehaene and Changeux, 1989, 1991; Changeux, 1993). In this approach, signals produced by a synaptic terminal C acting on neuron B are assumed to regulate the efficacy of the A → B synapse through an allosteric switch of the postsynaptic receptors of neuron B. The regulatory effects could intervene to stabilize one of the allosteric conformations and as a consequence alter the corresponding rates of interconversion. Networks based on such
triads can learn and produce temporal sequences (Dehaene et al., 1987) and have been used to formalize neural networks able to perform a number of complex temporal processes, such as delayed-response tasks (Dehaene and Changeux, 1989) or the Wisconsin card sorting test (Dehaene and Changeux, 1991). The role of these synaptic triads in recognizing and producing temporal sequences is related to their capacity to function as coincidence detectors (Dehaene et al., 1987). Where the A → B postsynaptic receptor can undergo transitions to short-lived (I) and long-lived (D) desensitized states, systems composed of such triads are capable of both short-term detection and long-term storage and retrieval. We assume that the multiple conformational states of the nAChR, as well as other ligand-gated receptors, could fulfill such functions. Such coincidence detection based on changes in receptor conformation may be contrasted with the conventional view (Montague and Sejnowski, 1994) that the NMDA receptor is responsible, based on its voltage-dependent Mg2+ channel block (Wigstrom and Gustafsson, 1985). While such a mechanism can play a physiologically important role, it can only be short term in its direct effects and limited to the specific features of the NMDA channel; any subsequent learning processes must be dependent on other cellular components capable of long-term storage (see “LTP” below). Hence, coincidence detection based on conformational transitions would have the advantage of being applicable to virtually all members of the superfamily of ligand-gated channels and capable of participating, in principle, in both short-term discrimination and long-term storage. In a specific example of possible coincidence detection, we have suggested that allosteric effectors of the neuronal nAChR operating at sites distinct from the channel can provide a suitable mechanism (see Fig. 24b). This hypothesis assumes a direct effect of calcium on receptor properties.
Indirect effects could also occur, as postulated in the model assigning a role to calcium in the modulation of intracellular phosphorylation responsible for the forms of hippocampal synaptic plasticity known as long-term potentiation (LTP) or long-term depression (LTD). This model proposes that LTD results from activation of phosphatases at low calcium influxes, whereas LTP results from activation of kinases at higher calcium influxes (Lisman, 1989, 1994). This biphasic pattern has been interpreted in terms of the sliding threshold model (Bienenstock et al., 1982) first developed to account for aspects of visual cortical development and subsequently observed in other neurological systems (Bear, 1995). The role of CaM kinase II in this form of plasticity has been addressed by the production of genetically modified mice. Mice with a targeted
disruption of the CaM kinase α subunit lack LTD (Silva et al., 1992; Stevens et al., 1997). A specific application of the sliding threshold model has been reported for mice with a site-directed mutation (T286D) in CaM kinase II that possesses high constitutive activity in the absence of activation by calcium (Mayford et al., 1995). Compared to wild-type mice, the threshold for the transition from LTD to LTP in the CA1 region of the hippocampus appears to move to higher frequencies in the transgenic mice with the T286D mutation of CaM kinase II (Mayford et al., 1995). While evidence for a role of phosphorylation is accumulating, the targets of the phosphorylation effects are not yet fully established. Although receptor phosphorylation may be involved, particularly AMPA (α-amino-3-hydroxy-5-methylisoxazole-4-propionate)-type glutamate receptors (Raymond et al., 1993; Barria et al., 1997), presynaptic processes, possibly triggered by retrograde signals, have also been implicated, as demonstrated in mice with reduced synthesis of NO leading to reduced LTP in the same region of the hippocampus (Son et al., 1996). While the exact role of phosphorylation in LTD and LTP remains unclear, it is of interest to note that phosphorylation-dependent changes in the equilibria between allosteric states for a ligand-gated receptor can readily lead to biphasic responses and sliding thresholds, as illustrated in Fig. 24c. If phosphorylation activates the receptor, the requirement is simply that the appropriate phosphatase is activated at low calcium concentrations but with low cooperativity with respect to calcium, whereas the relevant kinase is activated at higher concentrations but with a more cooperative response to calcium. In this case, the phosphatase will dominate at low calcium (corresponding to low stimulation frequencies) to produce LTD, but the kinase will gradually become dominant as the calcium concentration rises (corresponding to higher frequencies), producing LTP.
Such biphasic behavior can also be generated if phosphorylation inhibits the receptor, but in this case the kinase must be activated at lower calcium concentrations with low cooperativity and the phosphatase activated at higher calcium concentrations with high cooperativity. In the simulation presented in Fig. 24c, the threshold is displaced to higher frequencies by increased phosphorylation. This behavior is achieved by assuming that phosphorylation shifts the equilibria between the B and A states in favor of B, or in an equivalent manner by favoring one of the desensitized states (Huganir et al., 1986). It is further assumed that in the presence of a mutated kinase phosphorylation is enhanced, resulting in increased LTD and displacement of the threshold to the right. This simulation is based on the parameters of the nAChR, but the curves resemble the data presented for wild-type and mutant CaM
kinase II (Mayford et al., 1995). Thus a sliding threshold based on the allosteric transitions of a ligand-gated channel could be responsible for the biphasic behavior reported, but considerable additional data will be required to establish the detailed molecular basis for such a mechanism in hippocampal LTD-LTP (Lisman et al., 1997).

C. Diseases and Nicotine Dependence

While additional insights should be obtainable in future studies of wild-type receptors under conditions where alternative models differ in their predictions, more critical experiments may be achieved with receptors resulting from mutations that produce strong perturbations. Site-directed mutations can play an important role in this respect (see below), but spontaneous mutations responsible for myasthenic syndromes have also provided new insights into the mechanism of muscle nAChR and clarified the pathology of these clinical syndromes (Ohno et al., 1995, 1996; Sine et al., 1995; Gomez and Gammack, 1995; Gomez et al., 1996; Vincent et al., 1997; Lena and Changeux, 1997a). Opening frequencies higher than the wild-type levels appear to cause cellular damage, probably due to an excessive influx of electrolytes, particularly calcium (Engel, 1993). Among the mutations responsible for the various myasthenic syndromes, the high-affinity εT264P mutant receptors examined here illustrate the physiological consequences resulting from facilitation of the transition to the open state. The results reported for εT264P mutant receptors (Ohno et al., 1995) are particularly striking, since the profile of open channel dwell times displays three peaks, compared to the single peak for wild-type receptors. Assuming that the energy required for the B → A transition is reduced, as represented by a value of ${}^{BA}L_0 = 100$ for εT264P (Fig. 18), significant channel opening is predicted for receptors with no ligands or one molecule of ligand bound (Fig. 19b). The three peaks for the mutant receptors may then be readily interpreted with the MWC-type model (L phenotype) as reflecting non-, mono-, and biliganded receptors, with reasonable agreement obtained between theory and experiment (Fig. 19b). The failure of the sequential-type model to accommodate these data stems principally from the absence of non- or monoliganded open states (Fig. 7).
The εT264P mutation represents a class of high-affinity mutants lying in the channel, as first discovered for the L247T mutation of neuronal α7 AChR (Revah et al., 1991) and subsequently confirmed with muscle nAChR (Labarca et al., 1995; Filatov and White, 1995), as well as with 5-HT3 receptors (Yakel et al., 1993) and GABA$_A$ receptors (Chang et al., 1996). A Leu → Met mutation at the position in the β subunit of human
muscle AChR corresponding to L247 in α7 (see Fig. 2d) is responsible for a severe slow-channel myasthenic syndrome (Gomez et al., 1996). The α7 mutations such as L247T dramatically increase the sensitivity to ligand, convert the competitive antagonist DHβE to a partial agonist (Bertrand et al., 1992), alter single-channel conductances (Revah et al., 1991), and lead to spontaneous channel opening (Bertrand et al., 1997). These results are most readily interpreted by a normally closed, high-affinity desensitized state that is rendered conducting by the Leu → Thr replacement in the channel, thereby constituting a γ phenotype (Galzi et al., 1996b). Related mutations such as V251T, characterized in α7 receptors (Devillers-Thiéry et al., 1992; Galzi et al., 1992), share some features with L247T but can be interpreted in terms of an L phenotype (Galzi et al., 1996b). The myasthenic mutation εT264P occurs at the residue adjacent to the Val corresponding to the α7 site mutated in V251T (see Fig. 2d). Similarly, an Ile → Asn mutation in the deg-3 AChR gene that induces neurotoxicity in Caenorhabditis elegans occurs at the site in M2 equivalent to the Val mutated in α7 V251T (Treinin and Chalfie, 1995). Neuronal nAChRs have also been implicated in inherited forms of epilepsy due to channel mutations in the neuronal AChR α4 subunit (Steinlein et al., 1995, 1997). In one case (Steinlein et al., 1995; Weiland et al., 1996), mutation of Ser 248 to Phe at the homologous site of the chlorpromazine-labeled Ser in Torpedo receptor M2 (Giraudat et al., 1986; Hucho et al., 1986) causes a significant enhancement of desensitization and a reduction of maximal response at saturating ACh concentrations. In the other case, insertion of an additional Leu at the C-terminal end of M2 causes an increase of affinity associated with a lower calcium permeability (Steinlein et al., 1997). The sites of these mutations within the M2 domain are summarized in Fig. 2d.
With respect to other neurological disorders, indirect evidence has linked α7 to an inherited form of schizophrenia (Freedman et al., 1997). Finally, studies on the molecular properties of nAChR should clarify the molecular basis of nicotine addiction via smoking. Some insights may already be available from desensitization experiments. The low-concentration prepulse method used to evaluate desensitization is shown to produce measurements of IC50 that introduce systematic distortions, since the system is far from equilibrium unless prepulse duration times approach tens of minutes (Fig. 11). Because of these limitations for obtaining equilibrium IC50 values, considerable additional data will be required with long prepulse times in order to define the parameters of the D state more fully. Recovery times from the D state can be extremely long (hours), since they will be limited by ${}^{DI}k_0$, which may be
as low s-’ (Table I; see also Fig. 9c). It is interesting to note that slow structural changes have been reported (Chang and Bock, 1977). Response times in this range could contribute to upregulation, downregulation, and other pharmacological effects associated with both chronic and acute nicotine administration (Ochoa et al., 1990; Lukas and Bencherif, 1992; Peng et al., 1994; Dani and Heinemann, 1996; Lena and Changeux, 1997a). Heavy and light smoking regimes may be related to maintaining the D state by the former and permitting stimulatory effects via action on the A state by the latter (Wonnacott, 1990). Nevertheless, a great deal of additional information will be required to establish the exact sites where nicotine exerts its pharmacological effect as a drug of abuse and which forms of the neuronal nAChR participate. Although α4β2 has been suggested as a likely target (Peng et al., 1994), a prime role for α6 has recently been proposed (Le Novère et al., 1996). Studies using transgenic mice are also providing a powerful tool for the elucidation of specific effects of neuronal nAChR subunits (Picciotto et al., 1995, 1998). Ultimately, the allosteric scheme may contribute insights into diseases related to altered transitions between conformational states and aid in the understanding of drug addiction.

ACKNOWLEDGMENTS

The research described here from our own laboratories was supported by the Swiss National Science Foundation, the Association Française contre les Myopathies, the Collège de France, the Centre National de la Recherche Scientifique, the Institut National de la Santé et de la Recherche Médicale, the Direction des Recherches Etudes et Techniques, Human Frontiers, EEC Biotech and Biomed, and the Council for Tobacco Research.
REFERENCES

Adelsberger-Mangan, D. M., and Levy, W. B. (1994). The influence of limited presynaptic growth and synapse removal on adaptive synaptogenesis. Biol. Cybernet. 71, 451-468.
Akabas, M. H., and Karlin, A. (1995). Identification of acetylcholine receptor channel-lining residues in the M1 segment of the α subunit. Biochemistry 34, 12496-12500.
Akk, G., Sine, S., and Auerbach, A. (1996). Binding sites contribute unequally to the gating of mouse nicotinic αD200N acetylcholine receptors. J. Physiol. (London) 496, 185-196.
Amit, D. J. (1989). “Modeling Brain Function.” Cambridge University Press, Cambridge, UK.
Auerbach, A., Sigurdson, W., Chen, J., and Akk, G. (1996). Voltage dependence of mouse acetylcholine receptor gating: Different charge movements in di-, mono-, and unliganded receptors. J. Physiol. (London) 494, 155-179.
Aylwin, M. L., and White, M. M. (1994). Gating properties of mutant acetylcholine receptors. Mol. Pharmacol. 46, 1149-1155.
Barria, A., Muller, D., Derkach, D., Griffith, L. C., and Soderling, T. R. (1997). Regulatory phosphorylation of AMPA-type glutamate receptors by CaM-KII during long-term potentiation. Science 276, 2042-2045.
Bartol, T. M., Land, B. R., Salpeter, E. E., and Salpeter, M. M. (1991). Monte Carlo simulation of miniature endplate current generation in the vertebrate neuromuscular junction. Biophys. J. 59, 1290-1307.
Bear, M. F. (1995). Mechanism for a sliding synaptic modification threshold. Neuron 15, 1-4.
Bekkers, J. M. (1994). Quantal analysis of synaptic transmission in the central nervous system. Curr. Opin. Neurobiol. 4, 360-365.
Bertrand, D., and Changeux, J.-P. (1995). Nicotinic receptor: an allosteric protein specialized for intracellular communication. Semin. Neurosci. 7, 75-90.
Bertrand, D., Devillers-Thiéry, A., Revah, F., Galzi, J.-L., Hussy, N., Mulle, C., Bertrand, S., Ballivet, M., and Changeux, J.-P. (1992). Unconventional pharmacology of a neural nicotinic receptor mutated in the channel domain. Proc. Natl. Acad. Sci. U.S.A. 89, 1261-1265.
Bertrand, S., Devillers-Thiéry, A., Palma, E., Buisson, B., Edelstein, S. J., Corringer, P.-J., Changeux, J.-P., and Bertrand, D. (1997). Paradoxical allosteric effects of competitive inhibitors on neuronal α7 nicotinic receptor mutants. NeuroReport 8, 3591-3596.
Bienenstock, E. L., Cooper, L. N., and Munro, P. W. (1982). Theory for the development of neuron selectivity: Orientation specificity and binocular interaction in visual cortex. J. Neurosci. 2, 32-48.
Blount, P., and Merlie, J. P. (1989). Molecular basis of the two nonequivalent ligand binding sites of the muscle nicotinic acetylcholine receptor. Neuron 3, 349-357.
Bormann, J., Rundström, N., Betz, H., and Langosh, D. (1993). Residues within transmembrane segment M2 determine chloride conductance of glycine receptor homo- and hetero-oligomers. EMBO J. 12, 3729-3737.
Brussard, A. B., Yang, X., Doyle, J. P., Huck, S., and Role, L. W. (1995). Nicotinic enhancement of fast excitatory synaptic transmission in CNS by presynaptic receptors. Science 269, 1692-1696.
Buisson, B., and Bertrand, D. (1997). Steroid modulation of the nicotinic receptor. In “Neurosteroids: A New Regulatory Function in the Nervous System” (E. E. Baulieu, P. Robel, and M. Schumacher, eds.). Humana Press, Totowa, NJ (in press).
Buonomano, D. V., and Merzenich, M. M. (1995). Temporal information transformed into a spatial code by a neural network with realistic properties. Science 267, 1028-1030.
Chang, H. W., and Bock, E. (1977). Molecular forms of acetylcholine receptor: Effects of calcium ions and a sulfhydryl reagent on the occurrence of oligomers. Biochemistry 16, 4513-4520.
Chang, Y., Wang, R., Barot, S., and Weiss, D. S. (1996). Stoichiometry of a recombinant GABA_A receptor. J. Neurosci. 16, 5415-5424.
Changeux, J.-P. (1990). Functional architecture and dynamics of the nicotinic acetylcholine receptor: An allosteric ligand-gated channel. In “Fidia Research Foundation Neurosciences Award Lectures” (J.-P. Changeux, R. R. Llinas, D. Purves, and F. F. Bloom, eds.), Vol. 4, pp. 17-168. Raven Press, New York.
Changeux, J.-P. (1993). A critical view of neuronal models of learning and memory. In “Memory Concepts-1993. Basic and Clinical Aspects” (P. Anderson, O. Hvalby, O. Paulsen, and B. Hokfelt, eds.), pp. 413-433. Elsevier, Amsterdam.
Changeux, J.-P. (1996). Neurotransmitter receptors in the changing brain: Allosteric transitions, gene expression and pathology at the molecular level. In “The Nobel Symposium 1994: Individual Development over the Lifespan-Biological and Psychosocial Perspectives” (D. Magnusson, ed.), pp. 107-138. Cambridge University Press, Cambridge, UK.
Changeux, J.-P., and Danchin, A. (1976). Selective stabilization of developing synapses as a mechanism for the specification of neuronal networks. Nature (London) 264, 705-712.
Changeux, J.-P., and Dehaene, S. (1989). Neuronal models of cognitive functions. Cognition 33, 63-109.
Changeux, J.-P., and Edelstein, S. J. (1994). On allosteric transitions and acetylcholine receptors. Trends Biochem. Sci. 19, 339-340.
Changeux, J.-P., and Heidmann, T. (1987). Allosteric receptors and molecular models of learning. In “Synaptic Function” (G. Edelman, W. E. Gall, and W. M. Cowan, eds.), pp. 549-601. Wiley, New York.
Changeux, J.-P., and Rubin, M. M. (1968). Allosteric interactions in aspartate transcarbamylase. III. Interpretations of experimental data in terms of the model of Monod, Wyman, and Changeux. Biochemistry 7, 553-561.
Changeux, J.-P., Thiéry, J.-P., Tung, T., and Kittel, C. (1967). On the cooperativity of biological membranes. Proc. Natl. Acad. Sci. U.S.A. 57, 335-341.
Changeux, J.-P., Courrège, P., and Danchin, A. (1973). A theory of the epigenesis of neural networks by selective stabilization of synapses. Proc. Natl. Acad. Sci. U.S.A. 70, 2974-2978.
Changeux, J.-P., Devillers-Thiéry, A., and Chemouilli, P. (1984). Acetylcholine receptor: An allosteric protein. Science 225, 1335-1345.
Chatrenet, B., Trémeau, O., Bontems, F., Goeldner, M. P., Hirth, C. G., and Ménez, A. (1990). Topography of toxin-acetylcholine receptor complexes by using photoactivatable toxin derivatives. Proc. Natl. Acad. Sci. U.S.A. 87, 3378-3382.
Chen, J., Zhang, H., Akk, G., Sine, S., and Auerbach, A. (1995). Activation kinetics of recombinant mouse nicotinic acetylcholine receptors: Mutations of α-subunit tyrosine 190 affect both binding and gating. Biophys. J. 69, 849-859.
Churchland, P. S., and Sejnowski, T. J. (1992). “The Computational Brain.” MIT Press, Cambridge, MA.
Clements, J. D., Lester, R. A. J., Tong, G., Jahr, C. E., and Westbrook, G. L. (1992). The time course of glutamate in the synaptic cleft. Science 258, 1498-1501.
Colquhoun, D. (1973). The relation between classical and cooperative models for drug action. In “Drug Receptors” (H. P. Rang, ed.), pp. 149-182. Macmillan, London.
Colquhoun, D., and Hawkes, A. G. (1995). The principles of the stochastic interpretation of ion-channel mechanisms. In “Single-Channel Recording” (B. Sakmann and E. Neher, eds.), pp. 397-482. Plenum, New York.
Colquhoun, D., and Ogden, D. C. (1988). Activation of ion channels in the frog endplate by high concentrations of acetylcholine. J. Physiol. (London) 395, 131-159.
Colquhoun, D., and Sakmann, B. (1985). Fast events in single-channel currents activated by acetylcholine and its analogues at the frog muscle end-plate. J. Physiol. (London) 369, 501-557.
Conroy, W., Vernallis, A. B., and Berg, D. K. (1992). The α5 gene product assembles with multiple acetylcholine receptor subunits to form distinctive receptor subtypes in brain. Neuron 9, 1-20.
Cooper, E., Couturier, S., and Ballivet, M. (1991). Pentameric structure and subunit stoichiometry of a neuronal nicotinic acetylcholine receptor. Nature (London) 350, 235-238.
Corringer, P.-J., Galzi, J.-L., Eiselé, J.-L., Bertrand, S., Changeux, J.-P., and Bertrand, D. (1995). Identification of a new component of the agonist binding site of the nicotinic α7 homo-oligomeric receptor. J. Biol. Chem. 270, 11749-11752.
Corringer, P.-J., Bertrand, S., Bohler, S., Edelstein, S. J., Changeux, J.-P., and Bertrand, D. (1998). Identification of critical elements modulating desensitization of neuronal nicotinic receptors. J. Neurosci. 18, 648-657.
Couturier, S., Bertrand, D., Matter, J.-M., Hernandez, M.-C., Bertrand, S., Millar, N., Valera, S., Barkas, T., and Ballivet, M. (1990). A neuronal nicotinic acetylcholine receptor subunit (α7) is developmentally regulated and forms a homo-oligomeric channel blocked by α-BTX. Neuron 5, 847-856.
Czajkowski, C., Kaufmann, C., and Karlin, A. (1993). Negatively charged amino acid residues in the nicotinic receptor δ subunit that contribute to the binding of acetylcholine. Proc. Natl. Acad. Sci. U.S.A. 90, 6285-6289.
Dani, J. A., and Heinemann, S. (1996). Molecular and cellular aspects of nicotine abuse. Neuron 16, 905-908.
Dehaene, S., and Changeux, J.-P. (1989). A simple model of prefrontal cortex function in delayed-response tasks. J. Cognit. Neurosci. 1, 244-261.
Dehaene, S., and Changeux, J.-P. (1991). The Wisconsin card sorting test: Theoretical analysis and modeling in a neuronal network. Cereb. Cortex 1, 62-79.
Dehaene, S., Changeux, J.-P., and Nadal, J.-P. (1987). Neural networks that learn temporal sequences by selection. Proc. Natl. Acad. Sci. U.S.A. 84, 2727-2731.
Del Castillo, J., and Katz, B. (1957). Interaction at endplate receptors between different choline derivatives. J. Physiol. (London) 146, 369-381.
Devillers-Thiéry, A., Galzi, J.-L., Bertrand, S., Changeux, J.-P., and Bertrand, D. (1992). Stratified organization of the nicotinic acetylcholine receptor channel. NeuroReport 3, 1001-1004.
Devillers-Thiéry, A., Galzi, J.-L., Eiselé, J.-L., Bertrand, S., Bertrand, D., and Changeux, J.-P. (1993). Functional architecture of the nicotinic acetylcholine receptor: A prototype of ligand-gated ion channels. J. Membr. Biol. 136, 97-112.
Eaton, W. A., Henry, E. R., and Hofrichter, J. (1991). Application of linear free energy relations to protein conformational changes: The quaternary structural change of hemoglobin. Proc. Natl. Acad. Sci. U.S.A. 88, 4472-4475.
Edelman, G. M., Reeke, G. N., Gall, W. E., Tononi, G., Williams, D., and Sporns, O. (1992). Synthetic neural modeling applied to a real-world artifact. Proc. Natl. Acad. Sci. U.S.A. 89, 7267-7271.
Edelstein, S. J. (1972). An allosteric mechanism for the acetylcholine receptor. Biochem. Biophys. Res. Commun. 48, 1160-1165.
Edelstein, S. J. (1975). Cooperative interactions of hemoglobin. Annu. Rev. Biochem. 44, 209-232.
Edelstein, S. J., and Bardsley, W. G. (1997). Contributions of individual molecular species to the Hill coefficient for ligand binding by an oligomeric protein. J. Mol. Biol. 267, 10-16.
Edelstein, S. J., and Changeux, J.-P. (1996). Allosteric proteins after thirty years: The binding and state functions of the neuronal α7 nicotinic acetylcholine receptor. Experientia 52, 1083-1090.
Edelstein, S. J., Schaad, O., Henry, E., Bertrand, D., and Changeux, J.-P. (1996). A kinetic mechanism for nicotinic acetylcholine receptors based on multiple allosteric transitions. Biol. Cybernet. 75, 361-380.
Edelstein, S. J., Schaad, O., and Changeux, J.-P. (1997a). Myasthenic nicotinic receptor mutant interpreted in terms of the allosteric model. C. R. Acad. Sci. Paris 320, 953-961.
Edelstein, S. J., Schaad, O., and Changeux, J.-P. (1997b). Single bindings versus single channel recordings: A new approach to ionotropic receptors. Biochemistry 36, 13755-13760.
Edman, L., Mets, U., and Rigler, R. (1996). Conformational transitions monitored by single molecules in solution. Proc. Natl. Acad. Sci. U.S.A. 93, 6710-6715.
Edmonds, B., Gibb, A. J., and Colquhoun, D. (1995). Mechanisms of activation of muscle nicotinic acetylcholine receptors and the time course of endplate currents. Annu. Rev. Physiol. 57, 469-493.
Ehrenberg, M., and Rigler, R. (1974). Rotational Brownian motion and fluorescence intensity fluctuations. J. Chem. Phys. 4, 390-410.
Eigen, M., and Rigler, R. (1994). Sorting single molecules: Application to diagnostics and evolutionary biotechnology. Proc. Natl. Acad. Sci. U.S.A. 91, 5740-5747.
Eiselé, J.-L., Bertrand, S., Galzi, J.-L., Devillers-Thiéry, A., Changeux, J.-P., and Bertrand, D. (1993). Chimeric nicotinic-serotonergic receptor combines distinct ligand binding and channel specificities. Nature (London) 366, 479-483.
Elgoyhen, A. B., Johnson, D. S., Boulter, J., Vetter, D. E., and Heinemann, S. (1994). α9: An acetylcholine receptor with novel pharmacological properties expressed in rat cochlear hair cells. Cell (Cambridge, Mass.) 79, 705-715.
Elson, E., and Magde, D. (1974). Fluorescence correlation spectroscopy. I. Conceptual basis and theory. Biopolymers 13, 1-27.
Engel, A. G. (1993). The investigation of congenital myasthenic syndromes. Ann. N.Y. Acad. Sci. 681, 425-434.
Faber, D. S., Young, W. S., Legendre, P., and Korn, H. (1992). Intrinsic quantal variability due to stochastic properties of receptor-transmitter interactions. Science 258, 1494-1498.
Fersht, A. R., Leatherbarrow, R. J., and Wells, T. N. C. (1986). Quantitative analysis of structure-activity relationship in engineering proteins by linear free-energy relationships. Nature (London) 322, 284-286.
Filatov, G. N., and White, M. M. (1995). The role of conserved leucines in the M2 domain of the acetylcholine receptor in channel gating. Mol. Pharmacol. 48, 379-384.
Foldiak, P. (1990). Forming sparse representations by local anti-Hebbian learning. Biol. Cybernet. 64, 165-170.
Forman, S. A., and Miller, K. W. (1988). High acetylcholine concentrations cause rapid inactivation before fast desensitization in nicotinic acetylcholine receptors from Torpedo. Biophys. J. 54, 149-158.
Franke, C., Parnas, H., Hovav, G., and Dudel, J. (1993). A molecular scheme for the reaction between acetylcholine and nicotinic channels. Biophys. J. 64, 339-356.
Freedman, R., Coon, H., Myles-Worsley, M., Orr-Urtreger, A., Olincy, A., Davis, A., Polymeropoulos, M., Holik, J., Hopkins, J., Hoff, M., Rosenthal, J., Waldo, M. C., Reimherr, F., Wender, P., Yaw, J., Young, D. A., Breese, C. R., Adams, C., Patterson, D., Adler, L. E., Kruglyak, L., Leonard, S., and Byerley, W. (1997). Linkage of a neurophysiological deficit in schizophrenia to a chromosome 15 locus. Proc. Natl. Acad. Sci. U.S.A. 94, 587-592.
Fu, D. X., and Sine, S. M. (1994). Competitive antagonists bridge the α-γ subunit interface of the acetylcholine receptor through quaternary ammonium-aromatic interactions. J. Biol. Chem. 269, 26152-26157.
Galzi, J.-L., and Changeux, J.-P. (1994). Neurotransmitter-gated ion channels as unconventional allosteric proteins. Curr. Opin. Struct. Biol. 4, 554-565.
Galzi, J.-L., and Changeux, J.-P. (1995). Neuronal nicotinic receptors: Molecular organization and regulations. Neuropharmacology 34, 563-582.
Galzi, J.-L., Bertrand, D., Devillers-Thiéry, A., Revah, F., Bertrand, S., and Changeux, J.-P. (1991a). Functional significance of aromatic amino acids from three peptide
loops of the α7 neuronal nicotinic receptor site investigated by site-directed mutagenesis. FEBS Lett. 294, 198-202.
Galzi, J.-L., Revah, F., Bouet, F., Ménez, A., Goeldner, M., Hirth, C., and Changeux, J.-P. (1991b). Allosteric transitions of the acetylcholine receptor probed at the amino acid level with a photolabile cholinergic ligand. Proc. Natl. Acad. Sci. U.S.A. 88, 5051-5055.
Galzi, J.-L., Devillers-Thiéry, A., Hussy, N., Bertrand, S., Changeux, J.-P., and Bertrand, D. (1992). Mutations in the channel domain of a neuronal nicotinic receptor convert ion selectivity from cationic to anionic. Nature (London) 359, 500-505.
Galzi, J.-L., Bertrand, S., Corringer, P.-J., Changeux, J.-P., and Bertrand, D. (1996a). Identification of calcium binding sites that regulate potentiation of a neuronal nicotinic acetylcholine receptor. EMBO J. 15, 5824-5832.
Galzi, J.-L., Edelstein, S. J., and Changeux, J.-P. (1996b). The multiple phenotypes of allosteric receptor mutants. Proc. Natl. Acad. Sci. U.S.A. 93, 1853-1858.
Giraudat, J., Dennis, M., Heidmann, T., Chang, J. Y., and Changeux, J.-P. (1986). Structure of the high affinity site for noncompetitive blockers of the acetylcholine receptor: Serine-262 of the delta subunit is labeled by [³H]-chlorpromazine. Proc. Natl. Acad. Sci. U.S.A. 83, 2719-2723.
Gomez, C. M., and Gammack, J. T. (1995). A leucine-to-phenylalanine substitution in the acetylcholine receptor ion channel in a family with the slow-channel syndrome. Neurology 45, 982-985.
Gomez, C. M., Maselli, R., Gammack, B. S., Lasalde, J., Tamamizu, S., Cornblath, D. R., Lehar, M., McNamee, M., and Kuncl, R. W. (1996). A β-subunit mutation in the acetylcholine receptor channel gate causes severe slow-channel syndrome. Ann. Neurol. 39, 712-723.
Gouzé, J.-L., Lasry, J.-M., and Changeux, J.-P. (1983). Selective stabilization of muscle innervation during development: A mathematical model. Biol. Cybernet. 46, 207-215.
Gray, R., Rajan, A. S., Radcliffe, K. A., Yakehiro, M., and Dani, J. A. (1996). Hippocampal synaptic transmission enhanced by low concentrations of nicotine. Nature (London) 383, 713-716.
Hangartner, R. D., and Cull, P. (1996). A ternary logic model for recurrent neuromime networks with delay. Biol. Cybernet. 73, 177-188.
Hebb, D. O. (1949). “The Organization of Behavior.” Wiley, New York.
Heidmann, T., and Changeux, J.-P. (1978). Structural and functional properties of the acetylcholine receptor protein in its purified and membrane-bound states. Annu. Rev. Biochem. 47, 317-357.
Heidmann, T., and Changeux, J.-P. (1979). Fast kinetic studies on the interaction of a fluorescent agonist with the membrane-bound acetylcholine receptor from Torpedo marmorata. Eur. J. Biochem. 94, 255-279.
Heidmann, T., and Changeux, J.-P. (1980). Interaction of a fluorescent agonist with the membrane-bound acetylcholine receptor from Torpedo marmorata in the millisecond time range: Resolution of an “intermediate” conformational transition and evidence for positive cooperative effects. Biochem. Biophys. Res. Commun. 97, 889-896.
Heidmann, T., and Changeux, J.-P. (1982). Un modèle moléculaire de régulation d’efficacité au niveau postsynaptique d’une synapse chimique. C. R. Acad. Sci. Paris 295, 665-670.
Henry, E., Jones, C. M., Hofrichter, J., and Eaton, W. A. (1997). Can a two-state MWC allosteric model explain hemoglobin kinetics? Biochemistry 36, 6511-6528.
Herz, A., Sulzer, B., and Kuhn, R. (1989). Hebbian learning reconsidered: Representation of static and dynamic objects in associative neural nets. Biol. Cybernet. 60, 457-467.
Herz, J. M., Johnson, D. A., and Taylor, P. (1989). Distance between the agonist and noncompetitive inhibitor sites on the nicotinic acetylcholine receptor. J. Biol. Chem. 264, 12439-12448.
Hopfield, J. J. (1995). Pattern recognition computation using action potential timing for stimulus representation. Nature (London) 376, 33-36.
Hucho, F., Oberthur, W., and Lottspeich, F. (1986). The ion channel of the nicotinic acetylcholine receptor is formed by the homologous helices M2 of the receptor subunits. FEBS Lett. 205, 137-142.
Huganir, R. L., and Greengard, P. (1990). Regulation of neurotransmitter receptor desensitization by protein phosphorylation. Neuron 5, 555-567.
Huganir, R. L., Delcour, A. J., Greengard, P., and Hess, G. P. (1986). Phosphorylation of the nicotinic acetylcholine receptor regulates its rate of desensitization. Nature (London) 321, 774-776.
Imoto, K., Busch, C., Sakmann, B., Mishina, M., Konno, T., Nakai, J., Bujo, H., Mori, Y., Fukuda, K., and Numa, S. (1988). Rings of negatively charged amino acids determine the acetylcholine receptor channel conductance. Nature (London) 335, 645-648.
Jackson, M. B. (1984). Spontaneous openings of the acetylcholine receptor channel. Proc. Natl. Acad. Sci. U.S.A. 81, 3901-3904.
Jackson, M. B. (1988). Dependence of acetylcholine receptor channel kinetics on agonist concentration in cultured mouse muscle fibers. J. Physiol. (London) 397, 555-583.
Jackson, M. B. (1989). Perfection of a synaptic receptor: kinetics and energetics of the acetylcholine receptor. Proc. Natl. Acad. Sci. U.S.A. 86, 2199-2203.
Jackson, M. B. (1993). Activation of receptors directly coupled to channels. In “Thermodynamics of Membrane Receptors and Channels” (M. B. Jackson, ed.), pp. 249-293. CRC Press, Boca Raton, FL.
Jackson, M. B., Imoto, K., Mishina, M., Konno, T., Numa, S., and Sakmann, B. (1990). Spontaneous and agonist-induced openings of an acetylcholine receptor channel composed of bovine α-, β- and δ-subunits. Pflügers Arch. Eur. J. Physiol. 417, 129-135.
Jencks, W. P. (1985). A primer for the Bema Hapothle: An empirical approach to the characterization of changing transition-state structures. Chem. Rev. 85, 511-527.
Karlin, A. (1967). On the application of “a plausible model” of allosteric proteins to the receptor of acetylcholine. J. Theor. Biol. 16, 306-320.
Karlin, A. (1993). Structure of nicotinic acetylcholine receptors. Curr. Opin. Neurobiol. 3, 299-309.
Karlin, A., and Akabas, M. H. (1995). Toward a structural basis for the function of nicotinic acetylcholine receptors. Neuron 15, 1231-1244.
Katz, B., and Thesleff, S. (1957). A study of “desensitization” produced by acetylcholine at the motor end-plate. J. Physiol. (London) 138, 63-80.
Kerszberg, M., and Masson, C. (1995). Signal-induced selection among spontaneous oscillatory patterns in a model honeybee olfactory glomeruli. Biol. Cybernet. 72, 487-495.
Kerszberg, M., and Zippelius, A. (1990). Synchronization in neural assemblies. Phys. Scr. T33, 54-64.
Koshland, D. E., Nemethy, G., and Filmer, D. (1966). Comparison of experimental binding data and theoretical models in proteins containing subunits. Biochemistry 5, 365-385.
Kuffler, S. W., and Yoshikami, D. (1975). The distribution of acetylcholine sensitivity at the post-synaptic membrane of vertebrate skeletal twitch muscles: Iontophoretic mapping in the micron range. J. Physiol. (London) 244, 703-730.
Labarca, C., Nowak, M. W., Zhang, H., Tang, L., Desphande, P., and Lester, H. A. (1995). Channel gating governed symmetrically by conserved leucine residues in the M2 domain of nicotinic receptors. Nature (London) 376, 514-516.
Langosh, D., Laube, B., Rundström, N., Schmieden, V., Bormann, J., and Betz, H. (1994). Decreased agonist affinity and chloride conductance of mutant glycine receptors associated with human hereditary hyperekplexia. EMBO J. 13, 4223-4228.
Leffler, J. E. (1953). Parameters for the description of transition states. Science 117, 340-341.
Lena, C., and Changeux, J.-P. (1993). Allosteric modulations of the nicotinic acetylcholine receptor. Trends Neurosci. 16, 181-186.
Lena, C., and Changeux, J.-P. (1997a). Pathological mutations of nicotinic receptors and nicotine-based therapies for brain disorders. Curr. Opin. Neurobiol. 7, 674-682.
Lena, C., and Changeux, J.-P. (1997b). Role of Ca2+ ions in nicotinic facilitation of GABA release in mouse thalamus. J. Neurosci. 17, 576-585.
Le Novère, N., and Changeux, J.-P. (1995). Molecular evolution of the nicotinic acetylcholine receptor: an example of multigene family in excitable cells. J. Mol. Evol. 40, 155-172.
Le Novère, N., Zoli, M., and Changeux, J.-P. (1996). Neuronal nicotinic receptor α6 subunit RNA is selectively concentrated in catecholaminergic nuclei of the rat brain. Eur. J. Neurosci. 8, 2428-2439.
Levitan, I. B. (1994). Modulation of ion channels by protein phosphorylation and dephosphorylation. Annu. Rev. Physiol. 56, 193-212.
Li, Y. H., Li, L., Lasalde, J., Rojas, L., McNamee, M., Ortiz-Miranda, S. I., and Pappone, P. (1994). Mutations in the M4 domain of Torpedo californica acetylcholine receptor dramatically alter channel function. Biophys. J. 66, 646-653.
Lindstrom, J. (1996). Neuronal nicotinic acetylcholine receptors. In “Ion Channels” (T. Narahashi, ed.), pp. 377-450. Plenum, New York.
Lingle, C. J., Maconochie, D., and Steinbach, J. H. (1992). Activation of skeletal muscle nicotinic acetylcholine receptors. J. Membr. Biol. 126, 195-217.
Lisman, J. (1989). A mechanism for the Hebb and anti-Hebb processes underlying learning and memory. Proc. Natl. Acad. Sci. U.S.A. 86, 9574-9578.
Lisman, J. (1994). The CaM kinase hypothesis for the storage of synaptic memory. Trends Neurosci. 17, 406-412.
Lisman, J., Malenka, R. C., Nicoll, R. A., and Malinow, R. (1997). Learning mechanisms: The case for CaM-KII. Science 276, 2001-2002.
Lo, D. C., Pinkham, J. L., and Stevens, C. F. (1991). Role of a key cysteine residue in the gating of the acetylcholine receptor. Neuron 6, 31-40.
Luetje, C. W., Piattoni, M., and Patrick, J. (1993). Mapping of ligand binding sites of neuronal nicotinic acetylcholine receptors using chimeric alpha subunits. Neuron 44, 657-666.
Lukas, R. J., and Bencherif, M. (1992). Heterogeneity and regulation of nicotinic acetylcholine receptors. Int. Rev. Neurobiol. 34, 25-131.
Lynch, J. W., Rajendra, S., Pierce, K. D., Handford, C. A., Barry, P. H., and Schofield, P. R. (1997). Identification of intracellular and extracellular domains mediating signal transduction in the inhibitory glycine receptor chloride channel. EMBO J. 16, 110-120.
Machold, J., Weise, C., Utkin, Y., Tsetlin, V., and Hucho, F. (1995). The handedness of the subunit arrangement of the nicotinic acetylcholine receptor from Torpedo californica. Eur. J. Biochem. 234, 427-430.
Magde, D., Elson, E., and Webb, W. (1974). Fluorescence correlation spectroscopy. II. An experimental realization. Biopolymers 13, 29-61.
Mayford, M., Wang, J., Kandel, E. R., and O’Dell, T. J. (1995). CaMKII regulates the frequency-response function of hippocampal synapses for the production of both LTD and LTP. Cell (Cambridge, Mass.) 81, 891-904.
Milone, M., Wang, H.-L., Ohno, K., Fukudome, T., Pruitt, J. N., Bren, N., Sine, S. M., and Engel, A. G. (1997). Slow-channel myasthenic syndrome caused by enhanced activation, desensitization, and agonist binding affinity attributable to mutation in the M2 domain of the acetylcholine receptor α subunit. J. Neurosci. 17, 5651-5665.
Monod, J., Wyman, J., and Changeux, J.-P. (1965). On the nature of allosteric transitions: A plausible model. J. Mol. Biol. 12, 88-118.
Montague, P. R., and Sejnowski, T. J. (1994). The predictive brain: Temporal coincidence and temporal order in synaptic learning mechanisms. Learn. Mem. 1, 1-33.
Mulle, C., Lena, C., and Changeux, J.-P. (1992). Potentiation of nicotinic receptor response by external calcium in rat central neurons. Neuron 8, 937-945.
Neher, E., and Sakmann, B. (1976). Single channel currents recorded from membrane of denervated frog muscle fibers. Nature (London) 260, 799-802.
Neubig, R. R., and Cohen, J. B. (1980). Permeability control by cholinergic receptors in Torpedo post-synaptic membranes: Agonist dose-response relations measured at second and millisecond times. Biochemistry 19, 2770-2779.
Ochoa, E. L. M., Li, L., and McNamee, M. G. (1990). Desensitization of central cholinergic mechanisms and neuroadaptation to nicotine. Mol. Neurobiol. 4, 251-287.
Ohno, K., Hutchison, D. O., Milone, M., Brengman, J. M., Bouzat, C., Sine, S. M., and Engel, A. G. (1995). Congenital myasthenic syndrome caused by prolonged acetylcholine receptor channel openings due to a mutation in the M2 domain of the ε subunit. Proc. Natl. Acad. Sci. U.S.A. 92, 758-762.
Ohno, K., Wang, H.-L., Milone, M., Bren, N., Brengman, J. M., Nakano, S., Quiram, P., Pruitt, J. N., Sine, S. M., and Engel, A. G. (1996). Congenital myasthenic syndrome caused by decreased agonist binding affinity due to a mutation in the acetylcholine ε subunit. Neuron 17, 157-170.
Ohno, K., Quiram, P., Milone, M., Wang, H.-L., Harper, M. C., Pruitt, J. N., II, Brengman, J. M., Pao, L., Fischbeck, K. H., Crawford, T. O., Sine, S. M., and Engel, A. G. (1997). Congenital myasthenic syndromes due to heteroallelic nonsense/missense mutations in the acetylcholine receptor epsilon subunit gene: Identification and functional characterization of six new mutations. Hum. Mol. Genet. 6, 753-767.
O’Leary, M. E., and White, M. M. (1992). Mutational analysis of ligand-induced activation of the Torpedo acetylcholine receptor. J. Biol. Chem. 267, 8360-8365.
Oswald, R. E., and Changeux, J.-P. (1982). Crosslinking of α-bungarotoxin to the acetylcholine receptor from Torpedo marmorata by ultraviolet light irradiation. FEBS Lett. 139, 225-229.
Palma, E., Bertrand, S., Binzoni, T., and Bertrand, D. (1996). Neural nicotinic α7 receptor expressed in Xenopus oocytes presents five putative binding sites for methyllycaconitine. J. Physiol. (London) 491, 151-161.
Pedersen, S. E., and Cohen, J. B. (1990). d-Tubocurarine binding sites are located at α-γ and α-δ subunit interfaces of the nicotinic acetylcholine receptor. Proc. Natl. Acad. Sci. U.S.A. 87, 2785-2789.
Peng, X., Gerzanich, V., Anand, R., Whiting, P. J., and Lindstrom, J. (1994). Nicotine-induced increase in neuronal nicotine receptors results from a decrease in the rate of receptor turnover. Mol. Pharmacol. 46, 523-530.
Perutz, M. F. (1989). Mechanisms of cooperativity and allosteric regulation in proteins. Q. Rev. Biophys. 22, 139-236.
Picciotto, M. R., Zoli, M., Lena, C., Bessis, A., Lallemand, Y., Le Novère, N., Vincent, P., Merlo Pich, E., Brûlet, P., and Changeux, J.-P. (1995). Abnormal avoidance learning in mice lacking functional high-affinity nicotine receptor in the brain. Nature (London) 374, 65-67.
Picciotto, M. R., Zoli, M., Rimondini, R., Lena, C., Marubio, L. M., Merlo Pich, E., Fuxe, K., and Changeux, J.-P. (1998). Acetylcholine receptors containing the β2 subunit are involved in the reinforcing properties of nicotine. Nature (London) 391, 173-177.
Prince, R. J., and Sine, S. M. (1996). Molecular dissection of subunit interfaces in the acetylcholine receptor: Identification of residues that determine agonist selectivity. J. Biol. Chem. 271, 25770-25777.
Rajendra, S., Lynch, J., Pierce, K. D., French, C. R., Barry, P. H., and Schofield, P. R. (1994). Startle disease mutations reduce the agonist sensitivity of the human inhibitory glycine receptor. J. Biol. Chem. 269, 18739-18742.
Rajendra, S., Lynch, J., Pierce, K. D., French, C. R., Barry, P. H., and Schofield, P. R. (1995). Mutation of an arginine residue transforms β-alanine and taurine from agonists into competitive antagonists. Neuron 14, 169-175.
Ramirez-Latorre, J., Yu, C. R., Qu, X., Perin, F., Karlin, A., and Role, L. (1996). Functional contributions of α5 subunit to neuronal acetylcholine receptor channels. Nature (London) 380, 347-351.
Rang, H. P., and Ritter, J. M. (1970). On the mechanism of desensitization of cholinergic receptors. Mol. Pharmacol. 6, 357-382.
Rathouz, M. M., Vijayaraghavan, S., and Berg, D. K. (1996). Elevation of intracellular calcium levels in neurons by nicotinic acetylcholine receptors. Mol. Neurobiol. 12, 117-131.
Rauer, B., Neumann, E., Widengren, J., and Rigler, R. (1996). Fluorescence correlation spectrometry of the interaction kinetics of tetramethylrhodamin α-bungarotoxin with Torpedo californica acetylcholine receptor. Biophys. Chem. 58, 3-12.
Raymond, L. A., Blackstone, C. D., and Huganir, R. L. (1993). Phosphorylation of amino acid neurotransmitter receptors in synaptic plasticity. Trends Neurosci. 16, 147-153.
Revah, F., Bertrand, D., Galzi, J.-L., Devillers-Thiéry, A., Mulle, C., Hussy, N., Bertrand, S., Ballivet, M., and Changeux, J.-P. (1991). Mutations in the channel domain alter desensitization of a neuronal nicotinic receptor. Nature (London) 353, 846-849.
Role, L. W., and Berg, D. K. (1996). Nicotinic receptors in the development and modulation of CNS synapses. Neuron 16, 1077-1085.
Rubin, M. M., and Changeux, J.-P. (1966). On the nature of allosteric transitions: Implications of nonexclusive ligand binding. J. Mol. Biol. 21, 265-274.
Sakmann, B., Patlak, J., and Neher, E. (1980). Single acetylcholine-activated channels show burst-kinetics in presence of desensitizing concentrations of agonist. Nature (London) 286, 71-73.
Sargent, P. B. (1993). The diversity of neuronal nicotinic acetylcholine receptors. Annu. Rev. Neurosci. 16, 403-443.
Sawicki, C. A., and Gibson, Q. H. (1976). Quaternary conformational changes in human hemoglobin studied by laser photolysis of carboxyhemoglobin. J. Biol. Chem. 251, 1533-1542.
Schmieden, V., Kushe, J., and Betz, H. (1992). Agonist pharmacology of neonatal and adult glycine receptor alpha subunits: Identification of amino acid residues involved in taurine activation. EMBO J. 11, 2025-2032.
Schwille, P., Meyer-Almes, F.-J., and Rigler, R. (1997). Dual-color fluorescence cross-correlation spectroscopy for multicomponent diffusional analysis in solution. Biophys. J. 72, 1878-1886.
Shiang, R., Ryan, S. G., Zhu, Y. Z., Hahn, A. F., O’Connell, P., and Wasmuth, J. J. (1993). Mutations in the alpha1 subunit of the inhibitory glycine receptor cause the dominant neurologic disorder, hyperekplexia. Nat. Genet. 5, 351-358.
Sigworth, F. J., and Sine, S. M. (1987). Data transformations for improved display and fitting of single-channel dwell time histograms. Biophys. J. 52, 1047-1054.
Silva, A. J., Stevens, C. F., Tonegawa, S., and Wang, Y. (1992). Deficient hippocampal long-term potentiation in α-calcium-calmodulin kinase II mutant mice. Science 257, 201-206.
Sine, S. M., and Claudio, T. (1991). γ- and δ-subunits regulate the affinity and the cooperativity of ligand binding to the acetylcholine receptor. J. Biol. Chem. 266, 19369-19377.
Sine, S. M., Claudio, T., and Sigworth, F. J. (1990). Activation of Torpedo acetylcholine receptors expressed in mouse fibroblasts: Single channel current kinetics reveal distinct agonist binding affinities. J. Gen. Physiol. 96, 395-437.
Sine, S. M., Ohno, K., Bouzat, C., Auerbach, A., Milone, M., Pruitt, J. N., and Engel, A. G. (1995). Mutation of the acetylcholine receptor α subunit causes a slow-channel myasthenic syndrome by enhancing agonist binding affinity. Neuron 15, 229-239.
Son, H., Hawkins, R. D., Martin, K., Kiebler, M., Huang, P. L., Fishman, M. C., and Kandel, E. R. (1996). Long-term potentiation is reduced in mice that are doubly mutant in endothelial and neuronal nitric oxide synthase. Cell (Cambridge, Mass.) 87, 1015-1023.
Steinfeld, J. I., Francisco, J. S., and Hase, W. L. (1989). “Chemical Kinetics and Dynamics.” Prentice-Hall, Englewood Cliffs, NJ.
Steinlein, O. K., Mulley, J. C., Propping, P., Wallace, R. H., Phillips, H. A., Sutherland, G. R., Scheffer, I. E., and Berkovic, S. F. (1995). A missense mutation in the neuronal nicotinic receptor α4 subunit is associated with autosomal dominant nocturnal frontal lobe epilepsy. Nat. Genet. 11, 201-203.
Steinlein, O. K., Magnusson, A., Stoodt, J., Bertrand, S., Weiland, S., Berkovic, S. F., Nakken, K. O., Propping, P., and Bertrand, D. (1997). An insertion mutation of the CHRNA4 gene in a family with autosomal dominant nocturnal frontal lobe epilepsy. Hum. Mol. Genet. 6, 943-947.
Stevens, C. F., Tonegawa, S., and Wang, Y. (1997). The role of calcium-calmodulin kinase II in three forms of synaptic plasticity. Curr. Biol. 4, 687-693.
Szabo, A. (1978). Kinetics of hemoglobin and transition state theory. Proc. Natl. Acad. Sci. U.S.A. 75, 2108-2111.
Tomaselli, G. F., McLaughlin, J. T., Jurman, M., Hawrot, E., and Yellen, G. (1991). Mutations affecting agonist sensitivity of the nicotinic acetylcholine receptor. Biophys. J. 60, 721-727.
Treinin, M., and Chalfie, M. (1995). A mutated acetylcholine receptor subunit causes neuronal degeneration in C. elegans. Neuron 14, 871-877.
Unwin, N. (1993a). Neurotransmitter action: Opening of the ligand-gated ion channels. Neuron 10, 31-41.
Unwin, N. (1993b). The nicotinic acetylcholine receptor at 9 Å resolution. J. Mol. Biol. 229, 1101-1124.
Unwin, N. (1996). Projection structure of the nicotinic acetylcholine receptor: Distinct conformations of the α subunits. J. Mol. Biol. 257, 586-596.
Valenta, D. C., Downing, J. E. G., and Role, L. W. (1993). Peptide modulation of ACh receptor desensitization controls neurotransmitter release from chicken sympathetic neurons. J. Neurophysiol. 69, 928-942.
Valera, S., Ballivet, M., and Bertrand, D. (1992). Progesterone modulates a neuronal nicotinic acetylcholine receptor. Proc. Natl. Acad. Sci. U.S.A. 89, 9949-9953.
Vernallis, A. B., Conroy, W. G., and Berg, D. K. (1993). Neurons assemble acetylcholine receptors with as many as three kinds of subunits while maintaining subunit segregation among receptor subtypes. Neuron 10, 451-464.
Vernino, S., Amador, M., Luetje, C. W., Patrick, J., and Dani, J. A. (1992). Calcium modulation and high calcium permeability of neuronal nicotinic acetylcholine receptors. Neuron 8, 127-134.
Vincent, A., Newland, C., Croxen, R., and Beeson, D. (1997). Genes at the junction: Candidates for congenital myasthenic syndromes. Trends Neurosci. 20, 15-23.
Wang, H.-L., Auerbach, A., Bren, N., Ohno, K., Engel, A. G., and Sine, S. M. (1997). Mutation in the M1 domain of the acetylcholine receptor alpha subunit decreases the rate of agonist dissociation. J. Gen. Physiol. 109, 757-766.
Weiland, S., Witzemann, V., Villarroel, A., Propping, P., and Steinlein, O. (1996). An amino acid exchange in the second transmembrane segment of a neuronal nicotinic receptor causes partial epilepsy by altering its desensitization kinetics. FEBS Lett. 398, 91-96.
Wigstrom, H., and Gustafsson, B. (1985). On long-lasting potentiation in the hippocampus: A proposed mechanism for its dependence on coincident pre- and postsynaptic activity. Acta Physiol. Scand. 123, 519-522.
Wonnacott, S. (1990). The paradox of nicotinic acetylcholine receptor upregulation by nicotine. Trends Pharmacol. Sci. 11, 216-219.
Wonnacott, S. (1997). Presynaptic nicotinic ACh receptors. Trends Neurosci. 20, 92-98.
Wyman, J. (1948). Heme proteins. Adv. Protein Chem. 4, 407-531.
Wyman, J. (1964). Linked functions and reciprocal effects in hemoglobin: A second look. Adv. Protein Chem. 19, 223-286.
Yakel, J. L., Lagrutta, A., Adelman, J. P., and North, R. A. (1993). Single amino acid substitution affects desensitization of the 5-hydroxytryptamine type 3 receptor expressed in Xenopus oocytes. Proc. Natl. Acad. Sci. U.S.A. 90, 5030-5033.
Zhang, Y., Chen, J., and Auerbach, A. (1995). Activation of recombinant mouse acetylcholine receptors by acetylcholine, carbamylcholine and tetramethylammonium. J. Physiol. (London) 486, 189-206.
DECIPHERING THE MOLECULAR CODE OF HEMOGLOBIN ALLOSTERY
By GARY K. ACKERS
Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, Missouri 63110

I. Introduction
II. Overview
   A. Research Leading to the Symmetry Rule
   B. Structural Probes and Heterotropic Responses
   C. Microstate Cooperativity Components of HbCO and HbO2
III. Binding Curves and Stoichiometric Information
   A. The Adair Binding Function
   B. Using Dimer-Tetramer Assem... Binding Cooperativity
   C. The Hill Coefficient
IV. Site-Specific Aspects of Oxygen Binding
   A. Microstate Components of the Adair Function
   B. Implications for Allosteric Model Analysis
   C. Impediments to Microstate Resolution
V. Experimental Determination of Site-Specific Cooperativity Terms
   A. Active Site Analogs of Oxygenation
   B. Linkage Cycle Analysis of the Partially Ligated Intermediates
   C. Statistical Weights of the Tetramer Binding Function
   D. Hybridization Techniques and Free Energy Distributions
VI. How the Molecular Code Was Deciphered
   A. Cooperative Free Energies of HbO2 Analogs
   B. Oxygen-Binding Tests of the Consensus Function
   C. Prediction of Site-Specific Binding Parameters for Native HbO2
   D. Structure-Sensitive Probes and Heterotropic Responses
   E. Quaternary Assignments and Molecular Code
   F. Confirmation of the Molecula... Partition Function ... and HbCO Intermediates
VII. Concluding Remarks
References
I. INTRODUCTION
Molecular biologists and biochemists have long been intrigued by the concept of molecular codes for the structure and function of biological macromolecules, i.e., a set of rules which translate the multiple combinations of simple structural elements into a more complex phenomenon of biological function. The landmark solution of the genetic code, whereby triplet combinations of DNA base pairs were found to dictate amino
acid sequences of proteins (cf. Khorana, 1968; Judson, 1979), has been followed by extensive ongoing efforts to find rules that predict how primary sequences and secondary structural features of proteins may determine their transitions into compactly folded conformations (cf. Gierasch and King, 1990; Sauer et al., 1990; Kim and Berg, 1996). A third category of potential molecular codes has been for multisubunit protein assemblies that switch their functional behavior under the control of ligand binding (or covalent reaction) at multiple combinations of sites (Ackers and Smith, 1986). These “macromolecular switches,” originally exemplified by the allosteric enzymes and respiratory oxygen carriers (hemoglobins and hemocyanins), have also been found among gene regulatory proteins (Steitz, 1993; Muller-Hill, 1996), signal transduction cascades, and membrane ion channels (Jackson, 1988). Their regulatory functions frequently entail changes in quaternary interaction and tertiary conformation in response to external signals (e.g., reaction with a specific metabolite, transcription factor), while the protein assembly acts as a transducer for the free energy costs of regulation. Such transduction of active site binding energy (via protein structure changes) is also central to the extraordinary rate enhancements of enzyme catalysis (Lumry, 1959; Jencks, 1969; Weber, 1975). Regulation occurs when two or more active site processes exert mutual influences on each other by using the macromolecule as a common transducer. A most important framework for understanding the mutual influences among coupled macromolecular reactions and their driving forces has been the theory of “linked functions,” which was created and developed extensively by Jeffries Wyman (1948, 1964; Edsall and Wyman, 1958; Wyman and Gill, 1990).
Wyman’s ideas had a wide-ranging influence on early attempts to understand cooperativity of hemoglobin oxygen binding as well as the ligand-mediated regulation of multisubunit enzymes. During the period 1935-1966, two mechanistic lines of thought had emerged. The earliest model, by Linus Pauling (1935), and its later extensions by Daniel Koshland and colleagues (Koshland, Nemethy, and Filmer, 1966; hereafter called the KNF model), used the concept of ligand-induced conformation changes that were coupled by nearest neighbor interactions among the subunits of multimeric protein structures. This “sequential” concept was also advanced by Jacques Monod and his associates (1963) in their original allosteric model. However, a sharply contrasting concept had also been proposed by Wyman as early as 1948. In his analyses of the hemoglobin Bohr effect (Wyman, 1948; Wyman and Allen, 1951), Wyman postulated an equilibrium between alternative conformational forms of the hemoglobin molecule, with preferential binding of ligands to one of them. This “concerted transition”
concept was incorporated into the famous “two-state model” paper of Monod, Wyman, and Jean-Pierre Changeux (1965; hereafter called the MWC model). The early crystallographic work of Max Perutz supported Wyman’s concept by unequivocally demonstrating that two quaternary forms of the tetrameric molecule (Fig. 1) designated “T” and “R” had O2-driven switching properties that validated Wyman’s early conjecture (Perutz, 1970, 1976, 1990; Wyman and Gill, 1990). Subsequent to these germinal developments much understanding has been achieved regarding the general nature of allosteric regulation and the associated structural transitions in a number of important systems. The formal models of Monod et al. (1965) and of Koshland et al. (1966) contributed monumentally by defining plausible rules by which protein conformation changes could regulate the mutual influences of binding reactions among substrates, activators, and inhibitors. Cooperative interactions among widely separated binding sites have thus been explained either as a sequence of “ligand-induced” changes in tertiary conformation that alter pairwise interactions among nearest neighbor contacts of the ligated subunit (KNF), or alternatively as a “symmetry-conserved”
FIG. 1. Subunit structure of the Hb tetramer. (a) Front view showing the α1 and β2 subunits (light gray) in front of the α2 and β1 subunits (dark gray). At center lies the α1β2 contact region, which, along with the α1α2 contact, forms half of the α1β2 interface. The other half of the interface includes the α2β1 and aZd contacts. (b) Side-view cartoon illustrating the major αβ dimer movement from the quaternary T → R structure.
transition where the subunits switch in concert from their low- to high-affinity “states” (MWC). Wyman (1972) showed how both the MWC and KNF models could be understood to be special limiting cases of a more general model in which sequential reaction cascades were “nested” within each global T and R conformational state. Extensive research has established that both sequential and concerted processes occur in real biological systems and that hybrid combinations of the two are also evident (e.g., Wyman, 1972; Viratelle and Seydoux, 1975; Levitski, 1978; Lipscomb, 1994; Kurganov, 1982; Neet, 1983, 1995; Leslie and Wonacott, 1984; Robert et al., 1987; Schachman, 1988; Louie et al., 1988; Kantrowitz and Lipscomb, 1988; Decker and Sterner, 1990; Perutz, 1990; Wyman and Gill, 1990; Acharya et al., 1991; Brzovic et al., 1994; Auzat et al., 1995; Muller-Hill, 1996; Steitz, 1993; Van Holde, 1995; Grant et al., 1996). The landmark crystallographic research of Max Perutz established major structural features which underlie cooperative O2 binding by mammalian hemoglobins and the modulation of affinity by heterotropic effectors [2,3-diphosphoglycerate, CO2, chloride] (Perutz, 1970, 1976, 1979, 1990; Perutz et al., 1960, 1987) (Figs. 1 and 2). These features include the deoxy “T” and oxy “R” quaternary structures (cf. Baldwin and Chothia, 1979; Fermi, 1975), which differ by a 15° rotation of one dimeric half-molecule (α1β1) relative to the other (α2β2) as shown in Fig. 1b. Correlation of the T → R structural transition with the cooperative (sigmoidal) shape of the O2 binding curve and with its modulation by organic phosphates (Arnone, 1972) established a key structural role of the T → R switch in mediating physiological cooperativity. Extensive research on native and mutationally altered human hemoglobins has also revealed the following mechanistic features:
1. Crystallographic and spectroscopic data have shown that partially ligated tetramers which have the quaternary T dimer-dimer orientation may nevertheless contain subunits in the “oxy” (r) and/or “deoxy” (t) tertiary conformations (Brzozowski et al., 1984; Arnone et al., 1986; Luisi et al., 1990; Ho, 1992; Mukerji and Spiro, 1994; Jayaraman et al., 1995). The presence of mixed tertiary conformations among subunits within the same quaternary structure violates the MWC model’s postulated “symmetry conservation” rule (Monod et al., 1965) but is required by rules of the KNF model (however, KNF does not postulate the Hb molecule’s global T → R transition).
2. Noncovalent bonds (i.e., salt bridges and H bonds) were identified (Perutz, 1970; Kilmartin, 1976) that break upon oxygenation of an individual subunit or upon T → R switching. Breakage of these bonds generates positive free energy and thus contributes to cooperativity by
reducing net affinity of the accompanying O2 binding steps. Many studies on mutant human Hbs have supported these phenomena (cf. Bunn and Forget, 1985; Turner et al., 1992; LiCata et al., 1993; Abraham et al., 1997).
3. An “allosteric core” of structural elements within each subunit was identified by Karplus and colleagues (Gellin and Karplus, 1977; Gellin et al., 1983) that can mediate O2-induced tertiary conformation changes (t → r) of the respective α and β subunits (Fig. 2). Their computational analysis showed two unstrained conformations for each of the allosteric cores, i.e., when an unligated subunit (t) is within a T quaternary structure, and alternatively when a ligated subunit (r) is within an R tetramer. Conformational strain that accompanies ligation of subunits within the T tetramer provides positive free energy (in addition to that arising from noncovalent bond breakage) and favors quaternary switching to the R interface (whereupon the O2-induced tertiary conformational strain is relieved). Reciprocally, a deligation step within R would promote switching to T.
4. An important methodology for translating the crystallographic and computationally derived structural features of the Hb molecule into statistical thermodynamic properties that are predicted to control its functional behavior was pioneered by Karplus and associates (Szabo and Karplus, 1972; Lee and Karplus, 1983; Lee et al., 1988). These studies have helped elucidate the roles of tertiary structure changes induced by heme site ligation of individual α and β subunits within the two quaternary forms of the tetramer. Additional relationships which connect the structural findings of Perutz and those of Karplus have been discovered by the work on partially ligated intermediates that is reviewed here.
The possibility that oxygenation-induced changes of subunit tertiary conformation might generate cooperativity among the Hb subunits in the absence of quaternary T → R switching had been suggested by Perutz (1976) but remained an open question. The model of Lee and Karplus (1983; Lee et al., 1988) extended the earlier work (Szabo and Karplus, 1972) by allowing variable O2-driven energetic contributions from salt-bridge breakage compared with those from tertiary conformational strain (Lee and Karplus, 1983). A principal result of the work discussed in the present chapter is that O2-driven tertiary strain is coupled within the symmetry-related dimeric half-molecule (α1β1 or α2β2) and this coupling is reflected in the observed binding cooperativity between the intradimeric α and β heme sites prior to the quaternary T → R transition. The significance of intradimeric coupling within the T tetramer (Ackers et al., 1992; Huang et al., 1996a) lies in promoting the T → R transition at specific reaction steps of the Hb tetramer’s oxygen binding sequence. The new findings have thus
FIG. 2. Movement of F-helix and FG corner residues (proximal side of heme) upon ligation. Deoxy-Hb (black) and COHb (gray) structures are from Kavanaugh et al., 1992, and Silva et al., 1992, respectively. α-Carbons are indicated on the deoxy F helices; bound CO is shown on the distal side of the (gray) heme. The mean planes of the hemes were superimposed mathematically. When bound CO (or O2) is released, the iron atom moves about 0.6 Å from a position within the heme plane toward the proximal side. The His residue bonded to the heme Fe (position F8) likewise moves away from the heme plane. This motion is translated to the FG corner residues, which interact with residues on the opposite dimer within the Hb tetramer, i.e., across the dimer-dimer interface. This series of motions may form the basis of the structural communication between the heme and the dimer-dimer interface.
extended the mechanistic understanding of Hb, while remaining consistent with the earlier discoveries summarized above.

II. OVERVIEW
This chapter reviews developments between 1985 and 1997 of conceptual strategies and experimental databases that have led to a determination of site-specific contributions to O2 binding cooperativity by the eight partially ligated intermediates of Hb (Fig. 3) and that have correlated the site-specific binding steps with tertiary and quaternary structural transitions of the Hb tetramer and with the interactions of heterotropic effectors. This research has revealed specific rules of ligand-driven T → R quaternary switching and of tertiary cooperativity that do not coincide with either the concerted MWC or the sequential KNF models and may be viewed as extensions to the more detailed structure-based models of Perutz and of Karplus cited above. The Hb allosteric mechanism has been found to employ elements of both the classical models (i.e., MWC and KNF); however, the elements are combined in specific ways that carry previously unrecognized structural implications. These discoveries resulted from a 12-year sequence of investigations (1985-1997) that has occurred in two stages.
A. Research Leading to the Symmetry Rule
Using thermodynamic linkage cycles (Ackers and Halvorson, 1974; Smith and Ackers, 1985), our research group developed a strategy for
FIG. 3. Symmetry rule: Ligand binding to either the α or β subunits of microstate species 01 is communicated to the unligated subunit within the α1β1 dimer in quaternary T structures. When at least one subunit is ligated on each dimeric half-molecule, the resulting tertiary changes disfavor the T interface and promote quaternary switching to R. While most of the binding cooperativity results from the quaternary transition, its distribution is controlled by the six “switchpoint” binding steps that implement the symmetry rule.
evaluating the free energy contributions to cooperativity by all eight partially ligated Hb intermediates (Fig. 4; see Smith and Ackers, 1983, 1985; Smith et al., 1987; Perrella et al., 1990a; Ackers, 1990; Speros et al., 1991) for three well-established O2 analog systems, i.e., Fe2+/Fe3+CN, Fe2+/Mn3+, and Co2+/Fe2+CO, that were known to mimic stereochemistry of the native Fe2+/Fe2+O2 system (Perutz, 1979) and to exhibit the native T → R structural transition on overall ligation (Moffat et al., 1976; Fermi et al., 1982; Smith and Simmons, 1994; Silva et al., 1992). These experimental microstate free energies (Table I, cols. 2-5) provided an unprecedented opportunity to analyze the characteristics of contributions by the eight tetrameric binding intermediates. In each analog system the microstate cooperativity terms were found to exhibit similar distributions, from which a consensus function was deduced. The common relationships among cooperative free energies of the three analog systems were then used to calculate the distribution of native HbO2 microstates (Table I, col. 6) from independently measured stoichiometric HbO2 constants (Fig. 5; Table I, col. 1) (Ackers et al., 1992; Ackers and Hazzard, 1993; Holt and Ackers, 1995).

FIG. 4. Cascade of stepwise ligation reactions for Hb microstate tetramers in linkage with representative constituent dimers. The ligated heme sites are denoted by X; the unligated ones are empty. Linked dimer-tetramer assembly reactions (left to right) are used to measure energetic costs of tetrameric cooperativity for the various steps. Molecular species are depicted topographically, with dimers assembling to form the α1β2 (or dimer-dimer) interface (vertically bisecting each tetramer). Stepwise ligation accompanied by tertiary constraint is denoted by reactions 1 and 2; ligation with simultaneous quaternary T → R switching is denoted by reactions 3, 5, 7, 8, 10, and 11; and ligation not accompanied by either is indicated by reactions in gray.

Table I. Cooperative Free Energies of Partially Ligated Human Hemoglobin
Experimental conditions: pH 7.4; 21.5°C; 0.18 M Cl-. Cols. 2-5, 7, 8: Microstate species [ij] of five ligation analogs. Col. 1: Stoichiometric cooperative free energies of the native HbO2 system. Col. 6: The microstate free energies predicted from the consensus distributions of cols. 2-5, which were constrained to also conform with stoichiometric terms (col. 1) of the native system. Cols. 7, 8: Additional analogs that were subsequently characterized. Col. 9: Microstate distribution for the native HbO2 system, which was determined independently of data from cols. 2-5 (used in col. 6 prediction). References: col. 1, Ackers et al., 1992; Chu et al., 1984; col. 2, Smith and Ackers, 1985; Perrella et al., 1990a; col. 3, Daugherty et al., 1991, 1994; col. 4, Ackers, 1990; col. 5, Speros et al., 1991; Huang and Ackers, 1996; col. 6, Ackers et al., 1992; col. 7, Huang and Ackers, 1996; cols. 8, 9, Huang et al., 1996b.

Using structure-sensitive probes it was found that the T → R quaternary transition occurs at the six reaction steps of the binding cascade (Fig. 4) that yield ligated heme sites on both of the symmetry-related half-molecules α1β1 and α2β2 (see Fig. 3). This “symmetry rule” provided a simple molecular code that translates reactions of the Hb binding cascade into the quaternary switchpoints that mediate cooperativity. Significant cooperativity was also found to accompany specific binding steps within the half-tetramer (α1β1 or α2β2) prior to the global T → R structure change. These findings have supported the view that the quaternary and tertiary “switches” already known to mediate Hb cooperativity (Perutz, 1970, 1976; Gellin et al., 1983) do so by specific rules of configurational site occupancy that were previously unanticipated. Consideration of these new insights in context with the earlier discoveries cited above has led to the following general picture:

1. Initial ligation on either of the symmetry-related half-tetramers (α1β1 or α2β2; cf. Fig. 1) generates a tertiary conformation change involving both subunits of the ligated dimer (Fig. 3). The associated conformational work is resisted by the T interface, which remains intact, while the accompanying unfavorable free energy (designated “tertiary constraint”) reduces net binding affinity relative to that of the isolated subunits.

FIG. 5. (a) Binding isotherms at decreasing Hb concentrations (right to left). The rightmost curve pertains to tetramers T ($n_H$ = 3.2), and the leftmost curve to dissociated α1β1 dimers D ($n_H$ = 1.0). Intermediate curves were measured at Hb concentrations between 4 × ... and 0.5 mM heme (left to right): T = 21.5°C; pH 7.4; 0.1 mM Tris-HCl buffer plus 0.1 M NaCl in the presence of 0.1 mM Na2EDTA. Data were fit globally to the composite equation:

[equation missing]

where $Z_2$ and $Z_4$ are the binding partition functions of dimers and tetramers, respectively (Ackers and Halvorson, 1974). Here $Z_2 = 1 + K_{21}x + K_{22}x^2$ and $Z_4 = 1 + K_{41}x + K_{42}x^2 + K_{43}x^3 + K_{44}x^4$ [cf. Eqs. (1) and (2) for definitions and discussion]; $Z_2' = K_{21}x + 2K_{22}x^2$ and $Z_4' = K_{41}x + 2K_{42}x^2 + 3K_{43}x^3 + 4K_{44}x^4$. The $K_{2i}$ and $K_{4i}$ are Adair binding constants of the dimer and the tetramer, respectively:

$$K_{2i} = \frac{[\mathrm{DX}_i]}{[\mathrm{D}]\,x^i}, \qquad K_{4i} = \frac{[\mathrm{TX}_i]}{[\mathrm{T}]\,x^i}$$

where the brackets denote concentrations of the respective protein species. (b) Linkage diagram showing stoichiometric free energy components of cooperativity. Binding of ligand X to tetramers is depicted on the right and that to dimers on the left. Dimer-tetramer assembly reactions at the five degrees of ligation are indicated by horizontal arrows. Gibbs energy values are given for X = O2.
[Fig. 5 panels. (a) Axes: fractional saturation versus [O2] (× 10^5). (b) Values labeled on the diagram (kcal): dimer-tetramer assembly at 0 to 4 ligands bound, -14.3, -11.4, -8.8, -7.2, -8.0; stepwise binding to tetramers, -5.4, -5.7, -6.7, -9.1 (total -26.9); binding to dimers, -8.3 per site (total -33.2); cooperative free energy of each step, +2.9, +2.6, +1.6, -0.8 (total +6.3, the cost of structural rearrangement).]
2. A second ligation on the same dimer is accompanied by a generally smaller tertiary constraint effect and hence occurs with increased net binding affinity. The T interface remains essentially intact so that there is tertiary cooperativity without quaternary switching. This intradimeric cooperativity occurs between heme sites that are separated by ~34 Å and disappears on dissociation of the tetramer into dimers.
3. Binding at heme sites of both dimers (in any combination) generates unfavorable free energy, i.e., “dimer-dimer anticooperativity,” which triggers the T → R transition. This quaternary switch releases dimer-dimer interactions of the T interface along with conformational strain of the assembled dimeric half-molecules. It contributes energetically through breaking the T interface bonds and forming the alternative R interface bonds at each of the six switchpoints specified by the symmetry rule (Fig. 3).
The formation and release of tertiary constraint is thus a fundamental driving force of cooperative binding in Hb. Whereas the T interface can withstand one dimer having tertiary constraint, it cannot accommodate two such perturbed dimers. Since the movement of the iron into the heme plane upon O2 binding has been termed a trigger for tertiary conformation change (Perutz, 1970), the trigger for the T → R switch must be the structural event that causes dimer-dimer anticooperativity. In their extensive review of Hb stereochemistry Perutz et al. (1987, p. 314) noted that “the shape of the α1β1 and α2β2 dimers is altered by changes in tertiary structure: on oxygenation the distance between the α carbons of residues FG1α1 and FG1β1 shrinks from 45.6 to 41.3 Å. These changes make an α1β1 dimer that has the tertiary oxy structure a misfit in the quaternary T structure, and an α1β1 dimer that has the tertiary deoxy structure a misfit in the quaternary R structure.” This observation undoubtedly provides an important clue to the structural origins of the energetic coupling that occurs between α and β heme sites of the same half-tetramer (α1 and β1) prior to the quaternary T → R transition (LiCata et al., 1993), and thus contributes to this “second trigger.”
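The free energy bookkeeping behind this picture can be illustrated with the stoichiometric values labeled on the linkage diagram of Fig. 5b (X = O2). The following Python sketch is only an illustration: the function names are hypothetical, and the numbers are the values as read from that figure rather than a refit of any data. It checks that each cooperative free energy equals the difference between a tetramer binding step and the corresponding dimer step, and that the same increments carry the dimer-tetramer assembly free energy from the unligated to the fully ligated state.

```python
# Values in kcal, as labeled on the linkage diagram of Fig. 5b (X = O2).
ASSEMBLY = [-14.3, -11.4, -8.8, -7.2, -8.0]  # dimer-tetramer assembly, 0..4 ligands bound
TETRAMER_STEPS = [-5.4, -5.7, -6.7, -9.1]    # stepwise binding to tetramers
DIMER_STEP = -8.3                            # binding per site on dissociated dimers


def cooperative_free_energies(tetramer_steps, dimer_step):
    """Cooperative free energy of each step: tetramer step minus dimer step."""
    return [round(step - dimer_step, 1) for step in tetramer_steps]


def assembly_ladder(unligated_assembly, cooperative):
    """Assembly free energies implied by linkage, starting from the unligated value."""
    ladder = [unligated_assembly]
    for increment in cooperative:
        ladder.append(round(ladder[-1] + increment, 1))
    return ladder


if __name__ == "__main__":
    coop = cooperative_free_energies(TETRAMER_STEPS, DIMER_STEP)
    print("cooperative free energies:", coop)
    print("total:", round(sum(coop), 1))
    print("assembly ladder:", assembly_ladder(ASSEMBLY[0], coop))
```

With the Fig. 5b values the increments sum to +6.3 kcal, the cost of structural rearrangement marked on the diagram, and the ladder reproduces the five assembly free energies.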
B. Structural Probes and Heterotropic Responses
1. Structure-sensitive probes have provided quaternary assignments of the intermediates that have supported the symmetry rule’s predicted switchpoints within the binding cascade (cf. Section VI,D).
2. Modulation of the microstate energetics by heterotropic allosteric effectors and by temperature has also been found to exhibit symmetry rule behavior.
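The switchpoints predicted by the symmetry rule can be enumerated directly. The sketch below, in Python, is a minimal illustration under a simplifying assumption: every species with at least one ligated heme on each dimeric half-molecule is assigned to R and all others to T, which ignores the T/R equilibrium populations of real intermediates. Site labels and function names are hypothetical.

```python
from itertools import product

# Site order: (alpha1, beta1, alpha2, beta2); 1 = ligated, 0 = unligated.
# alpha1/beta1 form one dimeric half-molecule, alpha2/beta2 the other.


def canonical(cfg):
    """Merge configurations related by exchanging the two identical dimers."""
    a1, b1, a2, b2 = cfg
    return max(cfg, (a2, b2, a1, b1))


def quaternary(cfg):
    """Assumed symmetry-rule assignment: R once both dimers carry a ligand."""
    a1, b1, a2, b2 = cfg
    return "R" if (a1 or b1) and (a2 or b2) else "T"


def microstates():
    """Distinct ligation microstates, ordered by number of ligands bound."""
    states = {canonical(c) for c in product((0, 1), repeat=4)}
    return sorted(states, key=lambda c: (sum(c), c))


def binding_steps():
    """Distinct single-ligand binding steps between microstates."""
    steps = set()
    for cfg in product((0, 1), repeat=4):
        for site in range(4):
            if cfg[site] == 0:
                nxt = cfg[:site] + (1,) + cfg[site + 1:]
                steps.add((canonical(cfg), canonical(nxt)))
    return sorted(steps)


def switchpoints():
    """Binding steps that start in T and end in R under the assumed rule."""
    return [(s, t) for s, t in binding_steps()
            if quaternary(s) == "T" and quaternary(t) == "R"]


if __name__ == "__main__":
    print(len(microstates()), "microstates")
    for start, end in switchpoints():
        print(start, "->", end)
```

Under these assumptions the enumeration gives ten ligation microstates (the unligated and fully ligated tetramers plus the eight intermediates) and six T to R binding steps, matching the six switchpoints of Fig. 3.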
C. Microstate Cooperativity Components of HbO2 and HbCO
Studies on other ligation systems of Hb have provided strong support for the symmetry rule mechanism (also designated the “molecular code”) that was proposed in 1992:
1. The application of new methods for evaluating energetic perturbations by metal substitution at the hemes (Huang and Ackers, 1996) has delineated the contributions to cooperativity for all eight partially ligated intermediates of carboxy-Hb (Fe/FeCO). These cooperative free energies showed the characteristic distribution predicted by the molecular code (Ackers et al., 1992).
2. Extension of these methods to the native HbO2 system (Fe/FeO2) yielded site-specific cooperativity terms (Huang et al., 1996a) in excellent agreement with those predicted earlier from direct O2 binding results (e.g., Figs. 5-7) using the molecular code partition function (Ackers et al., 1992). Contributions to cooperativity by the O2 binding intermediates that were determined independently of data from the three initial analogs, and also independent of direct O2 binding data, were found to yield the correct highly cooperative O2 binding curve of native Hb to within the accuracy of its experimental measurement (Huang et al., 1996a).
This chapter reviews the current status of work on the hemoglobin molecular code, including methodology and discoveries from the initial
FIG. 6. Stoichiometric fractions h_i of tetramers with i ligands bound as a function of O2 saturation. Conditions: pH 7.4; 21.5°C; NaCl concentrations of 1.40 M (solid curves) and 0.08 M (dashed curves); 0, 1, 2, 3, 4 indicate the number of oxygens bound (Doyle et al., 1997).
GARY K. ACKERS
three analogs (1985-1992), subsequent findings on the partially ligated intermediates of O2 and CO binding to native FeHb, and on the modulation of partially ligated intermediates by heterotropic effectors and other probes. Also discussed are arguments that have been proposed against this approach for elucidating the Hb allosteric mechanism.
III. BINDING CURVES AND STOICHIOMETRIC INFORMATION
A. The Adair Binding Function
A fundamental characterization of tetrameric hemoglobin’s O2 binding affinity and cooperativity has been the determination of equilibrium binding constants $K_i$ ($i = 1, 2, 3, 4$) by numerical fitting of O2 binding data. Beginning with the classic researches of Roughton (Roughton et al., 1955; Roughton and Lyster, 1965; Rossi-Bernardi and Roughton, 1967), the “Adair constants” $K_i$ have been evaluated from measurements of the fractional saturation $\bar{Y}$, i.e., the molar ratio of O2 reacted with the heme-binding sites; $\bar{Y}$ is determined over a range of dissolved O2 concentration, $x$, and the data are fit to the tetramer Adair equation (Adair, 1925):
$$\bar{Y} = \frac{K_1 x + 2K_2 x^2 + 3K_3 x^3 + 4K_4 x^4}{4\,(1 + K_1 x + K_2 x^2 + K_3 x^3 + K_4 x^4)} \tag{1}$$
Each resolved Adair constant, defined by $K_i = [\mathrm{HbX}_i]/([\mathrm{Hb}]\,x^i)$, reflects the molar free energy ($\Delta G_i = -RT \ln K_i$) of reacting the deoxy tetramer with a stoichiometric number $i$ of O2 molecules. Thus, $i = 0, 1, 2, 3, 4$ ($K_0 = 1$), corresponding to the possible numbers of ligated hemes, irrespective of their site configurations within the tetramer. Each $K_i$ contains a numerical factor, i.e., $n!/[i!\,(n - i)!]$, which accounts for statistical degeneracy of the reaction product $\mathrm{HbX}_i$. Thus division of the respective equilibrium constants $K_i$ by 1, 4, 6, 4, and 1 yields the “intrinsic” constants [cf. also Antonini and Brunori, 1971; Edsall and Gutfreund, 1973; Imai, 1982; Wyman and Gill, 1990, for discussion of Eq. (1) and related functions]. “Stepwise” binding free energies (corresponding to successive numbers of bound sites) are plotted in Fig. 7 at a sequence of heterotropic NaCl concentrations. The denominator of Eq. (1), excluding the factor of 4, is the fundamental grand partition function $Z(x)$ for the four-site binding system of tetrameric Hb (cf. Hill, 1985), comprising the “statistical weights” $K_i x^i$ for successive degrees of ligation (for discussion of the statistical weight
concept in biopolymers, see Zimm and Bragg, 1959; Flory, 1969; Cantor and Schimmel, 1980). The way in which $Z(x)$ varies with the concentration of dissolved O2 determines the shape and $x$ coordinate “position” of the binding curve, since
$$\bar{Y} = \frac{1}{4}\,\frac{d \ln Z}{d \ln x} \tag{2}$$
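As an illustrative sketch (not part of the original analysis), the equivalence of Eqs. (1) and (2) can be checked numerically. The Adair constants below are hypothetical values in arbitrary units, chosen only for demonstration, and all function names are assumptions of this sketch.

```python
import math

def partition_function(x, K):
    """Z(x) = 1 + K1*x + K2*x^2 + K3*x^3 + K4*x^4, with K = [K1, K2, K3, K4]."""
    return 1.0 + sum(Ki * x ** i for i, Ki in enumerate(K, start=1))

def adair_saturation(x, K):
    """Fractional saturation from the tetramer Adair equation, Eq. (1)."""
    numerator = sum(i * Ki * x ** i for i, Ki in enumerate(K, start=1))
    return numerator / (4.0 * partition_function(x, K))

def saturation_from_Z(x, K, h=1e-6):
    """Eq. (2): (1/4) d ln Z / d ln x, by a central difference in ln x."""
    ln_z_hi = math.log(partition_function(x * math.exp(h), K))
    ln_z_lo = math.log(partition_function(x * math.exp(-h), K))
    return 0.25 * (ln_z_hi - ln_z_lo) / (2.0 * h)

def intrinsic_constants(K):
    """Remove the statistical factors 4, 6, 4, 1 from K1..K4."""
    return [Ki / math.comb(4, i) for i, Ki in enumerate(K, start=1)]

# Hypothetical Adair constants, for illustration only
K_demo = [0.1, 0.02, 0.01, 0.05]
y_eq1 = adair_saturation(2.0, K_demo)
y_eq2 = saturation_from_Z(2.0, K_demo)
print(y_eq1, y_eq2)
```

The two routes to the fractional saturation should agree to within the accuracy of the finite difference; a noninteracting set such as [4, 6, 4, 1] reduces to intrinsic constants of unity.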
By comparison with the noncooperative curve of its dissociated dimers (Fig. 5), the Hb tetramer’s binding curve exhibits a sigmoidal shape and “right-shifted” midpoint. These features facilitate the unloading of O2 from red cells in the peripheral circulation and also the efficient O2 uptake in the lungs (cf. Bunn and Forget, 1985). Such physiologically important characteristics of the O2 binding curve, which are found in mammals, fish, birds, and many invertebrate organisms, reflect the structure-based changes in Adair constant values with the successive numbers $i$ of bound O2. An unchanging affinity over the successive degrees of binding would yield a $\bar{Y}$ vs. $x$ curve with hyperbolic shape like that of dissociated dimers shown in Fig. 5.
The resolution of unique values for the four $K_i$ parameters from numerical analysis of the highly cooperative binding curves of human Hb has always placed severe demands on experimental accuracy because of the high statistical correlation among $K_i$ values in the data-fitting problem (Johnson et al., 1976; Imai, 1990; Gill et al., 1987; Doyle and Ackers, 1992a) combined with the low abundance of partially ligated species (e.g., Fig. 6), which is characteristic of cooperative systems (cf. Edsall and Gutfreund, 1983). Modern instrumental methods developed by Imai (1982) and by Gill and colleagues (see Dolman and Gill, 1978; Gill et al., 1987) have contributed greatly by facilitating the determination of high-quality binding curves over extended ranges of conditions and sample types. Adair constants $K_i$ and other properties of the Hb binding curves have thus been evaluated extensively by nonlinear regression analysis (cf. Imai, 1990; Johnson et al., 1976; Poyart et al., 1978; Gill et al., 1987; Doyle and Ackers, 1992a; Doyle et al., 1994, 1997).
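The low abundance of partially ligated species in a cooperative system can be illustrated with a short sketch. The species fractions are taken as $K_i x^i / Z(x)$ with $K_0 = 1$; both sets of Adair constants are hypothetical (arbitrary units, scaled so that half-saturation falls at $x = 1$) and are not fitted to any hemoglobin data.

```python
def species_fractions(x, K):
    """Fractions of tetramers with i = 0..4 ligands bound: K_i x^i / Z(x), K_0 = 1."""
    weights = [1.0] + [Ki * x ** i for i, Ki in enumerate(K, start=1)]
    z = sum(weights)
    return [w / z for w in weights]

def saturation(x, K):
    """Fractional saturation as the mean number of bound ligands divided by 4."""
    fractions = species_fractions(x, K)
    return sum(i * f for i, f in enumerate(fractions)) / 4.0

def intermediates(x, K):
    """Combined fraction of the partially ligated species (i = 1, 2, 3)."""
    return sum(species_fractions(x, K)[1:4])

# Hypothetical constants: four identical, independent sites (intrinsic affinity 1)
K_noncoop = [4.0, 6.0, 4.0, 1.0]
# Hypothetical constants: strongly cooperative, same half-saturation point
K_coop = [0.04, 0.06, 0.04, 1.0]

for label, K in (("noncooperative", K_noncoop), ("cooperative", K_coop)):
    print(label, saturation(1.0, K), intermediates(1.0, K))
```

In this sketch both curves are half-saturated at $x = 1$, yet the cooperative set populates the intermediates far less than the noncooperative set, which is the feature that makes the individual $K_i$ hard to resolve.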
The prodigious work of Imai has generated extensive and systematic determinations of Hb Adair constants over wide ranges of conditions, effectors, and structure modifications, leading to numerous mechanistic correlations, as summarized in his important monograph (Imai, 1982). Oxygen binding studies from the present author’s laboratory during the period 1975-1997 have generated findings that are in general accord with those of Imai and of Gill and colleagues. Most notably, a striking difference between the enthalpy for oxygenating tetramers α2β2 compared with their dissociated dimers was accounted for by the heat terms for O2-linked Bohr proton release during the first three tetramer-binding steps (Imai, 1979; Mills and Ackers, 1979a,b; Mills et al., 1979; Imai et al., 1980; Chu et al., 1984). Figure 7 shows the four stepwise O2 binding free energies of normal human Hb at a series of NaCl concentrations (Doyle et al., 1997), obtained using the elegant technique pioneered by K. Imai (1982), and corrected for the contributions of all dimeric species.
Structural Implications. The Adair constants (and their ratios) have been widely recognized as reflecting the O2-driven structure changes that accompany the saturation process (Perutz, 1970; Monod et al., 1965; Imai, 1968, 1982; Weber, 1972; Szabo and Karplus, 1972; Ackers and Halvorson, 1974; Mills et al., 1976). Thus the binding of each stoichiometric number of oxygens ($i = 1, 2, 3, 4$) may generate changes in the molecule’s tertiary and/or quaternary structure (plus altered interactions with heterotropic effectors, solvent, etc.) for which the “free energy costs” are balanced (by the Hb molecules) against the chemical affinity of O2 bonding onto the heme iron atoms. Because of this central role of the Adair constants in reflecting structure-energy responses of the Hb molecular mechanism and the fact that physiological uptake and delivery of O2 is essentially controlled by the equilibrium thermodynamic binding curve (cf. Bunn and Forget, 1985), a “predictive” understanding of the structural, energetic, and dynamic origins of the Adair constants would arguably constitute a solution to the classic problem of the Hb mechanism. The discoveries reviewed in this chapter have shown that such understanding requires each stoichiometric Adair constant to be
dissected into its constituent “microstate constants” in order for the O2-driven tertiary and quaternary transitions to be related individually to the site-specific components of the binding cascade (Fig. 4). Strategies and techniques by which thermodynamic contributions to $\bar{Y}$ have been elucidated for the 10 unique ligation “microstates” (Fig. 3) are described below beginning with Section IV. The methodology employed was an extension of the cooperative free energy approach that was developed earlier in HbO2 studies at the stoichiometric level (Ackers and Halvorson, 1974; Mills et al., 1976; Johnson et al., 1976; Chu et al., 1984; Doyle et al., 1992b, 1994, 1997). This approach evaluates free energy “costs” of the binding cooperativity by simultaneously utilizing the “Wyman linkages” that connect heme site ligation and dimer-tetramer assembly with resolution of Adair constants for the interacting tetrameric and dimeric species. The following subsections summarize aspects of this methodology that have formed the basis for its extension to the Hb microstate level (Section III,B) and the relationship of these stoichiometric-level parameters to the traditional Hill coefficient that has been widely employed as a descriptor of cooperativity (Section III,C).
B. Using Dimer-Tetramer Assembly to Evaluate the Energy Costs of Binding Cooperativity
Determinations of thermodynamic parameters for the oxygen-linked dimer-tetramer assembly reactions (e.g., Fig. 5a) have provided a sensitive probe of Hb allosteric properties (Ackers and Halvorson, 1974; Mills et al., 1976; Johnson et al., 1976; Atha et al., 1979; Chu et al., 1984; Doyle et al., 1991, 1992a,b, 1994, 1997). This strategy is based on the following rationale: (1) Oxygenation-induced interactions that generate cooperativity within the Hb molecule are decoupled by dissociation of the tetramers into dimers. (2) The difference between the dimer-tetramer assembly free energy of a tetramer with $i$ bound oxygens and that of the unligated tetramer reflects the net energetic cost from the subsystem tertiary conformation changes (t → r), the pairwise interactions (salt bridges, H bonds), and quaternary transition (T → R) that accompany binding $i$ oxygens onto the tetramer’s α and β subunits. In principle these cooperativity components may be estimated equivalently as differences in successive binding free energies, or as the corresponding differences in dimer-tetramer assembly energies (Fig. 5b). Thus a cooperative free energy may be defined for each degree of binding:
$${}^i\Delta G_c = -RT \ln({}^iK_c) \tag{3}$$
where ${}^iK_c = {}^iK_2/{}^0K_2$ and each equilibrium constant ${}^iK_2$ ($i = 0, 1, 2, 3, 4$) characterizes formation of the tetrameric species having $i$ ligated
heme sites from αβ dimers bearing appropriate ligated hemes (Ackers and Halvorson, 1974; Johnson et al., 1976). Independent determinations of dimer-tetramer equilibrium constants for the end-state species (i.e., ${}^0K_2$ and ${}^4K_2$) have been carried out extensively using analytical gel chromatography (Ackers, 1970, 1975; Valdes and Ackers, 1979; Turner et al., 1992) in combination with stopped-flow kinetics and haptoglobin trapping kinetics (Ip et al., 1976; Ip and Ackers, 1977; Turner et al., 1981). These methods have been used to characterize more than 60 mutant and chemically modified human hemoglobins (Pettigrew et al., 1982; Doyle et al., 1992a,b; Turner et al., 1992). Values for the dimer-tetramer assembly constants at partial degrees of oxygenation (${}^1K_2$, ${}^2K_2$, and ${}^3K_2$) were resolved from analysis of concentration-dependent binding curves determined at a sequence of Hb concentrations, also yielding Adair constants of the linked dimers and tetramers (cf. Figs. 5 and 7). The difficulties in obtaining unique values of the four Adair constants (cf. Imai, 1982, 1990; Di Cera et al., 1987; Kister et al., 1987; Doyle and Ackers, 1992a) had led Marden et al. (1989) to conclude that unique affinities for partially ligated species “may be impossible.” This conclusion was in accord with early analyses (Ackers et al., 1975) that stimulated our research group to develop the linkage strategy based on Wyman principles, which incorporates multiple dimensions for resolving the system’s thermodynamic properties, even at the stoichiometric level (cf. Chu et al., 1984). It should be noted that the tetramer-dimer dissociation rate of normal deoxy-Hb is remarkably slow (e.g., $t_{1/2} = 10$ h at 21.5°C), corresponding to an Arrhenius activation energy of 33 kcal/mol (Ip and Ackers, 1977).
Such slow kinetic properties of the quaternary T tetramer’s disassembly reactions are totally consistent with the more rapid processes of tertiary and quaternary transition that are driven by heme site binding onto the already assembled tetramers (Eaton and Hofrichter, 1990; Henry et al., 1997). When tetramer-dimer dissociation reactions are linked to those of heme site binding in order to evaluate the thermodynamics of cooperativity, the Wyman linkage methodology is rigorous and the resulting information on cooperativity of the binding energetics is path independent. This great utility of the linked functions approach as applied to subunit assembly/hybridization techniques has been exploited extensively in the analyses of Hb microstate cooperativity that are reviewed in this chapter.
Relationships to the Tetramer Adair Function. Central to the usage of thermodynamic linkage cycles for evaluating contributions to binding cooperativity at both the stoichiometric and microstate levels is the realization that when cooperative free energies are calculated from a
ratio of dimer-tetramer assembly constants [Eq. (3)] they also provide a measure of the free energy contributions from tertiary and quaternary structural transitions that accompany the tetrameric binding in excess of those from binding to the same sites of dissociated dimers. The distribution of ${}^i\Delta G_c$ terms will also reflect the O2-linked free energies of heterotropic effector binding (Bohr protons, DPG, CO2, or chloride) in excess of the respective heterotropic effects on O2 binding by the dissociated dimers (Chu et al., 1984; Doyle et al., 1997). Reciprocally, a determined set of Adair constants $K_i$ (i.e., uncorrected for statistical factors) can also be used in combination with binding constants evaluated for the dissociated dimers, to yield the same cooperativity terms, ${}^i\Delta G_c$, through Eqs. (4a), (5a), (6a), and (7a), in which $k_\alpha$ and $k_\beta$ are the respective binding constants to α and β subunits of the dissociated dimers (Pettigrew et al., 1982; Doyle et al., 1997). When the dimeric sites have equal binding affinities (Mills et al., 1976; Chu et al., 1984; Doyle et al., 1997), $k_d = k_\alpha = k_\beta$, yielding Eqs. (4b), (5b), (6b), and (7b). These formulas have provided a framework whereby the free energy costs of tetrameric cooperativity may be evaluated either through determinations of ${}^iK_2$ and ${}^0K_2$ or equivalently from the tetramer binding constants $K_i$ in combination with those of the constituent dimers. Experimental implementation of this strategy, which links the dimer’s heme site binding reactions to those of the same sites within the assembled tetramers, has made it possible (1) to evaluate and correct for the effects of dissociated dimers on O2 binding curves, increasing the accuracy of the resolved tetrameric properties, and (2) to combine experimental information from multiple
techniques, thus minimizing the impact of systematic bias of any single method. This methodology has also been used to analyze the O2-linked subunit assembly reactions of Hb containing cobalt-substituted hemes (Doyle et al., 1991) and with mutant and chemically modified species (LiCata et al., 1990; Atha et al., 1979; Doyle et al., 1992a,b). An especially illuminating example of the interplay among these reciprocal linked functions is the study by Riggs and associates on the mutant Hb Kansas where the affinity of β subunits, $k_\beta$, is greatly reduced by the amino acid replacement (β102 Asn→Thr) (see Atha et al., 1979). Anticipating the extension of this approach to the microstate level (Section IV), we can reformulate the tetramer Adair function, Eq. (1), in terms of the cooperativity constants and affinities of the dissociated dimer sites ($k_\alpha$ and $k_\beta$, or $k_d$). When α and β subunits of the dimers have essentially identical intrinsic affinity $k_d$ (as with normal Hb over a wide range of conditions) the tetramer Adair function may be written
$$\bar{Y} = \frac{[{}^1K_c](k_d x) + 3[{}^2K_c](k_d x)^2 + 3[{}^3K_c](k_d x)^3 + [{}^4K_c](k_d x)^4}{1 + 4[{}^1K_c](k_d x) + 6[{}^2K_c](k_d x)^2 + 4[{}^3K_c](k_d x)^3 + [{}^4K_c](k_d x)^4} \tag{8}$$
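A minimal numerical sketch of how Eqs. (3) and (8) fit together is given below. It assumes the equal-affinity case, in which comparison of the denominators of Eqs. (1) and (8) implies $K_i = \binom{4}{i}\,{}^iK_c\,k_d^{\,i}$; the cooperativity constants and dimer affinity are hypothetical values, and the function names and the default temperature argument are assumptions of this sketch.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def coop_free_energy(Kc, temperature=294.65):
    """Eq. (3): cooperative free energy -RT ln(iKc) in kcal/mol (294.65 K = 21.5 C)."""
    return -R_KCAL * temperature * math.log(Kc)

def adair_from_coop(Kc, kd):
    """Adair constants implied by Eq. (8): K_i = C(4, i) * iKc * kd**i, i = 1..4."""
    return [math.comb(4, i) * Kci * kd ** i for i, Kci in enumerate(Kc, start=1)]

def saturation_eq8(x, Kc, kd):
    """Fractional saturation from Eq. (8); Kc = [1Kc, 2Kc, 3Kc, 4Kc]."""
    s = kd * x
    num = Kc[0] * s + 3 * Kc[1] * s ** 2 + 3 * Kc[2] * s ** 3 + Kc[3] * s ** 4
    den = 1 + 4 * Kc[0] * s + 6 * Kc[1] * s ** 2 + 4 * Kc[2] * s ** 3 + Kc[3] * s ** 4
    return num / den

def saturation_eq1(x, K):
    """Fractional saturation from the Adair equation, Eq. (1)."""
    num = sum(i * Ki * x ** i for i, Ki in enumerate(K, start=1))
    den = 4.0 * (1.0 + sum(Ki * x ** i for i, Ki in enumerate(K, start=1)))
    return num / den

# Hypothetical cooperativity constants and dimer affinity, for illustration only
Kc_demo = [0.5, 0.1, 0.05, 2.0]
kd_demo = 1.5
K_demo = adair_from_coop(Kc_demo, kd_demo)
print([round(coop_free_energy(k), 3) for k in Kc_demo])
```

With all four cooperativity constants equal to unity, the function reduces to the hyperbolic curve of the dissociated dimers, $k_d x/(1 + k_d x)$; a constant below unity corresponds to a positive cooperative free energy.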
If subunits within the dissociated dimers have differing affinities, the corresponding function may be written to incorporate formulas (4a), (5a), (6a), and (7a). This method of transforming the classical Adair binding function into a combination of dimer affinities and cooperativity terms that are derived from subunit assembly data played a crucial role in the early solution of the Hb molecular code (Ackers et al., 1992). At the stoichiometric level, Eq. (8) specifies how a knowledge of the ${}^iK_c$ values and dimer affinities $k_d$ will always provide a correct set of the Adair constants $K_i$ and partition function $Z(x)$. Extension of this concept to each of the ligation microstates (Smith and Ackers, 1985) was a major feature of the strategy (to be described in Section VI) that eventually yielded cooperative free energies for all intermediates of the HbO2 system (Ackers et al., 1992). This usage of dimer-tetramer assembly energies for evaluating cooperativity constants ${}^iK_c$ of tetramers having $i$ ligated heme sites is valid whenever sites of the dissociated dimers do not interact cooperatively; validity of the method does not require the dimer sites to have identical affinities with any of the tetramer-binding steps. The cooperative free energies ${}^i\Delta G_c = -RT \ln({}^iK_c)$ listed in Table I, col. 1, will be compared with those of the corresponding microstate tetramers in subsequent sections of this chapter.
Quaternary Enhancement. The discovery that assembled Hb molecules may exhibit higher O2 affinity than their dissociated subunits (i.e., “quaternary enhancement”) was initially made for the association of Hb β subunits into tetramers (Valdes and Ackers, 1978; Kurz and Bauer, 1978). In the case of the normal tetramer (α2β2) the O2 affinity at the fourth binding step had been inferred earlier from model analysis (Szabo and Karplus, 1972) to be higher than that of isolated α or β subunits. Experiments with normal Hb have found the tetramer’s affinity for the fourth O2 to exceed that of the dissociated dimers (or isolated monomers) over wide ranges of conditions (Mills et al., 1976; Mills and Ackers, 1979a,b; Imai and Yonetani, 1975; Di Cera et al., 1987; Chu et al., 1984; Ackers and Johnson, 1990; Doyle and Ackers, 1992b; Doyle et al., 1992b, 1994). The recent discovery that quaternary enhancement can be titrated to a negligible magnitude by increasing [NaCl] is illustrated in Fig. 7 (Doyle et al., 1997). Recent time-resolved circular dichroic studies of HbO2 photolysis intermediates by Kliger and associates (Ghelichkhani et al., 1996) have also supported the findings of quaternary enhancement in human Hb. Arguments against quaternary enhancement (Gibson and Edelstein, 1987) have been made on grounds that (1) measured O2 rebinding rates (after flash photolysis) did not require a sufficiently high equilibrium constant for quaternary enhancement (i.e., based on expected values for the “off” rates) and that (2) quaternary enhancement was thought to violate the concerted MWC model, which was believed to have a greater preponderance of supporting evidence than did the quaternary enhancement effect. By contrast, however, it had been demonstrated earlier (Ackers and Johnson, 1981) that thermodynamic constants determined for the linkage scheme (Fig.
5b) were fully compatible with the constraint requiring tetramers to follow a two-state MWC model, while their dissociated dimers constituted a “third state” in the sense of having an O2 affinity different from either the MWC model’s T or R tetramers. The aforementioned independent findings that dissociated dimers may bind O2 with lower affinity than the fourth tetramer step do not constrain the possible models of tetrameric allostery. The O2 affinities of dissociated dimers per se have no mechanistic bearing on their use as “built-in” reference reactions for gauging “thermodynamic distances” between the tetrameric species with which they share linked equilibria.
C. The Hill Coefficient
A traditional measure of cooperative interaction among the binding sites within a protein is the Hill coefficient $n_H = d \ln[\bar{Y}/(1 - \bar{Y})]/d \ln x$, which is usually determined as the slope of a logarithmically transformed binding curve (cf. Gutfreund and Edsall, 1978; Wyman and Gill, 1990). The maximum value of $n_H$, or its value at half-saturation, is frequently used as an index of the degree to which ligation events exert their mutual influences. Common usage of the above formula (Hill, 1910) has fostered the notion of $n_H$ as a measure of “the number of cooperating sites.” In Hill’s original derivation, a (nonintegral) number $n_H$ of Hb sites were assumed to react in concert with ligand (cf. Fersht, 1985). However, the above $n_H$ function was shown independently by K. Linderstrom-Lang to be a normalized statistical variance of the species abundances over all stoichiometric degrees of binding ($i = 0, 1, 2, 3, 4$ for tetrameric Hb), such that $n_H = [\overline{Y^2} - (\bar{Y})^2]/[\bar{Y}(1 - \bar{Y})]$ (cf. Cohn and Edsall, 1943; Edsall and Gutfreund, 1983, pp. 182-201; Wyman and Gill, 1990). Thus, cooperativity, as determined by the Hill coefficient, is a purely statistical characterization of the population distribution, and not a stoichiometric “number of cooperating subunits” within the tetramer. In principle, the $n_H$ vs. $x$ function, in combination with higher statistical moments of the binding curve, can be used to estimate the four Adair $K_i$ values (cf. Wyman and Gill, 1990). However, such characterizations of the global $\bar{Y}$ vs. $x$ binding curve cannot be used to obtain the unique contributions by individual microstates, for the reasons detailed below in Section IV. The methods by which this classical limitation has been overcome are a major focus of the present chapter. Whereas neither the Adair constants nor Hill’s $n_H$ can yield the specific intramolecular cooperativity of subunits within the Hb tetramer, the site-specific thermodynamic methods reviewed here have accomplished this fundamental goal.
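The statistical character of the Hill coefficient can be checked with a short sketch. It compares the slope definition of $n_H$ with a variance expression written in one explicit normalization, $n_H = \mathrm{var}(i)/[4\,\bar{Y}(1 - \bar{Y})]$, where $i$ is the number of ligands bound per tetramer; this normalization follows from $d\langle i \rangle / d \ln x = \mathrm{var}(i)$ and is an assumption of the sketch rather than a transcription of the cited formula. The Adair constants are hypothetical.

```python
import math

def species_fractions(x, K):
    """Fractions of tetramers with i = 0..4 ligands bound (K_0 = 1)."""
    weights = [1.0] + [Ki * x ** i for i, Ki in enumerate(K, start=1)]
    z = sum(weights)
    return [w / z for w in weights]

def saturation(x, K):
    """Fractional saturation: mean number of bound ligands divided by 4."""
    return sum(i * f for i, f in enumerate(species_fractions(x, K))) / 4.0

def hill_slope(x, K, h=1e-5):
    """n_H as the slope d ln[Y/(1-Y)] / d ln x, by central difference."""
    def logit(xv):
        y = saturation(xv, K)
        return math.log(y / (1.0 - y))
    return (logit(x * math.exp(h)) - logit(x * math.exp(-h))) / (2.0 * h)

def hill_from_variance(x, K):
    """n_H from the variance of the number of ligands bound per tetramer."""
    fractions = species_fractions(x, K)
    mean = sum(i * f for i, f in enumerate(fractions))
    variance = sum(i * i * f for i, f in enumerate(fractions)) - mean ** 2
    y = mean / 4.0
    return variance / (4.0 * y * (1.0 - y))

# Hypothetical Adair constants, for illustration only
K_noncoop = [4.0, 6.0, 4.0, 1.0]   # four independent, identical sites
K_coop = [0.04, 0.06, 0.04, 1.0]   # strongly cooperative
print(hill_slope(1.0, K_coop), hill_from_variance(1.0, K_coop))
```

For independent sites the variance is binomial and $n_H = 1$ at every $x$; for a population confined to the unligated and fully ligated species the same expression approaches 4, which is why $n_H$ describes the distribution of species rather than a count of cooperating subunits.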
IV. SITE-SPECIFIC ASPECTS OF OXYGEN BINDING
A. Microstate Components of the Adair Function
Each tetrameric species $ij$ of Fig. 3 may be characterized thermodynamically as the reaction product of $i$ ligands binding onto heme sites of the “deoxy” species 01 in the particular configuration $ij$. The resulting site-specific equilibrium constant is defined by $k_{ij} = [{}^{ij}\mathrm{HbX}_i]/([{}^{01}\mathrm{Hb}]\,x^i)$, where $[{}^{01}\mathrm{Hb}]$ and $x$ are thermodynamic activities (or ideal concentrations) of the unligated tetramer (species 01) and the unreacted heme site ligand, respectively, while $[{}^{ij}\mathrm{HbX}_i]$ denotes the concentration of ligation microstate $ij$ ($k_{01} = 1$). The fraction of binding sites occupied at each ligand activity $x$ (proportional to partial pressure for gaseous ligands) may be formulated by the law of mass action to yield
$$\bar{Y} = \frac{2[k_{11} + k_{12}]x + 2[2(k_{21} + k_{22}) + k_{23} + k_{24}]x^2 + 6[k_{31} + k_{32}]x^3 + 4k_{41}x^4}{4\{1 + 2[k_{11} + k_{12}]x + [2(k_{21} + k_{22}) + k_{23} + k_{24}]x^2 + 2[k_{31} + k_{32}]x^3 + k_{41}x^4\}} \tag{9}$$
Equation (9) is a site-specific formulation of the classic Adair function [Eq. (1)] and explicitly accounts for the free energy contributions ($\Delta G_{ij} = -RT \ln k_{ij}$) of all microstate tetramers to the fractional saturation $\bar{Y}$ (Ackers et al., 1992; Doyle and Ackers, 1992a; Huang and Ackers, 1995a). Comparison between Eqs. (1) and (9) shows that each Adair constant $K_i$ is a weighted sum of contributions from the configurational isomers $ij$, each having $i$ ligands bound:
$$\begin{aligned} K_1 &= [2k_{11} + 2k_{12}] & K_3 &= [2k_{31} + 2k_{32}] \\ K_2 &= [2k_{21} + 2k_{22} + k_{23} + k_{24}] & K_4 &= [k_{41}] \end{aligned} \tag{10}$$
The numerical factor $g_{ij}$ preceding each $k_{ij}$ denotes the number of ways that species $ij$ can be formed by reacting $i$ ligands with the unligated species 01 (Fig. 3). Thus the 10 microstate tetramers contribute to the binding curve according to their statistical weights $g_{ij} k_{ij} x^i$. Table II depicts the 16 formal tetramer configurations (col. d) and their relationships to the 10 unique combinations of heme site occupancy (col. c). Also shown are the correspondences between the 16 terms bearing statistical weights and the stoichiometric cooperativity constants ${}^iK_c$. Since each Adair $K_i$ is a weighted sum of the particular “microconstants” $k_{ij}$ for forming configurational isomers having $i$ ligands bound, it follows that all combinations of the nine $k_{ij}$ values which sum, within the respective brackets of Eq. (10), to the same four $K_i$ values will specify an identical binding curve $\bar{Y}$ vs. $x$. In principle, then, an infinite number of sets of the nine microstate constants $k_{ij}$ can predict each experimental binding curve [Eq. (9)] since the exact form of the curve is determined solely by the numerical coefficients of successive powers of $x$. Thus, while numerical fitting of a measured $\bar{Y}$ vs. $x$ data set (e.g., by nonlinear regression) can resolve only the four “best fit” $K_i$ parameters, such analysis cannot provide unique values of the nine constituent microconstants. This observation is a simple matter of principle that does not depend on the accuracy of experimental data nor on the statistical confidence limits of parameter estimates, etc. (what is not possible in principle cannot be improved in practice). As discussed in Section III, this limitation must also apply to other functions that depend solely on $\bar{Y}$, including the Hill coefficient $n_H$ and other “statistical moments” of the $\bar{Y}$ function (Wyman and Gill, 1990, pp. 74-76).
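The non-uniqueness argument can be made concrete with a small sketch. Two deliberately different, hypothetical sets of the nine microconstants are constructed so that the bracketed sums of Eq. (10) coincide; they then produce the same Adair constants and therefore the same binding curve. The dictionary keys follow the species labels $ij$, and all names are assumptions of this sketch.

```python
# Degeneracy factors g_ij for the nine ligated microstates, as in Eq. (10)
DEGENERACY = {"11": 2, "12": 2, "21": 2, "22": 2, "23": 1, "24": 1,
              "31": 2, "32": 2, "41": 1}

def adair_from_microstates(k):
    """Eq. (10): K_i is the degeneracy-weighted sum of the k_ij with i ligands bound."""
    K = [0.0, 0.0, 0.0, 0.0]
    for species, k_ij in k.items():
        i = int(species[0])  # first digit of the label is the number of ligands
        K[i - 1] += DEGENERACY[species] * k_ij
    return K

def saturation(x, K):
    """Fractional saturation from the Adair equation, Eq. (1)."""
    num = sum(i * Ki * x ** i for i, Ki in enumerate(K, start=1))
    den = 4.0 * (1.0 + sum(Ki * x ** i for i, Ki in enumerate(K, start=1)))
    return num / den

# Two hypothetical microconstant sets with very different site-specific values
set_a = {"11": 1.0, "12": 1.0, "21": 0.5, "22": 0.5, "23": 0.5, "24": 0.5,
         "31": 2.0, "32": 2.0, "41": 10.0}
set_b = {"11": 0.5, "12": 1.5, "21": 0.1, "22": 0.9, "23": 0.2, "24": 0.8,
         "31": 3.5, "32": 0.5, "41": 10.0}

K_a = adair_from_microstates(set_a)
K_b = adair_from_microstates(set_b)
print(K_a, K_b)
```

Because only the four sums enter the curve, no amount of precision in $\bar{Y}$ vs. $x$ data can distinguish set_a from set_b.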
TABLE II. Relationships between Cooperativity Constants of Hemoglobin Intermediates

| a. Number of sites occupied | b. Stoichiometric statistical weight | c. Microstate species $ij$ | d. Site configuration [α1 β1 α2 β2] | e. Site-specific statistical weight (general, per configuration) | g. Quaternary structure |
|---|---|---|---|---|---|
| (0) | 1 | 01 | 0000 | 1 | T |
| (1) | $4({}^1K_c)s$ | 11 | 1000, 0010 | ${}^{11}k_c \cdot s$ | T |
| | | 12 | 0100, 0001 | ${}^{12}k_c \cdot s$ | T |
| (2) | $6({}^2K_c)s^2$ | 21 | 1100, 0011 | ${}^{21}k_c \cdot s^2$ | T |
| | | 22 | 1001, 0110 | ${}^{22}k_c \cdot s^2$ | R |
| | | 23 | 1010 | ${}^{23}k_c \cdot s^2$ | R |
| | | 24 | 0101 | ${}^{24}k_c \cdot s^2$ | R |
| (3) | $4({}^3K_c)s^3$ | 31 | 1101, 0111 | ${}^{31}k_c \cdot s^3$ | R |
| | | 32 | 1110, 1011 | ${}^{32}k_c \cdot s^3$ | R |
| (4) | $({}^4K_c)s^4$ | 41 | 1111 | ${}^{41}k_c \cdot s^4$ | R |

[Columns f (experimental consensus) and h (molecular code parameters) of the site-specific statistical weights are not legible in this copy.]
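The reduction from 16 formal configurations to 10 unique microstates can be reproduced by enumeration. The sketch below assumes that the only operation identifying two configurations is the exchange of the two αβ dimers, i.e., (α1, β1, α2, β2) → (α2, β2, α1, β1), which is consistent with the configuration pairs listed in col. d; the function names are assumptions of this sketch.

```python
from itertools import product

def swap_dimers(config):
    """Exchange the two alpha-beta dimers: (a1, b1, a2, b2) -> (a2, b2, a1, b1)."""
    a1, b1, a2, b2 = config
    return (a2, b2, a1, b1)

def unique_microstates():
    """Group the 16 occupancy patterns into classes related by dimer exchange."""
    classes = {}
    for config in product((0, 1), repeat=4):
        # Use the lexicographically smaller member as the class representative
        representative = min(config, swap_dimers(config))
        classes.setdefault(representative, []).append(config)
    return classes

microstates = unique_microstates()

# Degeneracies g_ij collected by the number of ligands bound
degeneracies = {}
for members in microstates.values():
    ligands = sum(members[0])
    degeneracies.setdefault(ligands, []).append(len(members))

for ligands in sorted(degeneracies):
    print(ligands, sorted(degeneracies[ligands], reverse=True))
```

The class sizes obtained this way are the numerical factors $g_{ij}$ of Eq. (10), and their sums at each degree of ligation recover the statistical factors 1, 4, 6, 4, 1 of the Adair constants.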
B. Implications for Allosteric Model Analysis
Analyses of experimental binding curves ($\bar{Y}$ vs. $x$) in terms of allosteric models have often been conducted by reformulating the Adair constants $K_i$ as combinations of model parameters (e.g., $L$, $c$, and $k_R$ of MWC) which portray assumptions of the particular model chosen. Then, model parameter values that are consistent with each experimental binding curve may be estimated by fitting the data ($\bar{Y}$ vs. $x$) to the transformed Adair equation. This strategy provides an important means of estimating contributions of the molecular processes within an already-assumed model (e.g., Szabo and Karplus, 1972; Ackers and Johnson, 1981; Gill et al., 1987; Di Cera, 1990) but seldom affords a unique way of discriminating between alternative models such as MWC, KNF, or more detailed extensions thereof (see Johnson et al., 1984; Lee et al., 1988, for discussion of these issues). Such ambiguity may also result from the aforementioned limitation that data-fitting algorithms are capable only of analyzing a given binding curve to yield coefficients of the four powers of $x$, i.e., the stoichiometric Adair constants $K_i$ or their equivalent. Since the four resolved $K_i$ values cannot be used to calculate a unique set of constituent microstate constants $k_{ij}$, neither can they provide unique tests for models whose rules impose microstate-level properties.
As an example of the last point, the concerted MWC model (Monod et al., 1965) postulates two alternative quaternary structures (T and R) along with the special rule that all subunits in quaternary structure “T” have the same tertiary conformation (designated t), whereas they have a second tertiary conformation (designated r) in the alternative quaternary form “R.” The changing net affinities at successive binding steps are generated solely through a progressively increasing abundance of the high-affinity (R) species relative to those with the low (T) affinity; i.e., this (“two-state”) model assumes cooperativity properties that vary with the number of bound ligands, irrespective of site configuration. Critical tests of the simplest two-state model for tetrameric hemoglobin would thus require determination of cooperativity properties for all the configurational isomers at each stoichiometric degree of heme site ligation. By contrast, the sequential KNF model postulates that changes in affinity arise during the binding sequence from altered near-neighbor subunit interactions as a result of ligand-driven tertiary conformation changes. These divergent mechanisms have long been known to represent O2 binding isotherms with equal accuracy (Koshland et al., 1966), demonstrating that a unique mechanism cannot be established from classical ligand saturation curves alone. It has been proposed (Edelstein, 1971) that a specific bell-shaped relationship between maximal $n_H$ values for a series of mutant Hbs and
their apparent MWC $L$ values provided a rigorous proof of the MWC model vs. the KNF model. While this claim has been challenged on the basis that a wide range of experimental data do not show the purported bell-shaped relationship (Minton, 1971; Bunn and Guidotti, 1972; Imai, 1973), it is also clear that the limited information content of the $n_H$ function noted above and in Section V,C would by itself invalidate such proposals. Only partial relief from the ambiguities of stoichiometric-level resolution may be gained by combining databases that simultaneously reflect additional dimensions of functional behavior such as temperature, pH, [DPG], and [NaCl], or from hemoglobins bearing mutationally altered residues (Johnson et al., 1984; Lee et al., 1988). The realization that a unique evaluation of thermodynamic properties for the partially ligated Hb microstates is not possible from global binding curves has motivated the development of methods that explicitly resolve the microstate contributions (e.g., Yonetani et al., 1974; Imai et al., 1977, 1980; Perrella and Rossi-Bernardi, 1981; Miura and Ho, 1982; Simolo et al., 1985; Smith and Ackers, 1983, 1985; Shibayama et al., 1987, 1995). In principle, such methods permit the researcher to first conduct a model-independent determination of cooperativity parameters at the level of the ligation microstates and subsequently to assess the relationships between heme site ligation, tertiary and quaternary structural transitions, and the responses to heterotropic effectors, temperature, etc.
Experimental correlation between the energetic responses and the structural transitions when each microstate species binds an additional ligand onto one of its vacant sites provides information at a level of molecular detail that is comparable to the postulates of traditional allosteric models (Ackers, 1990). The determined microstate contributions may therefore be tested against mechanistic assumptions with greater discrimination than is ever possible at the stoichiometric (Adair) level.
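The reformulation of Adair constants in terms of model parameters mentioned at the start of this subsection can be sketched for the two-state MWC case. The sketch uses the standard textbook form of the model, with $L$ the ratio of unligated T to unligated R tetramers, $c = k_T/k_R$, and $k_R$ the intrinsic affinity of the R state; the parameter values are hypothetical and the function names are assumptions. Under this model every configurational isomer with $i$ ligands has the same microconstant, which is the microstate-level property that binding curves alone cannot test.

```python
import math

def mwc_microconstant(i, L, c, kR):
    """MWC microconstant shared by every configuration with i ligands bound."""
    return kR ** i * (1.0 + L * c ** i) / (1.0 + L)

def mwc_adair_constants(L, c, kR):
    """Adair constants implied by MWC: K_i = C(4, i) times the shared microconstant."""
    return [math.comb(4, i) * mwc_microconstant(i, L, c, kR) for i in range(1, 5)]

def saturation_adair(x, K):
    """Fractional saturation from the Adair equation, Eq. (1)."""
    num = sum(i * Ki * x ** i for i, Ki in enumerate(K, start=1))
    den = 4.0 * (1.0 + sum(Ki * x ** i for i, Ki in enumerate(K, start=1)))
    return num / den

def saturation_mwc(x, L, c, kR):
    """Closed-form MWC saturation for four sites."""
    a = kR * x
    num = a * (1.0 + a) ** 3 + L * c * a * (1.0 + c * a) ** 3
    den = (1.0 + a) ** 4 + L * (1.0 + c * a) ** 4
    return num / den

# Hypothetical MWC parameters, for illustration only
L_demo, c_demo, kR_demo = 1.0e4, 0.01, 1.0
K_mwc = mwc_adair_constants(L_demo, c_demo, kR_demo)
print(K_mwc)
```

Fitting $L$, $c$, and $k_R$ to a binding curve therefore constrains only the four coefficients $K_i$; the assumption that isomers such as 21, 22, 23, and 24 share one microconstant is imposed by the model rather than tested by the data.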
C. Impediments to Microstate Resolution
Given the inherent desirability of analyzing Hb cooperativity at the microstate level of resolution, why were determinations of the nine site-specific constants $k_{ij}$ [Eq. (10)] not carried out during the 25 years subsequent to publication of the first Hb crystal structures (i.e., Perutz et al., 1960)? From the standpoint of experimental feasibility the answer to this question is that the following obstacles had to be overcome: (1) Abundances of the partially ligated intermediates are generally low relative to the end-state species 01 and 41, making them difficult to analyze (e.g., see Fig. 6). (2) Rapid dissociation of heme-bound O2 has
precluded isolating and studying the native HbO2 intermediates in pure form. (3) Dissociation of partially ligated Hb tetramers into dimers leads to reassembly reactions that form tetramers with different combinations of occupied sites. Next, in Section V, we consider conceptual strategies and experimental methods that have circumvented these barriers and have led to the determination of cooperativity terms for the nine microstates of human hemoglobin.
V. EXPERIMENTAL DETERMINATION OF SITE-SPECIFIC COOPERATIVITY TERMS
A. Active Site Analogs of Oxygenation
To circumvent the obstacles just noted, pure samples of the “symmetric” doubly ligated species 23 and 24 have often been prepared and studied using (a) tightly bound oxygenation analogs that mimic the native heme sites (e.g., CO, CN-met, or NO); (b) metal-substituted hemes that mimic the stereochemistry of natural Fe2+ heme, e.g., replacement by Co2+, Zn2+, Mn3+, or Cr3+; or (c) metal-substituted hemes in combination with non-oxygen ligands (e.g., Co2+/FeCN). Usage of these analog systems has had a logic similar to that of systems which mimic enzyme-substrate complexes. While such analogs are often physiologically unacceptable due to the formation of “dead-end” complexes, inappropriate reaction rates, or deleterious side reactions, their stereochemistry has been exploited powerfully to elucidate both active-site mechanisms and allosteric regulation. The allosteric enzyme aspartate transcarbamylase (ATCase) has thus been characterized extensively through binding of the analog PALA [N-(phosphonacetyl)-L-aspartate], which mimics the native enzyme-substrate complex (cf. Schachman, 1988). For hemoglobin, the structural features of surrogate “oxy” heme sites must mimic those of the Fe2+O2 site within the native molecule, whereas surrogates of the “deoxy” site must have structural features of the native Fe2+ Hb heme, in accord with stereochemical requirements that were delineated by Perutz (1976, 1979, 1990; see also Perutz et al., 1987). When these requirements are met, the analog-containing hemoglobins have been observed to assume the respective T or R quaternary structures.
The first three analog systems for which complete microstate free energy distributions were determined [Fe2+/Fe3+CN; Fe2+/Mn3+; Co2+/Fe2+CO (see Ackers, 1990)] had previously been shown to have similar tertiary and quaternary structures to normal Hb and to elicit normal quaternary structural response to heme site ligation (Fermi et al., 1982; Yonetani et al., 1974; Imai et al., 1977; Ikeda-Saito and Verzilli, 1981; Hoffman et
GARY K. ACKERS
al., 1975; Moffat et al., 1976). These analog systems were thus used as prototypes for initially resolving the rules of Hb cooperativity, even though quantitative differences from native Fe2+/Fe2+O2 were expected (Ackers, 1990). While it is reasonable to expect that analogs which conform to normal Hb stereochemistry and execute the overall T → R quaternary response to ligation would also follow rules of the native HbO2 system at intermediate states of ligation, such analogs generally show quantitative deviations from native behavior (as with enzyme-substrate analogs that exhibit altered $K_m$ and $k_{cat}$ while mimicking the native catalytic mechanism). It has thus been central to the strategy of the present authors’ research program to analyze a range of chemically diverse analogs that were known to execute the T → R quaternary structural transition upon overall heme site ligation. The molecular code mechanism for Hb oxygenation (Ackers et al., 1992) was thus not formulated under the premise that mechanistically valid O2 analogs would necessarily have functional responses that are quantitatively identical to those of the native system or of each other. By contrast, the following strategy was used: (1) The common characteristics for a range of analogs were determined at the microstate level. (2) A general (consensus) binding function was formulated, incorporating the characteristic functional and structural relationships exhibited by ligation intermediates of the various analog systems. (3) The native HbO2 parameters were evaluated by constraining the consensus relationships to conform with Adair constants from direct oxygen binding. It was emphasized in Ackers (1990, p.
380) that “in our choice of the CN-met system for the first complete resolution of the intermediate species, we may have serendipitously forced the molecule to reveal, in bold caricature, a state that is usually manifested with more subtlety in other ligands.” These issues, which lie at the very heart of efforts to obtain a site-specific solution to the mechanism of Hb allostery (and of other complex macromolecular systems), are revisited for more detailed and specific consideration in subsequent sections of this chapter.
B. Linkage Cycle Analysis of the Partially Ligated Intermediates
Contributions to cooperativity by all eight ligation intermediates have been evaluated extensively using an experimental strategy that takes advantage of the natural dissociation of Hb tetramers into their constituent αβ dimers (i.e., α1β1 and α2β2; cf. Fig. 1). This strategy, which uses thermodynamic reference cycles (Ackers and Halvorson, 1974; Smith and Ackers, 1985), had been applied previously to the stoichiometric level of HbO2 cooperativity, as shown in Fig. 5 (Mills et al., 1976; Mills and Ackers, 1979a,b; Atha et al., 1979; Chu et al., 1984; see also Doyle
and Ackers, 1992a; Doyle et al., 1997), and was subsequently extended to the microstate level (Smith and Ackers, 1983, 1985). It was first established that the dissociated dimers bind heme site ligands noncooperatively (Mills and Ackers, 1979a; Ackers and Johnson, 1990; Doyle and Ackers, 1992a) with affinities for O2 that are essentially identical to those of separated α and β subunits (Chu et al., 1984; Doyle et al., 1997). Thus ligation of the dimeric hemes provides a “built-in” reference reaction for assessing energetic contributions to cooperativity that accompany the 16 tetramer ligation steps of the binding cascade (Fig. 4). Cooperativity arising from ligation-induced tertiary and quaternary structure changes (including the breakage of noncovalent bonds) may thus be reflected in the energetic “penalties” for ligating a specific set of tetrameric sites, compared with the same sites on dissociated dimers. The thermodynamic linkage cycle of Eq. (11) has provided a means of determining the cooperativity constant ${}^{ij}k_c$ from measured dimer-tetramer assembly equilibrium constants ${}^{ij}k_2$ and ${}^{01}k_2$, which are combined according to Eq. (12) (experimental techniques by which ${}^{ij}k_2$ parameters of the microstate tetramers have been determined are discussed in Section V,D):
$${}^{ij}k_s \big/ \left[(k_\alpha)^{p}\,(k_\beta)^{q}\right] = {}^{ij}k_2 \big/ {}^{01}k_2 = {}^{ij}k_c \qquad (12)$$

In Eq. (11), ${}^{ij}k_2$ and ${}^{01}k_2$ are the equilibrium constants for assembly of species ij and 01 from their constituent dimers; $k_\alpha$ and $k_\beta$ are binding constants for α and β subunits of the dissociated dimers; and p and q are respective numbers of ligated α and β subunits within the dimers that serve as the reference reaction for determination of each particular ${}^{ij}k_c$ (Smith and Ackers, 1985; Ackers et al., 1992; Doyle and Ackers, 1992b; Huang and Ackers, 1995b). By path independence of free energy, each ${}^{ij}\Delta G_c$ term obtained by measuring ${}^{ij}k_2$ and ${}^{01}k_2$ [i.e., ${}^{ij}\Delta G_c = -RT \ln({}^{ij}k_2/{}^{01}k_2)$] is identical to that which would be found if one could directly measure the reaction processes on the right and left of Eq. (12). The strategy of Eq. (11), whereby measured dimer-tetramer assembly free energies are employed to determine “thermodynamic distances” between the 10 microstate
tetramers (i.e., each distance being relative to species 01), is rigorously exact and is independent of the specific magnitudes of dimer-binding energies regardless of whether the site affinities of dissociated dimers ($k_\alpha$, $k_\beta$) are equal or whether affinities are different for α and β subunits within the tetramers. Such differences are automatically taken into account in the calculation of ${}^{ij}k_c$ by Eq. (12). All other methods of calculating free energy distances between the tetrameric microstates from these ${}^{ij}k_2$ databases (e.g., Di Cera, 1995) are entirely equivalent as a consequence of the path independence of free energy.
1. Nondissociating Heme Site Analogs
For nondissociating ligands or analog heme sites (e.g., created by metal substitution or surrogate ligands that are covalently bonded to the heme) a value of ${}^{ij}\Delta G_c$ determined from dimer-tetramer assembly data represents the molar free energy for replacement of the “deoxy sites” having configuration ij within the tetramer by their (surrogate) “oxy sites” minus the free energy of similar replacement reactions at corresponding sites of the dissociated dimers. The energetic contributions of such “replacement reactions” in thermodynamic cycles like Eq. (11) are conceptually rigorous for evaluation of the microstate cooperativity terms. Even when the replacement reactions per se are not practicable, the correct value of ${}^{ij}k_c$ is still determined by measurements on the assembly reactions of species ij and 01. Each ${}^{ij}\Delta G_c$ value measures the free energy for i mol of (equivalent) site replacement within tetrameric species 01, which produces the surrogate species ij, in excess of that for similar “replacement” at corresponding sites of the dissociated reference dimers. A striking example of nondissociable ligation analogs is that of ruthenium carbonylporphyrin Hb (Ishimori et al., 1989), which has been a model for carbon monoxide ligation even though the “ligand” is nondissociably bonded to the heme site. Although this assembly-linkage approach to determining cooperativity constants ${}^{ij}k_c$ for tightly bound, or nondissociable, O2 analogs is thermodynamically rigorous, the resulting values cannot be assumed identical with ${}^{ij}k_c$ terms for the native HbO2 system. It has therefore been necessary to develop strategies for “translating” the ${}^{ij}k_c$ distributions into those of the native HbO2 system that do not rely on any assumed identity of ${}^{ij}k_c$ values. Strategies that have been successful in achieving this goal are discussed in Section VI.
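The assembly-linkage calculation of Eq. (12) can be illustrated with a minimal Python sketch. The function name, the temperature, and the assembly free energy of species ij are hypothetical choices made for illustration; only the relation ${}^{ij}\Delta G_c = -RT \ln({}^{ij}k_2/{}^{01}k_2)$ and the -14.3 kcal value quoted in the text for unligated tetramers are taken from the chapter.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def cooperativity_from_assembly(dg2_ij, dg2_01, temp_k):
    """Return (k_c, dG_c) for microstate ij.

    dg2_ij, dg2_01: dimer-tetramer assembly free energies (kcal/mol) of
    species ij and of the unligated reference species 01.
    """
    # Because dG2 = -RT ln k2, the ratio k2_ij / k2_01 of Eq. (12) becomes
    # a simple difference of the two assembly free energies.
    dg_c = dg2_ij - dg2_01
    k_c = math.exp(-dg_c / (R_KCAL * temp_k))
    return k_c, dg_c


# Hypothetical input: species ij assembles 3 kcal/mol less favorably than
# species 01 (-14.3 kcal/mol); the temperature is assumed to be 21.5 C.
k_c, dg_c = cooperativity_from_assembly(dg2_ij=-11.3, dg2_01=-14.3, temp_k=294.65)
print(f"dG_c = {dg_c:.1f} kcal/mol, k_c = {k_c:.3g}")
```

Note that the dimer affinities never enter the function, which mirrors the statement that the result is independent of the magnitudes of the dimer-binding energies.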
2. Structural Origins of the Thermodynamic Ligation-Assembly Linkages
The measured affinity for binding O2 (or another ligand) onto the α or β site of a dissociated dimer (α1β1) is a net resultant of the chemical
bonding energy for O2 onto the heme iron, balanced against the unfavorable energy of conformation change which the heme-plus-protein must undergo to accommodate stereochemical requirements of the O2 bonding product (Fig. 2) (cf. Perutz, 1976; Gelin and Karplus, 1977; Perutz et al., 1987). This net free energy of reacting O2 at α or β heme sites of the dissociated dimer is found to be -8.3 kcal/mol at a standard set of conditions (Mills et al., 1976; Chu et al., 1984; Doyle et al., 1997). Whereas the dissociated dimers bind O2 noncooperatively with affinities nearly equal to their (monomeric) subunits (Mills and Ackers, 1979b), these affinities are modulated when the dimers are subject to additional structural constraints including the intersubunit H bonds and salt bridges that must be overcome to accommodate bonding at the heme site, as follows:
a. Free energy of tertiary constraint. In an unligated quaternary T tetramer, the α1β1 dimer is tightly associated with a second dimer (α2β2) by noncovalent interactions worth -14.3 kcal in free energy (Ip and Ackers, 1977). However, the subunit tertiary structures are not under significant strain from these interface bonds in the absence of ligation (Gelin and Karplus, 1977). Strain is induced when the first heme site is ligated and energy of the protein’s conformational accommodation opposes the Fe-O2 energy. Net free energy of the tetramer’s O2 binding reaction is less favorable than for binding to the dissociated dimers because the ligation-induced conformation change required of the protein now also includes conformational work against the structural constraints imposed by the dimer-dimer interface. Net free energy from the binding reaction would be additionally reduced by positive free energy from breakage of the noncovalent bonds (Perutz, 1970). Conformational work against the interface constraints and/or from bond breakage is reflected in the 3 kcal reduction of net binding energy, i.e., yielding the -5.4 kcal that is observed for the initial O2 bound onto a tetramer vs. the -8.3 kcal/mol for the same reaction onto a dissociated dimer (Mills et al., 1976). This 3 kcal “penalty” for initial ligation within the T tetramer has been designated by the term “tertiary constraint energy” (LiCata et al., 1993).
b. Quaternary transition. When successive ligation steps lead to quaternary T → R transitions, those steps are also accompanied by “penalties” that reflect the net difference between the free energies of breaking the T interface bonds and of forming those of the R interface.
c. The cooperativity effect.
Increasingly favorable (net) binding free energies at successive steps result from the progressive accumulation of such ligation-driven free energy “penalties.” After a tetramer has been ligated once there may be a penalty of smaller magnitude for the next subunit
ligated, i.e., on the basis of altered conformational strain, bond breakage, or a combination of both. If the magnitude of the penalty depends on which of the possible steps is taken, the “cooperative energies” will have a “combinatorial” character through the binding cascade (e.g., Fig. 4). The findings reviewed in this chapter have shown that cooperativity in human hemoglobin follows this combinatorial pattern. When cooperativity (positive or negative) occurs in the binding sequence along any pathway of the reaction cascade (as evidenced by altered affinities during progressive ligation), values of ${}^{ij}\Delta G_c$ will reflect the ligation-linked contributions from the tertiary and quaternary structural transitions plus any coupling between them. Experimental ${}^{ij}\Delta G_c$ values may also reflect energetic contributions from linked reactions of heterotropic effector binding and release, including those known to occur at protein sites remote from the hemes (i.e., for Bohr protons, DPG, NaCl, CO2, etc.). Solvation energy changes may also be implicit contributors to these processes.
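The arithmetic behind the tertiary constraint energy can be made explicit with a short Python sketch. The two binding free energies (-5.4 and -8.3 kcal/mol) are the values quoted above; the function names and the temperature of 21.5 °C are assumptions introduced only for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def ligation_penalty(dg_tetramer, dg_dimer):
    """Free energy penalty (kcal/mol) for ligating a tetramer site
    relative to the same site on a dissociated dimer."""
    return dg_tetramer - dg_dimer


def affinity_ratio(penalty, temp_k):
    """Ratio of tetramer to dimer binding constants implied by a penalty."""
    return math.exp(-penalty / (R_KCAL * temp_k))


# First O2 bound to the T tetramer vs. the dissociated dimer.
penalty = ligation_penalty(dg_tetramer=-5.4, dg_dimer=-8.3)
ratio = affinity_ratio(penalty, temp_k=294.65)  # assumed temperature
print(f"penalty = {penalty:.1f} kcal/mol, affinity ratio = {ratio:.3g}")
```

The difference of the two quoted values is 2.9 kcal/mol, which the text rounds to a 3 kcal penalty.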
C. Statistical Weights of the Tetramer Binding Function
By incorporating the experimental ${}^{ij}k_c$ terms that are evaluated by Eq. (12) into the site-specific binding function, Eq. (9), the relationships of Eq. (10) that connect microstate parameters to the Adair constants [also yielding Eq. (8)] may be written
Thus Eq. (9) may be reformulated using statistical weights that occur as respective products of the dimeric site affinities $k_\alpha$ or $k_\beta$ and the tetrameric cooperativity terms ${}^{ij}k_c$ to yield the general site-specific isotherm, Eq. (14):
Equation (14) and its equivalent forms (Ackers et al., 1992; Doyle and Ackers, 1992a; Huang and Ackers, 1995a) comprise the essential thermodynamic framework for connecting ligation-linked subunit assembly data
with the site-specific hemoglobin binding function, Eq. (9). From Eq. (14) it follows that a determination of ${}^{ij}k_c$ terms for all HbO2 microstates, along with O2 affinities of the dissociated dimers ($k_\alpha$, $k_\beta$), would completely specify the ${}^{ij}k_s$ terms. This goal was accomplished in 1992 using a strategy described in Section VI. The relevant formulas may first be simplified by noting that, when $k_\alpha = k_\beta = k_d$, as in the case of O2 binding to normal human Hb (Mills et al., 1976; Mills and Ackers, 1979a), Eqs. (13) and (14) simplify to relations (15) and (16), respectively:
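$$\begin{aligned}
K_1 &= 2\left[{}^{11}k_c + {}^{12}k_c\right]k_d \\
K_2 &= \left[2\left({}^{21}k_c + {}^{22}k_c\right) + {}^{23}k_c + {}^{24}k_c\right]k_d^{2} \\
K_3 &= 2\left[{}^{31}k_c + {}^{32}k_c\right]k_d^{3} \\
K_4 &= {}^{41}k_c\,k_d^{4}
\end{aligned} \qquad (15)$$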
Equation (16) provides a site-specific breakdown of Eq. (8) into microstate statistical weights for the case of identical α and β affinities of the dimers, while Eq. (14) portrays the more general case. This formulation of the tetrameric Hb binding problem, whereby independently resolved linkage cycles [Eq. (12)] are incorporated into the site-specific binding function to yield Eqs. (14) or (16) (Smith and Ackers, 1985), was a logical extension of the analogous strategy (Ackers and Halvorson, 1974) that had been developed earlier for analysis of Hb oxygenation linkages at the stoichiometric level of resolution (Mills et al., 1976; Johnson et al., 1976; Atha et al., 1979; Chu et al., 1984; Doyle et al., 1997). Correspondence between linkage cycle formulations of the Hb binding partition function at the stoichiometric and site-specific levels is summarized by recognizing that the denominators of Eqs. (8) and (16) must be equal so that

$$\sum_{i=0}^{4} \binom{4}{i}\left[{}^{i}K_c\right]k_d^{\,i}x^{i} = \sum_{i=0}^{4}\sum_{[ij]} g_{ij}\left({}^{ij}k_c\right)k_d^{\,i}x^{i} \qquad (17)$$
This fundamental relationship [or its more general version, obtained by incorporating Eq. (14) on the right-hand side] could have served as a starting point for the present chapter. However, its most general
applications, which are considered below in Section VI, have been motivated by the experimental and historical developments presented in the foregoing sections. Table II provides a spreadsheet of relationships between the various terms of Eq. (17) that were used to evaluate free energy components of the Hb “molecular code” partition function.
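The bookkeeping of relations (15) and (17) can be checked numerically with a minimal Python sketch, assuming the case $k_\alpha = k_\beta = k_d$. The degeneracies are read off the coefficients of relations (15); the function names, the value of $k_d$, and the use of the standard Adair expression for fractional saturation are illustrative assumptions, not the chapter's own Eq. (16).

```python
from math import comb

# Degeneracies g_ij of the ten microstates, read off the coefficients
# that multiply each k_c term in relations (15).
DEGENERACY = {"01": 1, "11": 2, "12": 2, "21": 2, "22": 2,
              "23": 1, "24": 1, "31": 2, "32": 2, "41": 1}


def adair_constants(k_c, k_d):
    """Adair constants K_0..K_4 from microstate cooperativity constants."""
    K = [0.0] * 5
    for species, g in DEGENERACY.items():
        i = int(species[0])  # first digit = number of ligands bound
        K[i] += g * k_c[species] * k_d ** i
    return K


def saturation(K, x):
    """Fractional saturation from the binding polynomial sum_i K_i x^i."""
    z = sum(K_i * x ** i for i, K_i in enumerate(K))
    bound = sum(i * K_i * x ** i for i, K_i in enumerate(K))
    return bound / (4.0 * z)


# Noncooperative check: every k_c = 1 must give K_i = C(4, i) * k_d**i,
# i.e., the statistical factors on the left-hand side of Eq. (17).
k_d = 2.0  # hypothetical dimer binding constant, arbitrary units
K = adair_constants({s: 1.0 for s in DEGENERACY}, k_d)
assert all(abs(K[i] - comb(4, i) * k_d ** i) < 1e-12 for i in range(5))
print(K, saturation(K, x=0.5))
```

Replacing individual entries of the `k_c` dictionary with values different from 1 shows how a single microstate shifts one Adair constant, which is why the stoichiometric constants alone cannot resolve the microstate terms.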
D. Hybridization Techniques and Free Energy Distributions
Using stable analogs of hemoglobin’s native oxygenated and deoxygenated heme sites, experimentalists have long been able to create and study four of the Hb microstate species in pure form (i.e., species 01, 23, 24, and 41). Taking advantage of their dissociation into dimeric half-tetramers (cf. Fig. 1a) that subsequently undergo assembly into hybrid combinations (Fig. 8), researchers have successfully analyzed thermodynamic properties of the remaining six tetrameric species (Fig. 8). The
FIG. 8. Hybridization scheme for Hb microstates via dimer-tetramer dissociation and reassembly. By use of nonlabile ligands or ligand analogs, the “parent” species, AA or BB, are each prepared in pure form since their dissociation and reassembly does not rearrange the combinations of site occupancy. Six microstate tetramers are formed by hybridization of dissociated parent tetramers, as illustrated for the species 21 tetramer (upper right).
combined usage (and further development) of four experimental techniques proved essential to these studies: (1) analytical gel permeation chromatography (Ackers, 1970, 1975); (2) stopped-flow kinetics (Nagel and Gibson, 1972; Kellett and Gutfreund, 1970; Ip et al., 1976; Ip and Ackers, 1977); (3) haptoglobin kinetics (Ip et al., 1976; Ip and Ackers, 1977; Turner et al., 1992); and (4) low-temperature electrophoresis methods pioneered by Michele Perrella (see Perrella et al., 1981, 1983, 1990a; also LiCata et al., 1990). In various combinations, these techniques have provided a versatile repertoire for determining dimer-tetramer assembly free energies of the asymmetric hybrid species 11, 12, 21, 31, and 32 in the presence of their respective “parent species,” with which they are equilibrated. An example of the resolution that is routinely achieved for quantitation of microstate species in hybrid equilibria by the cryogenic isoelectric focusing (cryo-IEF) method is shown in Fig. 9. Initial data on microstate tetramers of Fe2+/FeCN (Smith and Ackers, 1985), Fe2+/Mn(III) (Smith et al., 1987) and Co2+/FeCO (Speros et al.,
FIG. 9. Cryo-IEF of hybrid mixtures forming CN-met species 21 (panel a) and 22 (panel b) at pH 7.4, 21.5°C (cf. Ackers et al., 1997); horizontal axis, distance. Species 21 was formed by mixing deoxyHbS (species 01) with CN-met HbA (species 41) and the mixture was incubated for 119 h. Species 22 was formed by mixing CN-met HbA species 23 with CN-met species 24 (HbS), and the mixture was incubated for 46 h.
1991) had uniformly indicated that the cooperativity mechanism was “combinatorial,” i.e., dependent on specific pathways through the reaction cascade (Fig. 4), and that species 21 made a different free energy contribution from that of species 23 or 24, as shown in Table I. While this finding was inconsistent with rules of the concerted “two-state” model (cf. Ackers, 1990), the discovery that CN-met species 22 had characteristics like species 23 and 24 (Perrella et al., 1990b) helped greatly to clarify the analog databases that were systematically under study. A series of studies on the effects of pH, mutational modification, and other structure-sensitive probes on the CN-met intermediates (Daugherty et al., 1991) then provided a basis for interpreting the thermodynamic information in terms of quaternary switchpoints (Daugherty et al., 1991) that led to the “symmetry rule” (or molecular code) mechanism (Ackers et al., 1992; Doyle and Ackers, 1992b; LiCata et al., 1993).
Resolved Microstate Distributions. Table I summarizes the distributions of cooperative free energies ${}^{ij}\Delta G_c = -RT \ln {}^{ij}k_c$ for six Hb ligation systems that were characterized under comparable solution conditions. Columns 2-5 list values for the nine ligation microstates of the initial three analog systems that had yielded reliable data by 1991. An initially incorrect value of the CN-met species 22 assembly free energy that was evaluated from kinetic data (Smith and Ackers, 1985) was detected by cryogenic electrophoresis (Perrella et al., 1990a) and traced (by M. A. Shea) to misassignment of one of the kinetic phases. This corrected data set for the nine CN-met Hb species (col. 2) and corresponding data at pH 8.8 (col. 3), plus values on the other two analog systems (cols. 4 and 5), were used in making thermodynamic inferences of the 1992 analysis (Ackers et al., 1992).
Column 1 lists cooperative free energies at the stoichiometric O2 binding levels for native Hb, ${}^{i}\Delta G_c = -RT \ln[K_i/(k_d)^{i}]$, as calculated from direct O2 binding data (Chu et al., 1984; Doyle and Ackers, 1992a; Doyle et al., 1997); $K_i$ are tetramer Adair constants, and $k_d$ is the O2 binding constant for sites of the dissociated (noncooperative) dimers. Thermodynamic relationships between stoichiometric cooperativity constants ${}^{i}K_c$ and their constituent microstate terms ${}^{ij}k_c$ are listed in Table II. Based on Eqs. (8) and (14)-(17), these relations provided model-independent constraints for reconciling microstate parameters with the stoichiometric ones obtained by analysis of binding curves measured at a sequence of total Hb concentrations (Fig. 5), in combination with independent dimer-tetramer assembly measurements on species 01 and 41.
VI. HOW THE MOLECULAR CODE WAS DECIPHERED
A. Cooperative Free Energies of HbO2 Analogs
During 1989-1992, extensive efforts were aimed at applying the conceptual framework discussed above to the emerging microstate databases in order to determine whether a common site-specific partition function could accommodate both a consensus of ${}^{ij}\Delta G_c$ distributions from the HbO2 analogs and the stoichiometric cooperativity terms ${}^{i}\Delta G_c$ of native HbO2. Based on the previously established structural and functional properties of the three analogs, it was assumed that they would each manifest essential rules of the native system. It was not assumed, however, that they would exhibit cooperativity in a quantitatively identical fashion to that of native HbO2, nor with respect to each other (see Ackers, 1990). These expectations were consistent with the earlier findings that O2 binding to cobalt-substituted Hb (Co2+/Fe2+O2) exhibits significantly altered affinity and cooperativity compared to native HbO2 (Yonetani et al., 1974; Imai, 1990; Doyle et al., 1991), but nevertheless exhibits a T → R quaternary transition and Bohr effect on oxygenation (Imai et al., 1977; Ikeda-Saito and Yonetani, 1980; Ikeda-Saito and Verzilli, 1981). With regard to the question of whether CN-met ligation of Hb is an appropriate analog for studying mechanisms of the native allosteric response, it was noted by Unzai et al. (1996) that (1) the crystal structure of CN-met Hb closely resembles that of oxy-HbA (Deatherage et al., 1976); (2) the cyanide-bound ferric heme is low spin, like that of normal oxygenated subunits (Scheidt and Reed, 1981); (3) CN-met hybrid species 23 and 24 had shown nuclear magnetic resonance (NMR) spectra that were similar to the spectrum of fully ligated species 41; and (4) the ferrous subunits of CN-met species 23 and 24 had shown rapid ligand-binding kinetics and high affinity for oxygen. Unzai et al.
(1996) pointed out, however, that certain techniques of direct oxygenation might not be applied to CN-met hybrid systems with reliability because of (a) autoxidation of the ferrous subunits, coupled with (b) unwanted reduction of met hemes that could result from efforts to control autoxidation by reducing agents or enzyme reduction systems. These cautions regarding direct O2 binding studies on partially ligated CN-met hemoglobins are in accord with the findings of Doyle and Ackers (1992a). However, these problems were eliminated in the Doyle and Ackers study by (a) appropriate usage of KCN, which suppresses loss of CN- from the ligated heme sites that otherwise leads to autoxidation and electron exchange. This problem was solved by the cryogenic electrophoresis techniques developed by Perrella and colleagues (Perrella and Rossi-Bernardi, 1981, 1994; Perrella et al., 1990b, 1994, 1998), which have also been used extensively and developed further in the present author’s laboratory (e.g., LiCata et al., 1990; Speros et al., 1991; Daugherty et al., 1991; Doyle and Ackers, 1992b; Huang et al., 1997). The cryogenic quenching procedure, which stabilizes partially ligated CN-met tetramers against subunit dissociation, also affords a clear-cut way to determine the extent of “valency exchange” that may have occurred during sample incubation (see Ackers et al., 1997). It was recently claimed (Shibayama et al., 1997) that in the hybridization experiments performed in the laboratories of Perrella and of Ackers to determine assembly free energy of the species 21 tetramer, cyanide release from the ligated heme sites might have allowed extensive electron exchange among heme sites and thus compromised the equilibrium free energies reported from the cryogenic determinations (i.e., Daugherty et al., 1991). However, it was recently reaffirmed and experimentally demonstrated (Ackers et al., 1997) that neither cyanide loss nor electron exchange occurs with the protocols that have been used for studies by the laboratories of Perrella (e.g., Perrella et al., 1983, 1990b) or of Ackers (LiCata et al., 1990; Daugherty et al., 1991) and that differ critically from the experiment presented by Shibayama et al. (1997) in support of their claim. These authors also suggested that the molecular code mechanism (Ackers et al., 1992) was based solely on the properties of a single analog (i.e., CN-met Hb) and that, since this analog does not perfectly mimic the cooperativity of HbO2, it was concluded that the molecular code mechanism is incorrect for the native HbO2 system. By contrast, the site-specific binding function that was deduced coequally from data on the three diverse analogs of Table I, cols.
2-5, used only their common qualitative behavior and not their magnitudes per se. Their consensus function was then used to solve the stoichiometric HbO2 values ${}^{i}K_c$ for their component microstate terms ${}^{ij}k_c$. While no single analog should be trusted to portray all aspects of the native system, the “free energy” of -10.1 kcal that was suggested by these authors (Shibayama et al., 1997) would have led to the same result that was found (Table I, col. 6) for the native HbO2 system by use of Eq. (19), described below in Section VI,A,2 (see Ackers et al., 1997, for additional discussion of these issues).
1. Characteristics of the Analog Microstate Distributions
Based on the conceptual framework of Eqs. (9)-(17) and the expectation that cooperative free energies ${}^{ij}\Delta G_c$ would reflect the heme site-driven tertiary and quaternary structural transitions plus any coupling
between them (see Section V,B), it was assumed (Ackers et al., 1992) that oxygenation analogs which conform to the native system’s molecular rules would exhibit qualitatively similar distributions among their nine ${}^{ij}\Delta G_c$

0.1 M), most solutes (whether electrolytes or nonelectrolytes) have a perturbing effect, which may be stabilizing or destabilizing, on the conformational transitions and binding interactions of all types of biopolymers. For electrolytes, these are called Hofmeister effects (see von Hippel and Schleich, 1969a,b; Arakawa et al., 1990; Collins, 1996; Baldwin, 1996, and references therein). Processes which increase the amount of protein surface exposed to the electrolyte solution (including dissolving and unfolding a native protein and dissociating a multimeric protein assembly) all exhibit similar rank orders of effects of different anions and cations. Compared at a fixed salt concentration, locally accumulated anions like I- or Br- favor processes in which additional protein surface is exposed to water, whereas locally excluded anions like F- favor processes that reduce the amount of protein surface exposed to water. Changes in the nature or concentration of the anion typically have more significant effects than do changes in the nature or concentration of the cation. In general, effects of high solute concentrations on biopolymer processes can have several origins. Increasing the concentration of any solute above 0.1 M (i.e., increasing the osmolarity of the solution) significantly reduces the thermodynamic activity of water, and thereby favors processes in which water of hydration is displaced from biopolymer surfaces (Tanford, 1969; Record et al., 1978; Colombo et al., 1992; Timasheff, 1993; Parsegian et al., 1995; Record and Anderson, 1995, and references therein).
The analysis of solute concentration-dependent effects on such processes is particularly simple when the solute is completely excluded from the vicinity of both product and reactant biopolymer surfaces, in which case only the osmotic effect of the solute (i.e., its effect on water activity) is involved. If the solute is not completely excluded from both products and reactants, then changes in its preferential interactions in the process must also be taken into account. Quantitative treatments of effects on biopolymer processes of both excluded and accumulated perturbing solutes are reviewed in Sections VI-VIII.
D. Characteristic Functional Forms of Dependences on Solute Concentration of Observed Equilibrium Constants ($K_{obs}$) and Transition Temperatures ($T_m$)
[SALT] AND [SOLUTE] EFFECTS ON BIOPOLYMER EQUILIBRIA
Changes in the concentration of a perturbing solute (or ligand) cause variations in the stoichiometric quotients of product and reactant concentrations ($K_{obs}$) that characterize biopolymer equilibria and in midpoint transition temperatures ($T_m$) that characterize biopolymer conformational changes. The dependences of these thermodynamic variables on solute concentration exhibit a variety of characteristic functional forms and magnitudes, especially when the perturbing solute concentration is relatively low (< 1 M). These functional forms depend on whether the solute is a nonelectrolyte or electrolyte and on whether the regions of the biopolymer affected by the conversion of reactants to products are part of a high-charge density polyelectrolyte (such as DNA) or are part of a low-charge (and/or low-charge density) short oligonucleotide, short oligopeptide, or protein. The dependence of $K_{obs}$ on solute concentration is particularly simple when the solute is a strongly site-bound ligand L, which binds to different numbers of sites on the reactants and products of a biopolymer equilibrium. If the ligand concentration is sufficiently large so that all the binding sites are fully saturated but is still small enough so that nonligating interactions of unbound solute molecules with products or reactants make negligible contributions to the solute-concentration dependence of $K_{obs}$, then $K_{obs}$ exhibits a power-law dependence on ligand activity $a_L$ ($K_{obs} \propto a_L^{\Delta N_L}$), where $a_L$ may often be approximated by the free ligand concentration. Here the exponent $\Delta N_L$ is the (stoichiometrically weighted) difference between the numbers of ligand-binding sites on the biopolymer product(s) and reactant(s). For this situation, ln $K_{obs}$ is a linear function of ln $a_L$ with slope $\Delta N_L$. At low enough ligand activities, the ligand-binding sites on one or more of the biopolymer participants in the equilibrium of interest will not be saturated and the power-law dependence of $K_{obs}$ on $a_L$ will no longer be observed.
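The log-log slope described above can be estimated from data by ordinary least squares. The following Python sketch uses synthetic, purely hypothetical values (a prefactor of 1e6 and an exponent of -3) to show that the slope of ln $K_{obs}$ versus ln $a_L$ recovers $\Delta N_L$; the function name is likewise an illustrative choice.

```python
import math


def loglog_slope(activities, k_obs):
    """Least-squares slope of ln K_obs versus ln a_L."""
    xs = [math.log(a) for a in activities]
    ys = [math.log(k) for k in k_obs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data obeying K_obs = K0 * a_L**dN with hypothetical K0 and dN.
true_dn = -3.0
acts = [0.01, 0.02, 0.05, 0.1]
kobs = [1e6 * a ** true_dn for a in acts]
print(f"estimated slope = {loglog_slope(acts, kobs):.3f}")
```

With real data the fit should be restricted to the range of ligand activity in which all sites are saturated, since the power law fails at low activities as noted above.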
In contrast to this effect of a strongly site-bound ligand, consider processes involving only uncharged (or weakly charged) biopolymers, perturbed by a solute that does not site-bind but is locally accumulated at or excluded from the surface(s) of reactants or products. If the amount of biopolymer surface area that interacts with this solute changes in the process, then $\ln K_{obs}$ for the corresponding equilibrium typically is found to be an approximately linear function of the concentration or activity of this perturbing solute. At low concentrations of the perturbing solute (below ~0.1 M), the variation of $K_{obs}$ with solute concentration is typically undetectably small. “Hofmeister” effects of the nature and concentration of salts on protein processes exhibit this behavior.

Another contrasting example of a solute effect is provided by processes involving only small ionic species with small numbers of charges, investigated in the presence of an excess of a 1:1 electrolyte. Use of the Debye-Hückel approximation [in linearized Poisson-Boltzmann (PB) theory] predicts
that $\ln K_{obs}$ varies linearly with the square root of the ionic strength (i.e., with the square root of the 1:1 salt concentration). This effect is predicted to become undetectably small at low salt concentrations.

In striking contrast to both Hofmeister and Debye-Hückel effects of salt concentration on $K_{obs}$ for processes that involve low-charge density or weakly charged species, equilibria in which the axial charge density of highly charged nucleic acid polyions is locally or globally affected exhibit strong power-law dependences of $K_{obs}$ on salt concentration (typically investigated over a portion of the range $10^{-3}$ to $10^{-1}$ M). The exponents characterizing these power-law dependences often are large in magnitude and relatively constant over the range of low salt concentrations investigated. In these situations, $\ln K_{obs}$ is a strong and approximately linear function of the logarithm of the salt concentration (or activity), and the effect of a given percentage change in salt concentration is almost equally large over the entire experimentally accessible range. Even though the ions of the salt do not site-bind to the nucleic acid (cf. Braunlin, 1995, for a review), the magnitude and functional form of the salt concentration dependence of $K_{obs}$ are equivalent to the effects on $K_{obs}$ that would be expected from changes in the concentration of a ligand that is bound with high affinity and high stoichiometry to sites on the higher axial charge density state of the nucleic acid and is displaced in any process that reduces that axial charge density locally (e.g., by oligocation binding) or globally (e.g., by strand dissociation). These effects of salt concentration on equilibria involving polyanionic nucleic acids are therefore distinct from Hofmeister and Debye-Hückel (ionic strength) salt effects, both in their functional form and in their persistence at low salt concentrations.
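The three functional forms contrasted above can be compared in a small sketch. All coefficients below are hypothetical and chosen only to illustrate the shapes: ln K_obs linear in ln[salt] (polyelectrolyte effect), linear in the square root of [salt] (Debye-Hückel form), and linear in [salt] (Hofmeister form). The sketch computes the change in ln K_obs produced by doubling the salt concentration at several starting concentrations.

```python
import math

# Hypothetical coefficients, chosen only to illustrate the functional forms.
POWER_LAW_EXPONENT = -5.0   # d ln K_obs / d ln [salt]
DEBYE_HUCKEL_COEFF = -2.0   # d ln K_obs / d sqrt([salt]), per M**0.5
HOFMEISTER_COEFF = -1.0     # d ln K_obs / d [salt], per M

def delta_ln_k_power_law(c, factor=2.0):
    """Change in ln K_obs from c to factor*c when ln K_obs is linear in ln c."""
    return POWER_LAW_EXPONENT * (math.log(factor * c) - math.log(c))

def delta_ln_k_debye_huckel(c, factor=2.0):
    """Change in ln K_obs from c to factor*c when ln K_obs is linear in sqrt(c)."""
    return DEBYE_HUCKEL_COEFF * (math.sqrt(factor * c) - math.sqrt(c))

def delta_ln_k_hofmeister(c, factor=2.0):
    """Change in ln K_obs from c to factor*c when ln K_obs is linear in c."""
    return HOFMEISTER_COEFF * (factor * c - c)

for c in (1e-3, 1e-2, 1e-1):
    print(
        f"[salt] = {c:g} M -> doubling changes ln K_obs by "
        f"{delta_ln_k_power_law(c):.3f} (power law), "
        f"{delta_ln_k_debye_huckel(c):.3f} (sqrt form), "
        f"{delta_ln_k_hofmeister(c):.4f} (linear form)"
    )
```

Only the power-law form gives the same change for a given percentage change in salt at every starting concentration; the other two forms fade toward zero as the salt concentration is lowered, which is the distinction drawn in the text.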
Even in systems where the concentration of some excess 1:1 electrolyte is numerically indistinguishable from the ionic strength, as calculated by the classic Debye-Hückel formula, salt effects on biopolymer equilibria cannot be analyzed accurately on the basis of Debye-Hückel (linearized PB) theory. Ionic strength should never be assumed to be a relevant composition variable for interpreting polyelectrolyte, Hofmeister, and osmotic effects of salt concentration on biopolymers, because all these effects result from preferential interactions that cannot be accurately approximated by Debye-Hückel theory. The use of “ionic strength” as a composition variable in analyzing salt effects on biopolymer equilibria has no fundamental theoretical basis and therefore can lead to serious quantitative errors. For example, two solutions at equal ionic strengths, one containing NaCl and the other MgCl2, are not equivalent with regard to the effects that these strong electrolytes of different charge types produce on processes involving biopolymers (see, e.g., Record et al., 1977, 1978, 1981; Lohman, 1985).
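To make the NaCl versus MgCl2 comparison concrete, the following sketch evaluates the standard ionic strength formula, I = 0.5 Σ c_i z_i², for two solutions. The concentrations (0.15 M NaCl and 0.05 M MgCl2) are hypothetical values chosen so that the ionic strengths coincide; the helper names are introduced here for illustration only.

```python
def ionic_strength(ions):
    """Ionic strength I = 0.5 * sum(c_i * z_i**2) for (concentration, charge) pairs."""
    return 0.5 * sum(c * z ** 2 for c, z in ions)

def nacl(c):
    """Fully dissociated NaCl at molar concentration c."""
    return [(c, +1), (c, -1)]

def mgcl2(c):
    """Fully dissociated MgCl2 at molar concentration c."""
    return [(c, +2), (2 * c, -1)]

I_nacl = ionic_strength(nacl(0.15))
I_mgcl2 = ionic_strength(mgcl2(0.05))

print(f"0.15 M NaCl : I = {I_nacl:.3f} M, cation = 0.150 M Na+")
print(f"0.05 M MgCl2: I = {I_mgcl2:.3f} M, cation = 0.050 M Mg2+")
```

The two solutions share one value of I yet differ in cation concentration, cation valence, and chloride concentration, so a single ionic strength number cannot distinguish them even though their effects on biopolymer processes differ.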
Conformational transitions of high-charge density nucleic acid polyions and of weakly charged globular proteins exhibit characteristic differences in the effects of salt concentration on midpoint temperatures, $T_m$. The effect of salt concentration on order-disorder transitions (denaturation) of nucleic acids persists even at very low salt concentration (e.g., Record, 1975). For a variety of order-disorder transitions that reduce the average axial charge density of polyanionic nucleic acids, $T_m$ is found to increase linearly with the logarithm of the concentration (or mean ionic activity) of the salt, up to concentrations in the range 0.1-1 M (e.g., Krakauer and Sturtevant, 1967; Privalov et al., 1969). By contrast, Hofmeister effects of salts on $T_m$ for the denaturation of globular proteins become significant only above ~0.1 M, and $T_m$ exhibits an approximately linear, rather than logarithmic, dependence on salt concentration [increasing or decreasing, depending on the nature of the salt (von Hippel and Schleich, 1969a,b; Baldwin, 1996, and references therein)]. At salt concentrations above 1 M, where polyelectrolyte behavior is suppressed by screening of coulombic interactions, salt effects on DNA denaturation analogous to the Hofmeister salt effects on protein denaturation are observed (Hamaguchi and Geiduschek, 1962).

In summary, ligand-binding equilibria and conformational transitions of nucleic acid polyelectrolytes exhibit characteristic functional forms and magnitudes of dependences of $K_{obs}$ or $T_m$ on electrolyte concentration, which differ significantly from those observed for the corresponding processes involving short nucleic acid oligoelectrolytes, protein polyampholytes, or small ions.
These differences justify the applicability of the term “polyelectrolyte effect” (Record et al., 1991; Zhang et al., 1996a) to describe the characteristic thermodynamic signature (i.e., the typically large magnitude and distinctive functional form) of the effect of salt concentration on processes that affect the axial charge density of polyanionic nucleic acids. This usage parallels that of the term “hydrophobic effect,” which describes the distinctive thermodynamic signature (i.e., the large $|\Delta C_p^\circ|$ and its consequences for $\Delta H^\circ$, $\Delta S^\circ$, $\Delta G^\circ$, and $K_{obs}$) of processes that alter the amount of nonpolar surface exposed to water (Tanford, 1980; Spolar and Record, 1994).

III. PREFERENTIAL INTERACTION COEFFICIENTS AS FUNDAMENTAL MEASURES OF THERMODYNAMIC EFFECTS DUE TO SOLUTE-BIOPOLYMER INTERACTIONS

A. Background
The definitions, thermodynamic relationships, and molecular interpretations of preferential interaction coefficients, represented in various
forms, have been extensively discussed from various perspectives (Eisenberg, 1976; Schellman, 1990; Timasheff, 1992; Record and Anderson, 1995). Experimental determinations of these coefficients have been implemented using different kinds of thermodynamic control variables and constraints (Timasheff, 1992, 1993; Eisenberg, 1994; Schellman, 1990; Zhang et al., 1996b, and references therein). In vitro methods have been applied generally to an aqueous solution containing a biopolymer and an excess of some relatively small perturbing solute, in order to evaluate the solute-biopolymer preferential interaction coefficient pertaining to the limit of infinite dilution of the biopolymer over a range of solute concentrations. Knowledge of the sign, magnitude, and concentration dependence of a preferential interaction coefficient forms a firm basis for constructing physical interpretations and for testing theoretical calculations of the observable effects of solute-biopolymer interactions (Schellman, 1990; Timasheff, 1992; Record and Anderson, 1995; Anderson and Record, 1995, and references therein). These coefficients also provide the most direct route to analyzing and interpreting how changes in the concentration of the perturbing solute (which may also act as a ligand) cause changes in the equilibrium distribution of reactants and products in processes involving biopolymers (Wyman, 1964; Record et al., 1978; Anderson and Record, 1993, 1995; Record and Anderson, 1995).

The simplest and most frequently investigated type of system that exhibits preferential interactions contains only three components: solvent water (component 1); a dilute solute (component 2), consisting of either an uncharged biopolymer or a charged biopolymer together with a charge-balancing number of oppositely charged small ions; and a solute (component 3) of much lower molar mass, consisting of either molecules or dissociated salt ions that are much more abundant than the biopolymer in solution.
[The physically uninformative numerical designations 1, 2, 3, which were introduced by Scatchard (1946), have become conventional (Eisenberg, 1976).] Here component 2 generally is referred to as a polymer, but the discussion in the following sections pertains equally well to systems where it is an oligomer, or any other type of nondiffusible solute whose concentration is sufficiently dilute in comparison to that of component 3. Thermodynamic analyses and molecular interpretations can be constructed most readily for the preferential interaction coefficient defined as the following partial derivative (Anderson and Record, 1993; Record and Anderson, 1995):
$$\Gamma_{3,2}^{0} = \lim_{m_2 \to 0} \left( \frac{\partial m_3}{\partial m_2} \right)_{T,P,\mu_3} \tag{1}$$
In Eq. (1), $m_3$ and $m_2$ are the molal analytical concentrations of solute
and biopolymer, respectively, and $\mu_3$ is the solute chemical potential. Henceforth in this article the superscript “0” denoting the limit of infinite dilution of component 2 will not be shown, because in all situations of interest here $m_2$ is sufficiently dilute that the limiting value of $\Gamma_{3,2}$, which does not depend on $m_2$, has been attained. [The constraints of constant temperature (T) and pressure (P) also are subsequently not explicitly indicated except where necessary for clarity.] The definition of $\Gamma_{3,2}$ given by Eq. (1) may not appear obviously related to solute-biopolymer interactions. The connection may be clarified by recognizing that, although in general $m_3$ and $m_2$ could be varied independently, the constraints specified by the partial derivative that defines $\Gamma_{3,2}$ [Eq. (1)] require that $m_3$ and $m_2$ be changed together in a way that maintains the constancy of $\mu_3$ at constant T and P. Consequently, this coefficient reflects all concentration-dependent sources of nonideality due to solute-biopolymer interactions, ranging from site binding to the weaker interactions that cause the local concentration of mobile solute ions or molecules near the biopolymer surface to differ from the corresponding bulk concentration.

Particular values of $\Gamma_{3,2}$ can be interpreted at the molecular level in terms of the “two-domain” model (e.g., see Timasheff, 1992; Record and Anderson, 1995) depicted schematically in Fig. 1, where for simplicity the biopolymer is shown as a sphere. The two domains are a “local” region of solution surrounding the surface of each biopolymer, and a “bulk” region whose thermodynamic characteristics are defined on the basis of the fundamental properties of a macroscopic membrane dialysis equilibrium (Record and Anderson, 1995). Mathematical details of the two-domain model for nonelectrolyte and electrolyte solutes are reviewed in Sections IV and V, respectively.
If a solute is preferentially accumulated in the local domain relative to its concentration in the bulk, $\Gamma_{3,2}$ is positive. In this case, according to Eq. (1), addition of biopolymer to a solution containing an accumulated solute requires concomitant addition of the solute (i.e., an increase in $m_3$) to maintain the constancy of its chemical potential ($\mu_3$). Negative values of $\Gamma_{3,2}$ indicate preferential exclusion of solute (preferential solvation), so that its local concentration is less than its bulk concentration. In this case, addition of biopolymer would require a concomitant reduction in $m_3$ to keep $\mu_3$ constant. For the (uncommon) situation where the local and bulk solute concentrations happen to be identical, $\Gamma_{3,2} = 0$. This quasi-ideal condition does not imply the absence of solute-biopolymer interactions, but rather that they are effectively balanced by solvent-biopolymer interactions. Such a balance cannot persist at all compositions of a real solution.
[Figure 1: a spherical biopolymer surrounded by a local domain ($B_3$ solute molecules, $B_1$ solvent molecules), embedded in a bulk domain ($n_3^{bulk}$ moles solute, $n_1^{bulk}$ moles solvent).]

FIG. 1. Schematic of the two-domain model of uncharged solute-uncharged biopolymer interactions in solution. The local domain surrounding each biopolymer contains (on average) $B_3$ molecules of solute and $B_1$ molecules of solvent (per molecule of biopolymer). The bulk domain contains $n_3^{bulk}$ mol of solute and $n_1^{bulk}$ mol of solvent. If $B_3/B_1 > n_3^{bulk}/n_1^{bulk}$, the solute is preferentially accumulated ($\Gamma_{3,2} > 0$).
The thermodynamic consequences of preferential interactions are perhaps most readily understood by considering a dialysis equilibrium between two solutions separated by a semipermeable membrane, because in this situation $\mu_3$, T, and (at low polymer concentration) P can be held constant, as specified by the definition of $\Gamma_{3,2}$ given in Eq. (1). Concepts from equilibrium dialysis also prove of central importance in the formulation of the two-domain model (reviewed in Sections IV and V), though in practice dialysis is less satisfactory as an experimental means of determining $\Gamma_{3,2}$ than are other thermodynamic methods (e.g., densimetry, osmometry) (see Eisenberg, 1994; Zhang et al., 1996b). Of the two solutions equilibrated across a dialysis membrane, one (designated $\beta$) contains only the membrane-permeable solute (component 3) and solvent; the other (designated $\alpha$) contains also the membrane-impermeable solute (component 2), which generally has a higher molar mass and larger molecular surface area than does the other solute component. The thermodynamic analysis of a dialysis equilibrium reviewed in the following subsections depends in detail on whether solute components 2 and 3 are uncharged or charged.
B. Uncharged Solutes: Definition in the Context of Dialysis Equilibrium
Provided that the osmotic pressure difference across a membrane at dialysis equilibrium is small enough (at low enough concentrations of component 2), the condition of dialysis equilibrium can be expressed by equating the thermodynamic activities of the membrane-permeable uncharged solute component in the two solutions:

$$m_3^{(\alpha)} \gamma_3^{(\alpha)} = m_3^{(\beta)} \gamma_3^{(\beta)} \tag{2}$$

Here $m_3^{(\alpha)}$ is the total molal concentration of uncharged component 3 in solution $\alpha$ (including bound and locally accumulated as well as bulk solute). The experimentally accessible thermodynamic distribution coefficient $\Gamma_{3,2}^{exp}$ characterizes solute-polymer interactions in a solution containing a finite polymer concentration at dialysis equilibrium:

$$\Gamma_{3,2}^{exp} \equiv \frac{m_3^{(\alpha)} - m_3^{(\beta)}}{m_2^{(\alpha)}} \tag{3}$$

From Eqs. (2) and (3), we have

$$\Gamma_{3,2}^{exp} = \frac{m_3^{(\alpha)}}{m_2^{(\alpha)}} \left( 1 - \gamma_{3,2} \right) \tag{4}$$

where

$$\gamma_{3,2} \equiv \gamma_3^{(\alpha)} / \gamma_3^{(\beta)} \tag{5}$$

At dilute polymer concentrations, $\gamma_{3,2}$ may be interpreted as the contribution to the nonideality of solute 3 in solution $\alpha$ arising from its interactions with the biopolymer. At sufficiently small values of $m_2$, $\Gamma_{3,2}^{exp}$ ceases to be a function of $m_2$ and hence equals the limiting value $\Gamma_{3,2}$ [as defined in Eq. (1)]. In this
limit the transmembrane quotient of differences $\Delta m_3 / \Delta m_2$ [Eq. (3)] may be equated to the derivative [Eq. (1)]:

$$\Gamma_{3,2} = \lim_{m_2 \to 0} \Gamma_{3,2}^{exp} = \left( \frac{\partial m_3}{\partial m_2} \right)_{T,\mu_1,\mu_3} \simeq \left( \frac{\partial m_3}{\partial m_2} \right)_{T,P,\mu_3} \tag{6}$$

In a dialysis equilibrium experiment, $\mu_1$, rather than P, is maintained constant while the composition of the three-component solution ($\alpha$) is changed. Under the conditions of interest here, the justification for replacing $\mu_1$ with P, the constraint specified by the partial derivative in Eq. (1), has been considered in detail (Eisenberg, 1976; Schellman, 1990). Accumulation (relative to solution $\beta$) of component 3 in the polymer-containing solution ($\alpha$) reflects preferential accumulation of that solute in the vicinity of polymer molecules; exclusion of component 3 from the polymer-containing solution (equivalent to preferential accumulation of solvent) reflects exclusion of that solute from the vicinity of each polymer molecule.
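As a rough numerical illustration of the dialysis quantities discussed in this subsection, the sketch below computes the transmembrane quotient of differences (solute molality in the polymer solution minus that in the polymer-free solution, divided by the polymer molality) and the activity-coefficient ratio implied by equating solute activities across the membrane. The molalities are hypothetical, and the function names are introduced here for illustration only.

```python
def gamma_exp(m3_alpha, m3_beta, m2_alpha):
    """Transmembrane quotient of differences, (m3_alpha - m3_beta) / m2_alpha."""
    return (m3_alpha - m3_beta) / m2_alpha

def activity_coefficient_ratio(m3_alpha, m3_beta):
    """gamma_3(alpha) / gamma_3(beta), from m3_alpha*gamma_alpha = m3_beta*gamma_beta."""
    return m3_beta / m3_alpha

# Hypothetical dialysis result: an excluded solute is less concentrated
# on the polymer side (alpha) than on the polymer-free side (beta).
m3_beta = 1.000    # mol/kg, polymer-free solution
m3_alpha = 0.951   # mol/kg, polymer-containing solution
m2_alpha = 0.001   # mol/kg, polymer

coeff = gamma_exp(m3_alpha, m3_beta, m2_alpha)
ratio = activity_coefficient_ratio(m3_alpha, m3_beta)

print(f"distribution coefficient = {coeff:.1f} (negative: solute excluded)")
print(f"activity coefficient ratio = {ratio:.4f} (> 1: unfavorable interaction)")
```

A negative coefficient together with a ratio above 1 corresponds to preferential exclusion; an accumulated solute would give a positive coefficient and a ratio below 1.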
C. Charged Solutes: Preferential Interaction (Donnan) Coefficients for Electroneutral Solute Components and for Single-Ion Species

The Donnan membrane equilibrium provides a useful experimental means of characterizing the thermodynamics of interactions of ions with charged polymers. Thermodynamic analysis of the Donnan equilibrium is analogous to that of equilibrium dialysis involving uncharged polymers and solutes (see Section III,B above). New effects arise, however, because the membrane-permeable component (3) consists of dissociated ions and the membrane-impermeable polymer is charged. For a charged polymer-salt solution at dialysis equilibrium, these effects include (1) equilibrium transmembrane differences in concentrations of the low molar mass salt ions (M+ and X-), determined by the most random mixture of these diffusible species that is consistent with the thermodynamic effects of all intersolute interactions and with (virtually exact) electroneutrality of the solutions on both sides of the dialysis membrane; (2) an equilibrium membrane potential, resulting from otherwise negligible deviations from electroneutrality in each of the solutions separated by the semipermeable membrane; and (3) an equilibrium osmotic pressure difference, which arises from the unequal chemical potentials of the nondiffusible polymer component on the two sides of the membrane. In analyses of the Donnan ion distribution that begin with the transmembrane equilibrium condition for the electroneutral salt component, the Donnan osmotic pressure difference and any microscopic deviations
from electroneutrality can be neglected, provided that the concentration of the nondiffusible component is sufficiently low. Then the Donnan equilibrium condition is

$$\left( m_+ m_- \gamma_\pm^2 \right)^{(\alpha)} = \left( m_+ m_- \gamma_\pm^2 \right)^{(\beta)} \tag{7}$$

where $\alpha$ and $\beta$ designate the polyion-containing and polyion-free compartments, respectively, as in the previous treatment of nonelectrolyte solutes. When the concentration of salt is in sufficient excess over the concentration of charges on the polyanion, to an excellent approximation the condition of electroneutrality can be applied to each solution:

$$m_+^{(\alpha)} = m_-^{(\alpha)} + |Z_J| m_J, \qquad m_+^{(\beta)} = m_-^{(\beta)} \tag{8}$$

where $Z_J$ is the net structural charge on the polyion and $m_J$ is its molal concentration, which is equal to the molal concentration of the electroneutral component, $m_{2J}$. Consistent with the notation in Anderson and Record (1993), the subscript 2J is used in the present chapter exclusively to designate the electroneutral poly-(or oligo-)electrolyte component, defined to consist of a poly-(or oligo-)ionic species J and an equivalent number of counterions, in this case univalent. [Use of this notation in Record and Anderson (1995) was unfortunately ambiguous, and we have attempted to be consistent in the present chapter.] Often the monomeric molal concentration, $|Z_J| m_J$, is used instead of $m_J$, because thermodynamic properties of “linear-chain” polyelectrolytes typically become independent of the degree of polymerization when it is sufficiently large that oligoelectrolyte behavior is no longer detectable. By the definition of components,

$$m_3^{(\alpha)} = m_-^{(\alpha)}, \qquad m_3^{(\beta)} = m_-^{(\beta)} = m_+^{(\beta)} \tag{9}$$

From Eqs. (7-9), one obtains the following Donnan equilibrium condition, where differences between $\gamma_\pm^{(\alpha)}$ and $\gamma_\pm^{(\beta)}$ generally cannot be neglected:

$$m_3^{(\alpha)} \left( m_3^{(\alpha)} + |Z_J| m_J \right) \left( \gamma_\pm^{(\alpha)} \right)^2 = \left( m_3^{(\beta)} \gamma_\pm^{(\beta)} \right)^2 \tag{10}$$

From Eq. (10), at low $m_J$, it follows that

$$m_3^{(\beta)} \simeq \gamma_{3,2} \left( m_3^{(\alpha)} + 0.5 |Z_J| m_J \right) \tag{11}$$
where [cf. Eq. (5)]

$$\gamma_{3,2} \equiv \gamma_\pm^{(\alpha)} / \gamma_\pm^{(\beta)} \tag{12}$$

To obtain Eq. (11) from Eq. (10), truncation of the expansion of the radical requires excess salt ($m_3^{(\alpha)} \gg |Z_J| m_J$). In Eq. (12), as in the nonelectrolyte case [Eq. (5)], $\gamma_{3,2}$ may be interpreted as the contribution to the nonideality of the electrolyte in compartment $\alpha$ that arises from interactions of electrolyte cations and anions with the charged polymer. The experimental thermodynamic salt-distribution coefficient (the Donnan quotient), which characterizes solute concentration-dependent nonideality due to electrolyte-charged polymer interactions, is defined by analogy to Eq. (3) for the nonelectrolyte case:

$$\Gamma_{3,2J}^{exp} \equiv \frac{m_3^{(\alpha)} - m_3^{(\beta)}}{m_J} \tag{13}$$

As in the development of Eq. (6), the quotient of transmembrane concentration differences becomes equal to a derivative at sufficiently small polyion concentration, so

$$\Gamma_{3,2J} = \lim_{m_J \to 0} \Gamma_{3,2J}^{exp} = \left( \frac{\partial m_3}{\partial m_{2J}} \right)_{T,\mu_1,\mu_3} \simeq \left( \frac{\partial m_3}{\partial m_{2J}} \right)_{T,P,\mu_3} \tag{14}$$
The thermodynamic salt-distribution coefficient $\Gamma_{3,2J}$, called the Donnan coefficient, is an experimentally accessible quantity which, under all conditions of interest in this review, is numerically indistinguishable from the preferential interaction coefficient defined in Eq. (1). Justification for the approximation implied by replacing the constraint of constant $\mu_1$ with constant P for the partial derivative in Eq. (14) has been discussed (for example) by Anderson and Record (1993). From Eqs. (9-12),

$$\Gamma_{3,2J} = \lim_{m_J \to 0} \left[ \frac{m_3^{(\alpha)}}{m_J} \left( 1 - \gamma_{3,2} \right) - 0.5 |Z_J| \gamma_{3,2} \right] \tag{15}$$

In the ideal limit of Eq. (15) where $\gamma_{3,2} = 1$, $\Gamma_{3,2J}^{id} = -0.5|Z_J|$. (The ideal value of the Donnan coefficient may be approached in solutions containing weakly charged polyions at sufficiently low salt concentrations.) Generally $\gamma_{3,2} < 1$ (i.e., the net effect of ion-polyion interactions is favorable) and $\Gamma_{-,J} > -0.5|Z_J|$.
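The ideal Donnan limit quoted above can be checked numerically. The sketch below assumes equal mean ionic activity coefficients on both sides of the membrane, so that the salt molality in the polyion compartment satisfies m3α(m3α + |Z_J| m_J) = (m3β)², and solves this quadratic exactly. The charge, molalities, and function names are hypothetical and introduced here for illustration only.

```python
import math

def ideal_m3_alpha(m3_beta, z_abs, m_j):
    """Salt molality in the polyion compartment for an ideal Donnan equilibrium.

    Positive root of m3a * (m3a + |Z| * mJ) = m3b**2.
    """
    half = 0.5 * z_abs * m_j
    return -half + math.sqrt(half ** 2 + m3_beta ** 2)

def ideal_coion_coefficient(m3_beta, z_abs, m_j):
    """(m3_alpha - m3_beta) / m_J, the salt (coion) distribution coefficient."""
    return (ideal_m3_alpha(m3_beta, z_abs, m_j) - m3_beta) / m_j

def ideal_counterion_coefficient(m3_beta, z_abs, m_j):
    """(m+_alpha - m+_beta) / m_J, using electroneutrality m+_alpha = m3_alpha + |Z| mJ."""
    m_plus_alpha = ideal_m3_alpha(m3_beta, z_abs, m_j) + z_abs * m_j
    return (m_plus_alpha - m3_beta) / m_j

# Hypothetical polyanion with 20 charges, dilute relative to 0.1 molal 1:1 salt.
Z_ABS, M_J, M3_BETA = 20, 1e-5, 0.1

coion = ideal_coion_coefficient(M3_BETA, Z_ABS, M_J)
counterion = ideal_counterion_coefficient(M3_BETA, Z_ABS, M_J)

print(f"coion coefficient per charge      = {coion / Z_ABS:.4f}")
print(f"counterion coefficient per charge = {counterion / Z_ABS:.4f}")
```

In excess salt the two per-charge values approach -0.5 and +0.5, and they always differ by exactly one unit per polyion charge, as macroscopic electroneutrality requires. Real polyelectrolytes deviate from these ideal values because the activity coefficients on the two sides are not equal.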
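For the counterion (here a positive ion):

$$\Gamma_{+,J}^{exp} \equiv \frac{m_+^{(\alpha)} - m_+^{(\beta)}}{m_J} = |Z_J| + \Gamma_{3,2J}^{exp} \tag{16}$$

where $\lim_{m_J \to 0} \Gamma_{+,J}^{exp} = \Gamma_{+,J}$. In the ideal limit, where $\gamma_{3,2} = 1$, $\Gamma_{+,J}^{id} = 0.5|Z_J|$. Generally $|Z_J| \ge \Gamma_{+,J} \ge 0.5|Z_J|$. On the basis of these single-ion preferential interaction coefficients, counterion accumulation in and coion exclusion from the polyion-containing compartment are specified relative to the usual ideal reference state for a Donnan equilibrium (the most random electroneutral mixture). Because the solution is macroscopically electroneutral, $\Gamma_+$ and $\Gamma_-$ are always related:

$$\Gamma_{+,J} = |Z_J| + \Gamma_{-,J} \tag{17}$$

Expressed per polymer charge, Eqs. (13) and (17) are

$$\frac{\Gamma_{3,2J}^{exp}}{|Z_J|} = \frac{m_3^{(\alpha)} - m_3^{(\beta)}}{|Z_J| m_J}, \qquad \frac{\Gamma_{+,J}}{|Z_J|} = 1 + \frac{\Gamma_{-,J}}{|Z_J|} \tag{18}$$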
IV. PREFERENTIAL INTERACTIONS OF NONELECTROLYTE MOLECULES WITH AN UNCHARGED BIOPOLYMER

A. General Two-Domain Analysis of Preferential Interactions of Nonelectrolytes
Record and Anderson (1995) analyzed a two-domain model for preferential interactions of an uncharged solute with a dilute uncharged polymer in solution. In this model, as depicted in Fig. 1, a “local” domain containing only solute (component 3) and solvent surrounds each polymer molecule, and a “bulk” domain, generally characterized by a different ratio of solute to solvent, separates all local domains. The two-domain model assumes that the distribution of solute and solvent within the vicinity of a given polymer is not affected by interactions with any other polymer. To meet this condition the solution must be sufficiently dilute in the polymer. By definition the local domain contains (on average) $B_3$ molecules of solute and $B_1$ molecules of solvent per polymer molecule. The bulk domain contains $n_1^{bulk}$ mol of solvent and $n_3^{bulk}$ mol of solute. Any preferential interactions with component 2 experienced by components 3 and 1, including but not limited to site binding of component 3 and solvation (e.g., hydration) by component 1, cause their mole ratio in the local domain ($B_3/B_1$) to differ from their mole ratio in the bulk domain ($n_3^{bulk}/n_1^{bulk}$), which in a sufficient excess of solute is indistinguishable from their overall ratio of molalities ($m_3/m_1$). (For water, $m_1$ = 55.5 mol/kg.) If component 3 is preferentially accumulated in [excluded from] the local domain, $B_3/B_1$ is greater [less] than $m_3/m_1$, and accordingly $\Gamma_{3,2}$ is positive [negative]. The (generally observed) inequality of $B_3/B_1$ and $m_3/m_1$ is the microscopic counterpart of the macroscopic observation (at dialysis equilibrium) of unequal mole ratios of components 3 and 1 in two solutions separated by a dialysis membrane impermeable only to component 2. As reviewed in the preceding section, this analogy has been exploited to derive explicit expressions for $\Gamma_{3,2}$ for uncharged and for charged species (Record and Anderson, 1995). Explicit expressions for preferential interaction coefficients that characterize nonelectrolyte solute-biopolymer nonideality are reviewed in the remainder of this section, and the corresponding expressions for electrolyte solutes and charged biopolymers are reviewed in Section V.

In the polymer solution, which contains $n_2$ mol of polymer, the total molal concentration of solute 3 is

$$m_3^{(\alpha)} = m_1 \frac{n_3^{bulk} + B_3 n_2}{n_1^{bulk} + B_1 n_2} \tag{19}$$
The bulk concentration of solute component 3, $m_3^{bulk}$, is given by

$$m_3^{bulk} = m_1 \frac{n_3^{bulk}}{n_1^{bulk}} \tag{20}$$

A thermodynamic criterion for distinguishing the local from the bulk domain can be introduced. Consistent with the specification of the bulk domain given above, an ideal dialysis equilibrium condition is used to equate the molality of solute in the bulk domain to the value it would have in a polymer-free solution ($\beta$) in dialysis equilibrium with the polymer-containing solution ($\alpha$):

$$m_3^{bulk} = m_3^{(\beta)} \tag{21}$$

Therefore

$$\gamma_{3,2} = m_3^{bulk} / m_3^{(\alpha)} \tag{22}$$
According to Eq. (22), the contribution to nonideality of component 3
due to interactions with component 2 ($\gamma_{3,2}$) is accounted for entirely by the accumulation or exclusion (relative to the composition of the bulk solution) of component 3 in the local domain surrounding each polymer. The molal concentration of polymer in solution $\alpha$, which contains $n_2$ mol of polymer, is

$$m_2^{(\alpha)} = m_1 \frac{n_2}{n_1^{bulk} + B_1 n_2} \tag{23}$$

From Eqs. (3) and (21-23), the following two-domain expression for $\Gamma_{3,2}^{exp}$ is obtained:

$$\Gamma_{3,2}^{exp} = B_3 - B_1 \frac{m_3^{bulk}}{m_1} \tag{24}$$

If water is the solvent, $m_1$ = 55.5. Equation (24) is perfectly general with respect to the concentration of solute component 3, but the solution must be sufficiently dilute in polymer component 2 so that the osmotic pressure contribution to Eq. (2) can be neglected and so that a bulk polymer-free domain can exist. As $m_2 \to 0$,

$$\Gamma_{3,2} = B_3 - B_1 \frac{m_3}{m_1} \tag{25}$$
This expression formally is the same as presented by Timasheff and Inouye (1968) and Inouye and Timasheff (1972) for the nonelectrolyte case. A similar expression was derived by Schellman (1990) for a particular site-binding model of local solute-solvent exchange. Equation (25) may be recast to recognize explicitly the interdependence of $B_1$ and $B_3$ due to solute-solvent exchange in the local domain. Let $B_1^0$ be the maximum number of solvent molecules in the local domain of an isolated molecule of component 2. When $m_3 = 0$, $B_1 = B_1^0$, provided that solvation of component 2 is not increased by introducing solute component 3 into the solution. The positive exchange coefficient $S_{1,3} = (B_1^0 - B_1)/B_3$ is defined as the cumulative stoichiometry of solvent displaced per solute accumulated in the local domain:

$$\Gamma_{3,2} = B_3 \left( 1 + S_{1,3} \frac{m_3}{m_1} \right) - B_1^0 \frac{m_3}{m_1} \tag{26}$$
For a completely excluded solute, $B_3 = 0$ and $\Gamma_{3,2} = -B_1^0 (m_3/m_1)$. For
strongly excluded solutes (such as glycine betaine, reviewed below in Section IV,B), $B_3 = 0$ over at least some finite range of $m_3$, and hence $B_1^0/m_1$ can be evaluated directly from the slope of a linear plot of $\Gamma_{3,2}$ vs. $m_3$. For accumulated solutes under typical conditions, $B_1^0 m_3/m_1$ cannot be neglected and $B_3$ varies with $m_3$. Separate evaluation of $B_3$ and $B_1^0$ may still be feasible by applying Eq. (26) to experimental values of $\Gamma_{3,2}$ determined for the accumulated solute, but only if $B_3$ is not simply proportional to $m_3$ when $S_{1,3} m_3/m_1 \ll 1$. However, $m_3$-dependent values of $B_3$ for an accumulated solute can be determined most reliably via the two-domain model if $B_1^0$ has been evaluated independently for a completely excluded solute. An application of this approach to analyze the preferential interactions of the accumulated solute urea, with the value of $B_1^0$ determined for the excluded solute glycine betaine, is reviewed in Section IV,C.

In the context of the two-domain picture, further analysis of the $m_3$ dependence of $B_3$ requires additional model assumptions. An explicit parameterization of $\Gamma_{3,2}$ as a function of $m_3$ has been developed from a detailed model for preferential interactions assumed to arise from a competitive one-for-one (1 : 1) “interchange” of two different types of solvent species (Schellman, 1990). Either the primary solvent, component 1, or the (less abundant) “cosolvent,” component 3, is bound independently at sites that comprise the entire accessible surface of the highly dilute polymeric component 2. These sites need not all be thermodynamically equivalent (have the same relative affinity for components 1 and 3). In particular, some may be occupied only by the principal solvent. However, each site must cover the same amount of surface area on component 2, because the “independent binding” stipulated by this model requires that the identity of the species located at any site does not affect that of the species located at any other site.
Despite the presumed equality of site sizes, the model does not require that the individual molecules of components 1 and 3 have equal sizes. [Possible structural scenarios that meet this condition are shown in Fig. 1 of Schellman (1990).] In the simplest situation where the 1 : 1 interchange model can be applied, all N sites are thermodynamically equivalent (have the same affinity for component 3 relative to component 1), and any site occupied by the principal solvent in the total absence of cosolvent is open to occupancy by this component when it is added to the system. More generally, some subset ($N_1$) of the total number of sites (N) on the surface of component 2 that are occupied by the principal solvent (usually water) are not accessible to occupation by component 3 (Timasheff, 1992). If all of the remaining sites, here designated $N_{1,3}$, have the same binding
affinity for component 3 relative to water, then the dependence of $\Gamma_{3,2}$ on $m_3$ can be represented, according to the 1 : 1 interchange model, as

$$\Gamma_{3,2} = \frac{N_{1,3} (K - 1/55.5)\, m_3}{1 + K m_3} - \frac{N_1 m_3}{55.5} \tag{27}$$
Here the “practical” equilibrium quotient, $K$, expressed in units of reciprocal molality, generally has some dependence on solution composition, because it incorporates the stoichiometric quotient of activity coefficients. Note that, for consistency with the notation adopted in this chapter, some of the symbols in Eq. (27) differ from those appearing in expressions published previously [cf. Eqs. (7) and (12) of Schellman (1990), Eq. (21) of Timasheff (1992), and Eq. (6) of Zhang et al. (1996b)]. According to the 1:1 interchange model from which Eq. (27) was derived, in general $K$, $N_1$, and $N_{1,3}$ all may have a role in determining the sign and magnitude of $\Gamma_{3,2}$. At a given $m_3$, the minimum (most negative possible) value of $\Gamma_{3,2}$ is $-(N_{1,3} + N_1)m_3/m_1$, which corresponds to total exclusion of component 3, or an infinite preferential affinity for water (i.e., $K = 0$) at all sites on component 2. If component 3 is strongly but not totally excluded, $K$ may be small enough so that the total number of sites accessible to water ($N_{1,3} + N_1$) can be evaluated from the slope of a linear plot of $\Gamma_{3,2}$ vs. $m_3$. The affinity of component 3 relative to component 1 at the $N_{1,3}$ sites may be so strong in some systems that $K$ is much larger than both $1/m_1$ and $1/m_3$. Nevertheless, in this situation component 3 is preferentially accumulated ($\Gamma_{3,2} > 0$) only if the number of sites for which it has strong affinity is large enough ($N_{1,3} > N_1 m_3/m_1$). This requirement arises because $\Gamma_{3,2}$ is a measure of net preferential interactions with the entire accessible surface of component 2 (Timasheff, 1992).

Equations (26) and (27) are consistent if $B_3 = N_{1,3}Km_3/(1 + Km_3)$, $B_1^0 = N_{1,3} + N_1$, and $S_{1,3} = 1$. These equalities do not uniquely validate the 1:1 interchange model, because some form of exact mathematical correspondence must follow from the thermodynamic generality of the two-domain description. If the interchange is not 1:1 and/or if site occupancy is not strictly independent, $S_{1,3}$ may differ from 1 and depend on $m_3$. In any case, the expression for $\Gamma_{3,2}$ given by Eq. (26) is general, provided that the basic requirements for applicability of the two-domain model are fulfilled.
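The limiting cases described above can be checked numerically. The following is a minimal sketch, not the authors' calculation. Eq. (26) is not reproduced in this passage, so the sketch assumes the form $\Gamma_{3,2} = B_3(1 + S_{1,3}m_3/m_1) - B_1^0 m_3/m_1$, which reproduces both limits quoted in the text. The function name and all site numbers are hypothetical.

```python
# Sketch of the 1:1 interchange (two-domain) picture discussed above.
# ASSUMPTION: Eq. (26) is taken to have the form
#   Gamma32 = B3 * (1 + S13 * m3 / m1) - B1_0 * m3 / m1
# with B3 = N13*K*m3/(1 + K*m3) and B1_0 = N13 + N1 (consistency conditions
# quoted in the text). Site numbers used below are hypothetical.

M1_WATER = 55.51  # mol of water per kg of water (molality of pure water)


def gamma32_interchange(m3, K, n13, n1, s13=1.0, m1=M1_WATER):
    """Preferential interaction coefficient under the assumed Eq. (26) form."""
    b3 = n13 * K * m3 / (1.0 + K * m3)   # solute occupying exchangeable sites
    b1_0 = n13 + n1                      # water-accessible sites in total
    return b3 * (1.0 + s13 * m3 / m1) - b1_0 * m3 / m1


if __name__ == "__main__":
    # K = 0 is total exclusion; very large K is strong accumulation.
    for K in (0.0, 0.1, 1.0e6):
        print(K, gamma32_interchange(1.0, K, n13=100.0, n1=2600.0))
```

With $K = 0$ the function returns $-(N_{1,3} + N_1)m_3/m_1$, and for very large $K$ its sign is set by whether $N_{1,3}$ exceeds $N_1 m_3/m_1$, as stated in the text.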
B. Two-Domain Analysis of Interactions of Glycine Betaine with Bovine Serum Albumin (BSA): Preferential Exclusion
Glycine betaine, an osmoprotectant found at significant levels in many prokaryotic and eukaryotic cells, appears to be preferentially excluded from the vicinity of protein surfaces both in vitro (Arakawa and Timasheff, 1983; Zhang et al., 1996b) and in vivo (Cayley et al., 1992). In particular, densimetry (Arakawa and Timasheff, 1983) and vapor pressure osmometry (VPO; Zhang et al., 1996b) measurements demonstrate that glycine betaine is highly excluded in BSA solutions at all $m_3$ examined and that $\Gamma_{3,2}$ is directly proportional to glycine betaine concentration. As shown in Fig. 2, $\Gamma_{3,2} = -49(\pm 4)\,m_3$ for betaine concentrations in the range 0-1.6 M. From a two-domain interpretation of the VPO results summarized in Fig. 2, Zhang et al. (1996b) concluded that glycine betaine is completely excluded from the local domain of water surrounding BSA and that this local domain is composed of a monolayer of water on the protein surface, as indicated by comparisons with predictions of calculations based on structural models. For a completely excluded solute in water, the $B_3$ term in Eq. (26) is by definition zero, so that $\Gamma_{3,2}$ is proportional to $m_3$ with a proportionality constant of $-B_1^0/m_1$. If glycine betaine were completely excluded from BSA, the slope of the plot of $\Gamma_{3,2}$ vs. $m_3$ ($-49 \pm 4$) indicates that the hydration of BSA, $B_1^0$, is $2700 \pm 200$ mol H2O/mol protein (on a weight basis, $B_1^0 = 0.74 \pm 0.06$ g H2O/g protein). If exclusion of glycine betaine were incomplete, the true hydration would exceed this estimate, which lies above the average, but within the range, of values for protein hydration calculated from hydrodynamic measurements: 0.12-1.04 g H2O/g protein, with an average value of $0.53 \pm 0.26$ g H2O/g protein (Squire and Himmel, 1979). A previous study of protein hydration based on the preferential interactions of glycerol with proteins (Gekko and Timasheff, 1981) reports hydration numbers smaller than those deduced from hydrodynamic measurements, probably because glycerol is not completely excluded from the proteins considered in that study. Osmometric measurements indicate that glycerol is accumulated near BSA to a greater extent than is glycine betaine (E. Courtenay, unpublished results).

FIG. 2. Solute concentration dependences of preferential interaction coefficients defined by Eq. (1) for interactions of BSA with urea and glycine betaine, calculated from vapor pressure osmometry data. (Modified from Zhang et al., 1996b.)

To test the hypothesis of complete exclusion of glycine betaine and to provide a molecular picture of the local domain surrounding BSA, the thermodynamic extent of hydration of BSA may be compared with a prediction of its water-accessible surface area. From the observation (Janin, 1976; Teller, 1976; Miller et al., 1987) that water-accessible surface areas of native proteins increase as a power function of the protein molecular weight, Zhang et al. (1996b) estimated the water-accessible surface area of native BSA to be $2.0(\pm 0.1) \times 10^4$ Å². If the $2700 \pm 200$ H2O molecules of hydration (i.e., $B_1^0$) constitute a surface layer, the surface density of water molecules on BSA is $0.14 \pm 0.01$ H2O/Å². This result is plausible, as it lies between the densities predicted for hexagonal closest packing of 1.4 Å radius water molecules on the protein surface (~0.15 H2O per Å²) and for a less dense packing with the same cross-sectional area per water as in bulk water (0.11 H2O per Å²; Gill et al., 1985). The complete exclusion of glycine betaine from protein surfaces is in accord with the proposed osmoprotective mechanism of glycine betaine in E. coli K-12 cells, where the greater osmotic effect of uptake of glycine betaine suggests that it is more excluded from cytoplasmic biopolymer surfaces than are other osmolytes (Cayley et al., 1992). Most natural osmolytes and amino acids are found to be excluded from proteins to some degree (Liu and Bolen, 1995; Arakawa and Timasheff, 1983).
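The arithmetic behind the hydration estimate can be retraced in a few lines. This is an illustrative sketch only: the slope ($-49$) and the surface area ($2.0 \times 10^4$ Å²) are taken from the text, whereas the molality of pure water (55.51 mol/kg) and the BSA molar mass (about 66,400 g/mol) are assumed values that the text does not state. All function names are hypothetical.

```python
import math

# Values quoted in the text
SLOPE_GAMMA_VS_M3 = -49.0      # slope of Gamma_3,2 vs. m3 for glycine betaine-BSA
BSA_SURFACE_AREA = 2.0e4       # water-accessible surface area, square angstroms

# ASSUMED values (not given in the text)
M1_WATER = 55.51               # mol water per kg water
MW_WATER = 18.015              # g/mol
MW_BSA = 66400.0               # g/mol, approximate molar mass of BSA


def hydration_from_slope(slope, m1=M1_WATER):
    """Complete exclusion: Gamma_3,2 = -(B1_0/m1) * m3, so B1_0 = -slope * m1."""
    return -slope * m1


def hydration_mass_basis(b1_0, protein_molar_mass):
    """Convert mol H2O per mol protein to g H2O per g protein."""
    return b1_0 * MW_WATER / protein_molar_mass


def hcp_surface_density(radius):
    """Number density of hexagonally close-packed circles of the given radius."""
    return 1.0 / (2.0 * math.sqrt(3.0) * radius ** 2)


b1_0 = hydration_from_slope(SLOPE_GAMMA_VS_M3)        # mol H2O / mol BSA
b1_mass = hydration_mass_basis(b1_0, MW_BSA)          # g H2O / g BSA
density = b1_0 / BSA_SURFACE_AREA                     # H2O per square angstrom
hcp = hcp_surface_density(1.4)                        # 1.4 angstrom water radius

if __name__ == "__main__":
    print(round(b1_0), round(b1_mass, 2), round(density, 3), round(hcp, 3))
```

Under these assumptions the computed values fall within the uncertainties quoted in the text for $B_1^0$, for the weight-basis hydration, and for the surface density.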
C. Two-Domain Analysis of Urea-BSA Interactions: Preferential Accumulation
Urea, an osmolyte in some organisms (Lin and Timasheff, 1994), is widely used as a protein denaturant at high concentrations. The destabilizing effect of urea on protein structure generally is thought to be due to its preferential accumulation in the vicinity of amides and other polar functional groups, more of which are exposed to the solution when the protein is denatured (Simpson and Kauzmann, 1953; Schellman, 1990; Liepinsh and Otting, 1994). Studies by various experimental methods indicate that the preferential interactions of most (uncharged) solutes with proteins are intermediate between those exhibited by glycine betaine and urea (Arakawa and Timasheff, 1983; E. Courtenay, unpublished). As in the case of glycine betaine, values of $\Gamma_{3,2}$ for urea from VPO measurements shown in Fig. 2 are proportional to urea concentration in the range examined [...]

[...] 1, $S_a \to (\xi - 1)^2$, so that Eq. (30) acquires the simple form of Eq. (29), wherein $\Gamma_{3,u}$ is directly proportional to $b$ and independent of $a$ (Anderson and Record, 1980). At very high salt concentrations the contact concentrations of both cations and anions approach the bulk molar salt concentration $C_3$, so that $S_a \to 4\xi v_u C_3$ and $\Gamma_{3,u} \to -C_3 v_u$. Thus for this case also $\Gamma_{3,u}$ at sufficiently high $C_3$ is directly proportional to $b$, but now it has a quadratic dependence on $a$ as well. In the experimental salt concentration range, $C_3 v_u$ contributes significantly to $\Gamma_{3,u}$, but becomes dominant only at high salt concentrations. As calculated from solutions of the cylindrical PB equation, the part of $\Gamma_{3,u}$ that remains after the $C_3 v_u$ contribution is separated out also depends significantly on $C_3$.

The accuracy of the PB equation as a basis for calculating thermodynamic effects due to the interactions of salt ions with ds DNA is demonstrated in Fig. 3. Donnan data for $\Gamma_{3,u}$ are compared with the corresponding results of PB calculations for a model cylindrical polyion with the axial charge density and radius of ds DNA. In general, for a high-charge density ($\xi > 1$) cylindrical polyion either PB or MC calculations of the dependence of $\Gamma_{3,u}$ on $\xi$ and on salt concentration can be represented by the expression
where in a PB analysis the salt concentration-dependent correction to limiting law (low salt) behavior is defined as the difference between $S_a$ at a specified salt concentration and its low-salt (limiting law) value [$S_a^{LL} = (\xi - 1)^2$].

Among the theoretical methods currently available to calculate $\Gamma_{3,u}$ for a given model of a nucleic acid solution, the two most frequently used are based either on some form of the PB equation (Stigter, 1975, for example) or on grand canonical Monte Carlo (GCMC) simulations (Mills et al., 1986; Vlachy and Haymet, 1986). Four distinct approaches have been taken to evaluate $\Gamma_{3,u}$ from solutions of the PB equation and/or MC simulations. For an isolated cylindrical polyion the second virial coefficient can be evaluated by integrating ion distributions calculated either from MC simulations (H. Ni, unpublished) or from numerical solutions of the cylindrical PB equation (Stigter, 1975). An analytic relationship derived from the PB cell model for a cylindrical polyion by Anderson and Record (1980) can be differentiated directly, under the constraints of constant electrolyte activity and temperature, to obtain an expression for $\Gamma_{3,u}^{C} = (\partial C_3/\partial C_u)_{a_\pm,T}$, where $C_u$ is the molar concentration of polyion monomer (Anderson and Record, 1983). Alternatively, under the constraints of constant $a_\pm$ and $T$ [and if variations in pressure are negligible (Anderson and Record, 1993)], ion distribution functions generated by solving the cylindrical PB equation can be integrated numerically to determine $C_3$ as a function of $C_u$. At sufficiently low $C_u$, where this functionality becomes linear, $\Gamma_{3,u}^{C}$ can be evaluated directly as the slope of a plot of $C_3$ vs. $C_u$ (Mills et al., 1986). For model systems represented with varying degrees of structural detail, the excess electrostatic free energy attributable to the interactions of mobile salt ions with the multiple charges on some rigid model nucleic acid can be evaluated by appropriate numerical integrations and then differentiated (again numerically) with respect to salt concentration to obtain a function that is approximately equivalent to $\Gamma_{3,u}$ (Misra et al., 1994a,b; Sharp et al., 1995).

All calculations based on the PB equation, for any model, are potentially subject to error insofar as certain interionic correlations are neglected a priori in the statistical mechanical formulation of this equation. At sufficiently high dilutions of salt solutions containing cylindrical polyions, this “PB approximation” does become exact (Fixman, 1979). Its accuracy over the usual experimental range of concentrations can be assessed best from a theoretical standpoint by comparisons with the results of MC simulations for systems modeled using the same assumptions. The accuracy of the PB approximation and of the underlying model assumptions for which the PB equation is solved varies considerably with the system, the solution conditions, and the property to be calculated (Anderson and Record, 1990, 1995; H. Ni, unpublished). The ultimate criterion for the reliability of any theoretical calculation must be based on comparisons with experimental measurements.
The acquisition of data appropriate for such comparisons merits high priority, because it will advance our understanding of the physical factors that govern the thermodynamic consequences of preferential interactions.

Mills et al. (1986) and Olmsted et al. (1989, 1991, 1995) developed methods based on GCMC simulations to calculate $\Gamma_{3,u}$ for cylindrical models of polymeric and oligomeric electrolytes. By analogy to the third way of calculating $\Gamma_{3,u}$ described above, $\Gamma_{3,u}^{C} = (\partial C_3/\partial C_u)$ can be evaluated as the slope of a plot of $C_3$ vs. $C_u$, as determined by an appropriate series of GCMC simulations. For any model that neglects the molecularity of the solvent, neither osmolarity nor pressure can be explicitly controlled while $C_u$ is varied. Thus, certain approximations are entailed in the identification of $\Gamma_{3,u}^{C}$ with $\Gamma_{3,u}$. These have been discussed in detail by Anderson and Record (1993), who examined the thermodynamic foundation for the use of GCMC simulations as a means of calculating preferential interaction coefficients for a three-component poly-(or oligo-)electrolyte-salt solution, modeled according to the standard set of assumptions, which neglect the molecularity of solvent water.

The GCMC simulations reported by Olmsted et al. (1989) for a homologous series of oligoelectrolytes (model DNA oligomers, designated N-mers, having $|Z_J| = N$ charges) predict that the positive quantity $-\Gamma_{3,u}(N)$ (the negative of the preferential interaction coefficient of the N-mer expressed per oligomer charge) decreases with increasing $N$. For large enough values of $N$, $\Gamma_{3,u}(N)$ approaches the corresponding polyelectrolyte value [here designated $\Gamma_{3,u}(\infty)$] as a linear function of $N^{-1}$, as shown in Fig. 4 for simulations at two salt activities (1.8 and 12.3 mM):

Here the N-independent parameter $\alpha$ depends on $a_\pm$ and on the structural characteristics common to each (model) N-mer in the series (Olmsted et al., 1991, 1995). The magnitude of $\alpha$ is a measure of the oligo “end effect” on $\Gamma_{3,u}(N)$ (expressed per monomer), which causes deviations from the value characteristic of the corresponding polymer.
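The linear dependence on $1/N$ described above suggests a simple way to extrapolate oligomer results to the polymer limit. The sketch below fits $-\Gamma_{3,u}(N)$ against $1/N$ by ordinary least squares and reads the polyelectrolyte value from the intercept. The data are synthetic and purely illustrative (they are not the simulation results of Olmsted et al.), and the function name is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# SYNTHETIC data: -Gamma(N) generated from an assumed intercept (0.12)
# and an assumed end-effect slope (2.0); neither value comes from the text.
oligomer_lengths = [24, 36, 48, 72]
neg_gamma = [0.12 + 2.0 / n for n in oligomer_lengths]

inverse_lengths = [1.0 / n for n in oligomer_lengths]
slope, intercept = fit_line(inverse_lengths, neg_gamma)

if __name__ == "__main__":
    # The intercept estimates -Gamma(infinity); the slope measures the end effect.
    print(slope, intercept)
```

Fitting in $1/N$ rather than $N$ makes the polymer limit the intercept, so no separate extrapolation step is needed.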
FIG. 4. Grand canonical Monte Carlo predictions of the per-charge preferential interaction coefficient [plotted as $-\Gamma_{3,u}(N)$, where $N = |Z_J|$] of model B-DNA oligomers in univalent salt, as a function of $1/N$ at two salt activities: $a_\pm$ = 1.8 mM (O), and $a_\pm$ = 12.3 mM (0). The solid lines were obtained by linear regression. (Modified from Olmsted et al., 1995.)
As noted in Section III, values of $\Gamma_{3,2}$ for interactions of nonelectrolyte solutes with weakly charged proteins are observed to be proportional to solute concentration and hence approach zero as $m_3 \to 0$. In contrast, experimental observations indicate that $\Gamma_{3,u}$, as predicted by Eqs. (29)-(31), approaches a nonzero limiting law value as the salt concentration is reduced, and that at higher salt concentrations it exhibits a moderate dependence on electrolyte concentration that is more complicated than simple proportionality. In marked contrast, values of $\Gamma_{3,2}$ for “Hofmeister-like” interactions of electrolytes with weakly charged protein polyampholytes (discussed further in the following subsection) generally are found to be directly proportional to the electrolyte concentration, like the coefficients that characterize the preferential interactions of uncharged solutes with proteins.
C. Two-Domain Interpretation of Single-Ion and Salt-Component Preferential Interaction Coefficients for Interactions of Electrolytes with Weakly Charged Biopolymers (Hofmeister Salt Effects)

Using a two-domain (local-bulk) model of electrolyte-polymer preferential interactions, Record and Anderson (1995) interpreted the single-ion Donnan distribution coefficients $\Gamma_{-,J}$ and $\Gamma_{+,J}$ defined by Eqs. (15) and (16). The number of moles of cations ($B_+$) and anions ($B_-$) associated per mole of charged polymer in the two-domain model each are related to differences between total and bulk quantities:

$$B_+ = \frac{n_{+(\alpha)}^{total} - n_{+(\alpha)}^{bulk}}{n_2} = \frac{m_{+(\alpha)}^{total}}{m_2} - \frac{n_{+(\alpha)}^{bulk}}{n_2}$$

and

$$B_- = \frac{n_{-(\alpha)}^{total} - n_{-(\alpha)}^{bulk}}{n_2} = \frac{m_{-(\alpha)}^{total}}{m_2} - \frac{n_{-(\alpha)}^{bulk}}{n_2} \qquad (33)$$

where, by analogy to Eqs. (20) and (23), $n_{+(\alpha)}^{bulk} = n_{1(\alpha)}^{bulk} m_{+(\alpha)}^{bulk}/m_1$, $n_{-(\alpha)}^{bulk} = n_{1(\alpha)}^{bulk} m_{-(\alpha)}^{bulk}/m_1$, and $n_2 = n_{1(\alpha)}^{bulk} m_2 (m_1 - B_1 m_2)^{-1}$. For the nonelectrolyte case, an ideal dialysis equilibrium condition was used to equate the molality of solute in the bulk domain to its molality in a polymer-free solution ($\beta$) in dialysis equilibrium with the polymer-containing solution ($\alpha$) [cf. Eq. (21) above]. This analogy was extended to charged solutes in order to relate ion concentrations in the bulk phase in $\alpha$ and in the “reference” solution ($\beta$) by using an ideal Donnan dialysis equilibrium condition (i.e., $\gamma_{\pm(\alpha)}^{bulk} = \gamma_{\pm(\beta)}$) [cf. Eqs. (10)-(12)]:

$$(m_{3(\beta)})^2 = (m_{+(\alpha)}^{bulk})(m_{-(\alpha)}^{bulk}) \qquad (34)$$

By incorporation of this thermodynamic condition into the two-domain model, the contribution to nonideality from electrolyte-polymer interactions [cf. Eq. (12)] is interpreted entirely in terms of the extent of accumulation of salt ions in the local domain surrounding the protein. Application of the electroneutrality condition to the entire solution ($\alpha$), but not to the local or bulk domains individually, and expansion of the square root of Eq. (34) for the condition of excess salt yield, in the limit of low $m_2$ (where $m_{+(\alpha)}^{bulk} = m_{-(\alpha)}^{bulk} = m_3$),

$$\Gamma_{+,J} = 0.5(|Z_J| + B_- + B_+) - B_1 m_3/m_1$$
$$\Gamma_{-,J} = -0.5(|Z_J| - B_- - B_+) - B_1 m_3/m_1 = \Gamma_{3,2J} \qquad (35)$$

Equations (35) demonstrate that the thermodynamic two-domain model of ion-polymer nonideality describes the difference between each single-ion preferential interaction coefficient and its “ideal” value in terms of the following: (1) the modification of the polymer structural charge by the net thermodynamic effect of accumulation (exclusion) of salt anions and cations; and (2) a hydration term, which contributes significantly only at moderate to high salt concentrations. The molecular picture implied by this two-domain expression, while certainly not literally descriptive of the long-range concentration gradients that characterize interactions of salt ions with highly charged polyions at low salt concentration, should be appropriate for short-range interactions between ions and biopolymers such as those likely to be responsible for Hofmeister salt effects at high salt concentration. The sum of single-ion preferential interaction coefficients for cation and anion is independent of $|Z_J|$:

$$\Gamma_{+,J} + \Gamma_{-,J} = B_+ + B_- - 2B_1 m_3/m_1 \qquad (36)$$

Both $\Gamma_{+,J}$ and $\Gamma_{-,J}$, as well as their sum, depend on both $B_+$ and $B_-$. As an interesting special case, potentially applicable to the treatment of Hofmeister salt effects on proteins near their isoelectric point, the protein may be uncharged in the absence of the salt ($|Z_J| = 0$) but the cation and anion of the salt may interact differently with the protein, so that $B_+ \neq B_-$. Equation (35) for this case becomes

$$\Gamma_{+,J} = \Gamma_{-,J} = 0.5(B_+ + B_-) - B_1 m_3/m_1 \qquad (37)$$

Since Eq. (36) is independent of $|Z_J|$, its form is unaffected if $|Z_J| = 0$.
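The structure of Eqs. (35) and (36) can be illustrated with a short numerical sketch. The function below evaluates the two single-ion coefficients from the two-domain parameters as they read in Eq. (35). The function name and every input value are hypothetical and chosen only to show that the difference of the two coefficients equals $|Z_J|$ while their sum is independent of it.

```python
M1_WATER = 55.51  # mol water per kg water (assumed value for m1)


def single_ion_coefficients(z_abs, b_plus, b_minus, b1, m3, m1=M1_WATER):
    """Return (Gamma_plus, Gamma_minus) from the two-domain form of Eq. (35)."""
    hydration_term = b1 * m3 / m1
    gamma_plus = 0.5 * (z_abs + b_minus + b_plus) - hydration_term
    gamma_minus = -0.5 * (z_abs - b_minus - b_plus) - hydration_term
    return gamma_plus, gamma_minus


# HYPOTHETICAL parameters: a polymer with 10 charges, 6 associated cations,
# 1 associated anion, and 500 waters of hydration, in 0.5 molal salt.
gp, gm = single_ion_coefficients(z_abs=10.0, b_plus=6.0, b_minus=1.0,
                                 b1=500.0, m3=0.5)

# Uncharged special case (|Z_J| = 0): both coefficients coincide.
gp0, gm0 = single_ion_coefficients(z_abs=0.0, b_plus=6.0, b_minus=1.0,
                                   b1=500.0, m3=0.5)

if __name__ == "__main__":
    print(gp, gm, gp - gm, gp + gm)
    print(gp0, gm0)
```

The sum `gp + gm` reproduces the right-hand side of Eq. (36) for any value of `z_abs`.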
VI. USE OF THREE-COMPONENT PREFERENTIAL INTERACTION COEFFICIENTS TO ANALYZE EFFECTS OF SOLUTE CONCENTRATION ON EQUILIBRIUM CONSTANTS, TRANSITION TEMPERATURES, OR FREE ENERGY CHANGES OF BIOPOLYMER PROCESSES
A. Thermodynamic Fundamentals

In a solution containing only two solutes (designated 2 and 3) the following partial derivatives or combinations thereof, all taken at constant temperature and pressure, are equal to $\Gamma_{3,2}$:

Equation (38) relates the solute-biopolymer preferential interaction coefficient directly to the dependence of the thermodynamic nonideality of the biopolymer (component 2) on the activity of the solute (component 3). Based on Eq. (38), evaluated at sufficiently low $m_2$, the effects of solute concentration on many biopolymer processes may be analyzed quantitatively, subject to certain approximations, generally valid when $m_3 \gg m_2$, as discussed in detail by Anderson and Record (1993).

If components 2 and 3 are, respectively, a biopolyelectrolyte or polyampholyte and an electrolyte, the activities $a_2$ and $a_3$ refer to the corresponding electroneutral solute components. In particular, if component 2 consists of a polyanion J with $|Z_J|$ charges and its charge-balancing complement of univalent cations, M+, then $a_2$ and $\gamma_2$ in Eq. (38) can be expressed by $a_{2J} = a_{M^+}^{|Z_J|} a_J$ and $\gamma_{2J} = \gamma_{M^+}^{|Z_J|} \gamma_J$, respectively. Here the single-ion activity of the polyion is $a_J = \gamma_J m_J = \gamma_J m_{2J}$ and that of its counterion is $a_{M^+} = \gamma_{M^+}(m_3 + |Z_J| m_J)$. If component 3 is a 1:1 electrolyte (M+X-) then

$$a_3 = a_{M^+} a_{X^-} = a_\pm^2 = \gamma_\pm^2 m_3 (m_3 + |Z_J| m_J) \qquad (39)$$
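As a small numerical illustration of Eq. (39), the sketch below evaluates the salt activity and the mean ionic activity for a 1:1 salt in the presence of a polyanion, and shows how little the polyion counterions matter when the salt is in large excess. The activity coefficient, concentrations, and function names are hypothetical values chosen for illustration.

```python
import math


def salt_activity(m3, m_j, z_abs, gamma_pm):
    """a3 = gamma_pm^2 * m3 * (m3 + |Z_J| * m_J), following Eq. (39)."""
    return gamma_pm ** 2 * m3 * (m3 + z_abs * m_j)


def mean_ionic_activity(m3, m_j, z_abs, gamma_pm):
    """a_pm is the square root of a3 for a 1:1 salt."""
    return math.sqrt(salt_activity(m3, m_j, z_abs, gamma_pm))


# HYPOTHETICAL conditions: 0.1 molal salt, 1e-5 molal polyanion with 20 charges,
# and an assumed mean ionic activity coefficient of 0.78.
a_pm_with_polyion = mean_ionic_activity(0.1, 1.0e-5, 20.0, 0.78)
a_pm_salt_only = mean_ionic_activity(0.1, 0.0, 20.0, 0.78)
relative_shift = (a_pm_with_polyion - a_pm_salt_only) / a_pm_salt_only

if __name__ == "__main__":
    print(a_pm_with_polyion, a_pm_salt_only, relative_shift)
```

Because $a_3 = a_\pm^2$, a change in $\ln a_3$ is twice the corresponding change in $\ln a_\pm$, which is the relation used later when electrolyte and nonelectrolyte expressions are compared.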
B. Definition of the Experimentally Observable Equilibrium Constant $K_{obs}$ and Its Dependence on Solute Activity for Macromolecular Associations

To avoid the complexity of a more generalized notation, the analysis reviewed here addresses explicitly effects of excess solute concentration on the thermodynamics of an association process (involving one or more biopolymers) of the form

A + B → AB

The following analysis of this process exhibits all the essential features encountered in analyzing solute effects on any type of biopolymer process, including conformational changes, multimerization, and dissolving or precipitating the biopolymer. For the corresponding association process at equilibrium,

The thermodynamic equilibrium constant $K_{eq}$ is obtained from the equilibrium condition $\mu_{AB}^{eq} = \mu_A^{eq} + \mu_B^{eq}$ and defined as $K_{eq} = (a_{AB}/a_A a_B)^{eq}$, a function only of temperature and pressure. The experimentally accessible equilibrium quotient $K_{obs}$ is defined in terms of the molalities of the reactants and products:

Experimental values of $K_{obs}$ typically are reported as quotients of the corresponding molarities. However, under the conditions of interest here and at this stage of the derivation, the distinction between molality and molarity can be ignored, not simply because of approximate numerical equality, but for reasons explained in detail elsewhere (Anderson and Record, 1993). The quotient $K_{obs}$ is related to $K_{eq}$ by the corresponding quotient of activity coefficients $K_\gamma$:

Under typical in vitro (and, in some cases, in vivo) conditions, the concentrations of A, B, and AB are all sufficiently low, and the concentration of the perturbing solute 3 is sufficiently greater than those of A, B, and AB, so that the only significant contributions to the nonideality of these species, as reflected by $\gamma_A$, $\gamma_B$, $\gamma_{AB}$, are due to various kinds of interactions of A, B, and AB with ions or molecules of the perturbing solute. Therefore, three-component preferential interaction coefficients of the type defined in Eq. (1) can be used to characterize the dependences of $\gamma_A$, $\gamma_B$, and $\gamma_{AB}$, and hence of $K_\gamma$, on the solute activity (Anderson and Record, 1993, 1998). This dependence can be represented by the derivative:

$$(\partial \ln K_{obs}/\partial \ln a_3)_{\{m_2\}} = -(\partial \ln K_\gamma/\partial \ln a_3)_{\{m_2\}} \qquad (42)$$

As elsewhere in this chapter, the subscripts $T$ and $P$ are omitted to simplify notation and the subscript $\{m_2\}$ indicates that $m_A$, $m_B$, and $m_{AB}$ can be regarded as fixed when the dependences of $K_{obs}$ (and $K_\gamma$) on $a_3$ are determined. This approximation can be justified, for example, when the perturbing solute is present in sufficient excess so that the actual changes in $m_A$, $m_B$, and $m_{AB}$ that must accompany a shift in the complexation equilibrium have no significant effect on $\gamma_A$, $\gamma_B$, and $\gamma_{AB}$ or on the activity coefficient of the perturbing solute (Anderson and Record, 1993). For a cooperative macromolecular conformational change, the derivative that expresses the solute-concentration dependence of $K_{obs}$ can be related, using the Gibbs-Helmholtz equation, to the derivative that expresses the effect of solute concentration on the transition temperature $T_m$ of that conformational change (as explained, e.g., by Privalov et al., 1969):
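$$\left(\frac{\partial \ln K_{obs}}{\partial \ln a_3}\right)_{T = T_m} = -\frac{\Delta H_{obs}^{\circ}}{R T_m^2}\,\frac{dT_m}{d \ln a_3} \qquad (43)$$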
where $\Delta H_{obs}^{\circ}$ is evaluated at $T_m$. For at least some biopolymer transitions (e.g., protein folding), $\Delta H_{obs}^{\circ}$ is a strong function of temperature. At $T_m$,

$$\Delta H_{obs}^{\circ} = \Delta C_{p,obs}^{\circ}(T_m - T_H) \qquad (44)$$

Here $T_H$ is the characteristic temperature where $\Delta H_{obs}^{\circ} = 0$ for the process (Baldwin, 1986; Schellman, 1987), and the heat capacity change $\Delta C_{p,obs}^{\circ}$ is assumed to be independent of temperature. Both $\Delta C_{p,obs}^{\circ}$ and $T_H$ may be functions of $a_3$.
C. Effects of Changing the Concentration of an Uncharged Solute on Equilibria of Uncharged Biopolymers

Changes in the equilibrium extent of conversion of uncharged biopolymer reactants to products (i.e., in $K_{obs}$) that are driven by changes in the concentration of an uncharged perturbing solute can be analyzed directly in terms of the appropriate solute-biopolymer preferential interaction coefficients. To establish this relationship, the final expression for $\Gamma_{3,2}$ in Eq. (38) is applied to each biopolymer activity coefficient in $K_\gamma$, as shown in Eq. (42):

$$(\partial \ln K_{obs}/\partial \ln a_3)_{\{m_2\}} = \Delta\Gamma_{3,2} \qquad (45)$$

Here $\Delta\Gamma_{3,2}$ is the stoichiometrically weighted difference between values of $\Gamma_{3,2}$ for products and reactants. When solute component 3 is in sufficient excess so that all macromolecular activity coefficients ($\gamma_2$) depend on $m_3$ but on none of the $m_2$, Eq. (45) implies that three-component preferential interaction coefficients of the type defined by Eqs. (1) and (38) suffice as a basis for the analysis of the dependence of $K_{obs}$ on $a_3$ for an equilibrating system involving four (or more) components. A version of Eq. (45) (expressed in terms of preferential interaction coefficients represented in a different form) was presented [as Eq. (7.5)] by Wyman (1964). For a conformational transition of an uncharged biopolymer, the dependence of $T_m$ on the concentration of an uncharged solute can be deduced from Eqs. (43)-(45),
D. Effects of Changing the Concentration of an Electrolyte Solute on Equilibria of Charged Biopolymers

For situations where some or all of the reactant(s) and product(s) are charged polymers (either polyelectrolytes or polyampholytes) and/or where component 3 is an electrolyte, the form of Eq. (45) is not applicable to describe the effects of an electrolyte component 3 on $K_{obs}$ for every type of biopolymer equilibrium. Each of the preferential interaction coefficients of the type defined in Eq. (38) can be represented more explicitly by the following equation:

This expression, which follows from Eq. (39) and the analogous expression for $a_{2J}$, incorporates an approximation, $a_+ \approx a_\pm$, expected to be reasonable in excess 1:1 electrolyte ($m_3 \gg m_2$). By substituting Eq. (47) for each single-ion activity coefficient derivative $(\partial \ln \gamma_J/\partial \ln a_\pm)$ in the derivative $(\partial \ln K_\gamma/\partial \ln a_\pm)_{\{m_2\}}$ we obtain

$$(\partial \ln K_{obs}/\partial \ln a_\pm)_{\{m_2\}} = \Delta(|Z_J| + 2\Gamma_{3,2J}) \qquad (48)$$

The derivation of Eq. (48) is expedited by the approximation embedded in Eq. (47), but no assumption about single-ion activity coefficients is required for the rigorous derivation of Eq. (48) (Anderson and Record, 1993). The preferential interaction coefficients for electroneutral components that appear in Eq. (48) can be calculated from GCMC simulations (Mills et al., 1986); in some cases they can be measured by appropriate methods such as osmometry (Zhang et al., 1996a). For typical experimental conditions, where $m_3 \gg m_2$ and where $\gamma_{2J}$ is a function of $m_3$ but not of any of the $m_{2J}$, Eq. (48) generally is applicable for the analysis and interpretation of effects of salt concentration on any process involving charged molecules, including polyelectrolyte, Hofmeister, and osmotic effects. With minor modification, an analogous equation can be derived to describe ionic strength salt effects on equilibria involving only charged solutes of low molar mass.

For the situation where $\Delta|Z_J| = 0$, as when all reactants and products are uncharged, or more generally when the sum of $|Z_J|$ for products is equal to the sum of $|Z_J|$ for reactants, Eq. (48) reduces to Eq. (45) because $d \ln a_3 = 2\,d \ln a_\pm$ for a 1:1 salt. For example, for DNA helix formation from two complementary strands, $\Delta|Z_J| = 0$. However, for binding of an oligocation ($L^{z+}$) to a DNA polyanion, $\Delta|Z_J| \neq 0$, and in this case Eq. (48) must be used to analyze effects of salt activity (or concentration) on $K_{obs}$.

The relationship between Eq. (48), for the effects of a charged solute on equilibria involving charged polymers, and Eq. (45), for the nonelectrolyte case, is further clarified by transforming Eq. (48) using single-ion preferential interaction coefficients (Record and Richey, 1988; Record and Anderson, 1995). Incorporation of $\Gamma_{+,J}$ and $\Gamma_{-,J}$ from Eq. (17) into Eq. (48) yields

$$(\partial \ln K_{obs}/\partial \ln a_\pm)_{\{m_2\}} = \Delta(\Gamma_{+,J} + \Gamma_{-,J}) \qquad (49)$$

For comparison, the nonelectrolyte expression [Eq. (45)] is $(\partial \ln K_{obs}/\partial \ln a_3)_{\{m_2\}} = \Delta\Gamma_{3,2}$. Changing the concentration of an electrolyte of course changes both cation and anion concentrations, and Eq. (49) indicates that the resulting effect on $K_{obs}$ can be expressed as a stoichiometric combination of contributions, each being a sum of cation and anion single-ion preferential interaction coefficients. Corresponding to Eqs. (43) and (48) or (49), the effect of mean ionic activity $a_\pm$ on $T_m$ of a biopolymer conformational change involving charged biopolymers can be expressed as

The functional form of Eq. (50) is directly applicable to analyses of effects of salt concentration (≤0.1 M) on conformational transitions of nucleic acids and other polyelectrolytes, whereas that of Eq. (51) appears to be directly applicable to analyses of Hofmeister effects of salts (≥0.1 M) on conformational transitions of biopolymers.
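To make the nonelectrolyte expression quoted above [Eq. (45)] more concrete, the following sketch integrates it numerically for a case in which $\Delta\Gamma_{3,2}$ is proportional to solute molality, as observed for the solute-protein systems discussed in Section III. Two assumptions are made that the text does not supply: the solute is treated as ideal, so that $d \ln a_3 = d \ln m_3$, and the proportionality constant ($-5$ per molal) is hypothetical. Under these assumptions $\ln K_{obs}$ changes linearly with $m_3$.

```python
import math


def ln_kobs_shift(delta_gamma, m3_final, m3_start=1.0e-6, steps=1000):
    """Integrate d ln Kobs = delta_gamma(m3) d ln a3 by the trapezoid rule.

    ASSUMPTION: ideal solute, so ln a3 is replaced by ln m3.
    """
    lo, hi = math.log(m3_start), math.log(m3_final)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x0 = lo + k * h
        x1 = x0 + h
        total += 0.5 * (delta_gamma(math.exp(x0)) + delta_gamma(math.exp(x1))) * h
    return total


SLOPE = -5.0  # HYPOTHETICAL: delta Gamma_3,2 = SLOPE * m3


def delta_gamma_linear(m3):
    """Difference in preferential interaction coefficients, products minus reactants."""
    return SLOPE * m3


shift = ln_kobs_shift(delta_gamma_linear, m3_final=1.0)
analytic = SLOPE * (1.0 - 1.0e-6)   # closed form for the linear case

if __name__ == "__main__":
    print(shift, analytic, math.exp(shift))
```

A negative $\Delta\Gamma_{3,2}$, as in this hypothetical case, lowers $K_{obs}$ as the solute concentration is raised; a positive value would raise it.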
E. Description of Solute Concentration Effects on $\Delta G$ for a Macromolecular Process

The relevance of in vitro investigations of solute effects on equilibria to in vivo processes that are not at equilibrium has occasionally been questioned. Some cellular processes occurring at constant $P$ and $T$ are not at equilibrium with regard to reactant and/or product concentrations, but rather are in a steady state in which the concentration of each of the reactants and products involved in one process is maintained at a nonequilibrium level by other cell processes. In principle this condition can also be achieved in an in vitro reactor which adds reactants and removes products. At constant $P$ and $T$, the free energy difference between initial and final states of the system determines whether a process converting the initial to the final state is favorable (i.e., is spontaneous and can be harnessed to do useful work) or unfavorable (i.e., requires coupling to a favorable process in order to occur). For the isothermal, isobaric association process

A + B → AB

the free energy change $\Delta G$ is given by the difference in chemical potentials

$$\Delta G = \mu_{AB} - \mu_A - \mu_B \qquad (52)$$

where (for $i$ = A, B, or AB) $\mu_i = \mu_i^{\circ} + RT \ln \gamma_i m_i$. On the basis of Eq. (41), Eq. (52) can be expressed as

In Eq. (53), $Q_{obs}$ and $Q_\gamma$ represent stoichiometrically weighted quotients of product and reactant concentrations and of activity coefficients, respectively, in the specified nonequilibrium steady state of the reacting system:

and $K_{obs}$ and $K_\gamma$ represent the corresponding ratios at equilibrium. In an in vitro reactor, $Q_{obs}$ is not a function of the concentration of a perturbing solute. (Each of the reactant and product molalities appearing in the definition of $Q_{obs}$ can be fixed at any arbitrary value, either in the presence or absence of the perturbing solute.) In contrast, from Eqs. (45) and (48) $K_{obs}$ is a function of the concentration of the perturbing solute. If the process A + B → AB occurs under conditions of excess perturbing solute where the activity coefficients in $Q_\gamma$ and $K_\gamma$ depend on the activity of this solute ($a_3$) but not on the concentrations of reactants or products (here A, B, or AB), then

Therefore, for an uncharged solute,
If the solute is an electrolyte, an analogous expression relates derivatives with respect to $\ln a_\pm$ of $\Delta G$ and $\ln K_{obs}$ to $\Delta(|Z_J| + 2\Gamma_{3,2J})$. Consequently, for the system and conditions considered here, changes in solute concentration that would shift the position of equilibrium (and hence change $K_{obs}$) have the same effect on the thermodynamics ($\Delta G$) for the corresponding process under the nonequilibrium, steady state conditions.

VII. TWO-DOMAIN PREDICTIONS OF FUNCTIONAL FORMS OF EFFECTS OF NONELECTROLYTE CONCENTRATION ON EQUILIBRIA (K [...]

[...] $(\partial m_3/\partial m_2)_{T,\mu_1,\mu_3}$. The effective site occupancy numbers, $B_1$ and $B_3$ [$\nu_1$ and $\nu_3$ in Tanford's (1969) notation], must not be equated with numbers of water and cosolvent molecules actually in contact with the protein (or with a particular patch on a protein, such as an active site on an enzyme or the locus of the binding of a ligand). Equations (11) and (12) state that $B_3$ and $B_1$ stem from the summation of a broad spectrum of thermodynamic perturbations of cosolvent and water molecules by the protein. These may range from actual site occupancy with its exchange free energy, or a strong repulsion from a locus on the protein surface, to a momentary perturbation of the rotational or translational motions of solvent molecules which never come into contact with the protein surface (Lee et al., 1979). Each of these events makes a contribution to the free energy of interaction, i.e., for each solvent molecule, $i$ [...]. Each, when multiplied by [...], turns in a value of $B_3$ or $B_1$ for that molecule, which may be a small fractional number. By Eq. (11), the parameters $B_3$ and $B_1$ of Eq. (12) are summations of all of these numbers and, as such, are only descriptive parameters.

Preferential binding is an expression of the thermodynamic interactions of the protein with solvent components. It is measured by equilibrium techniques such as equilibrium dialysis (Scatchard, 1946), isopiestic equilibrium (Hade and Tanford, 1967), light scattering (Katz, 1950), X-ray scattering (Timasheff, 1963), or sedimentation equilibrium (Kielley and Harrington, 1960). It does not give the number of protein-ligand contacts.
That information must be obtained from nonthermodynamic techniques, such as calorimetric titration or the perturbation of a spectral property, which respond to contacts between protein and ligand molecules. Conversely the effective number of water molecules that hydrate a protein may be calculated by, e.g., the NMR (nuclear magnetic resonance) approach of Kuntz (1971). These contactdetecting techniques, in turn, cannot give a thermodynamic description of the interactions of a protein with a solvent system, nor can they detect cosolvent molecules that do not make contacts with the protein surface but are weakly perturbed by its proximity. A pictorial description of the process can be given in terms of a simple model which, however, does not correspond to the actual numbers of interacting sites nor to their occupancy. Such a simple model is presented in Fig. 4,in which the surface of the protein is depicted as a mosaic of loci that interact with water and ligand with various affinities. In this model, the contacts have been grouped into
SERGE N. TIMASHEFF
FIG. 4. Schematic representation of the thermodynamic state relative to bulk solvent of the surface of a protein dissolved in a mixed solvent. The total protein surface must make contact with water or cosolvent molecules. Depending on the free energy of interaction with the cosolvent system, three types of interactions may occur: (crosshatching) the cosolvent exchanges with water (these areas will be occupied by water or cosolvent depending on their relative affinities); (dotted areas) the cosolvent is excluded; (hatching, ////) thermodynamic indifference: the ratio of cosolvent to water molecules is the same as in the bulk-solvent medium at all solvent compositions. [Reprinted with permission from Timasheff (1992). Copyright 1997 American Chemical Society.]
three categories: (1) Those that have a significant affinity for ligand and undergo water-ligand exchange; they may manifest preferential binding, $B_3/B_1 > m_3/m_1$, or preferential exclusion, $B_3/B_1 < m_3/m_1$, depending on the relative affinities for water and the ligand. (2) Those that have essentially no affinity for ligand; they do not undergo water-ligand exchange and are occupied predominantly by water, $B_3/B_1 < m_3/m_1$; they always display negative values of $(\partial m_3/\partial m_2)_{T,\mu_1,\mu_3}$. (3) Those that are indifferent to contact with water or cosolvent; for them $B_3/B_1 = m_3/m_1$ and $(\partial m_3/\partial m_2)_{T,\mu_1,\mu_3} = 0$ (thermodynamic neutrality); it must be noted, however, that contact-detecting techniques will report these last sites. The affinities of the various sites for the solvent components vary with solvent composition because of the nonideality of the cosolvents
CONTROL OF PROTEIN STABILITY AND REACTIONS
(Schellman, 1990, 1993). Hence their distribution will change with solvent composition. Each site of the first kind will be occupied at any moment either by ligand or by water, depending on the relative affinities. As a time average, the occupancy of the n exchangeable sites, which may have nonidentical ligand affinity, will be $B_3$ molecules of ligand and $B_1^{\mathrm{Ex}}$ molecules of water. The sites of the second category make no contribution to $B_3$; these will be occupied by water molecules that are not exchangeable with ligand, $B_1^{\mathrm{Nex}}$. Sites of the third category will not be detected by any equilibrium thermodynamic approach, whether they are occupied by water or cosolvent. The measured dialysis equilibrium binding being the sum of all three occupancies, Eq. (12) becomes
How can these types of sites be resolved? At present there is no way of sorting out these weak surface interactions. A description of the situation for any real protein-cosolvent system can be obtained, however, by combining dialysis equilibrium binding results with those of a protein-ligand contact-detecting technique, under the unrealistic assumption that the affinity of all n exchangeable sites is identical. Then the contact-counting techniques yield an apparent value of $\sum B_3$. If the values so obtained as a function of cosolvent concentration are plotted in terms of a Scatchard or similar plot, it is possible to obtain values of n, the descriptive number of exchangeable sites, and of an apparent binding constant, $K^{\mathrm{app}}$, which will differ greatly from the true values (Schellman, 1994). Assignment of a model of exchange, such as the number, r, of water molecules displaced by each cosolvent molecule, gives effective values of $B_1^{\mathrm{Ex}} = r(n - B_3)$. Combination with the dialysis equilibrium binding results gives, by Eq. (13), $B_1^{\mathrm{Nex}}$. Alternatively, the effective number of water molecules that interact with the protein ($\sum B_1^{\mathrm{Ex}} + B_1^{\mathrm{Nex}}$) can be estimated by the NMR technique of Kuntz (1971; see also Kuntz and Kauzmann, 1974) or the vapor pressure measurements of Bull and Breese (1974). Combination with measured values of $(\partial m_3/\partial m_2)_{T,\mu_1,\mu_3}$ gives, by Eq. (13), the value of $B_3$. Examples of such calculations are given in what follows (Sections IV,B,2 and IV,B,3) for the urea denaturation of RNase A.
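The bookkeeping described in this paragraph can be sketched numerically. The following is only an illustration under stated assumptions: it takes the relation between preferential binding and effective occupancies to have the form $(\partial m_3/\partial m_2) = B_3 - (m_3/m_1)(B_1^{\mathrm{Ex}} + B_1^{\mathrm{Nex}})$, treats all exchangeable sites as identical, and uses invented input numbers. The function name and arguments are hypothetical, not taken from the chapter.

```python
M1_WATER = 55.56  # mol of water per 1000 g


def exchange_bookkeeping(pref_binding, b3, n_sites, r, m3, m1=M1_WATER):
    """Split effective water occupancy into exchangeable and nonexchangeable parts.

    Assumed relation (a sketch, not the chapter's printed equation):
        pref_binding = B3 - (m3 / m1) * (B1_ex + B1_nex)

    pref_binding : measured preferential binding (mol cosolvent / mol protein)
    b3           : cosolvent occupancy from a contact-counting technique
    n_sites      : descriptive number of identical exchangeable sites
    r            : water molecules displaced per bound cosolvent molecule
    m3           : cosolvent molality
    """
    b1_exchangeable = r * (n_sites - b3)            # water left on exchangeable sites
    b1_total = (b3 - pref_binding) * m1 / m3        # total effective water occupancy
    b1_nonexchangeable = b1_total - b1_exchangeable
    return b1_exchangeable, b1_total, b1_nonexchangeable


# Invented numbers, chosen only to show the arithmetic.
b1_ex, b1_tot, b1_nex = exchange_bookkeeping(
    pref_binding=-2.0, b3=3.0, n_sites=10, r=1.0, m3=1.0
)
print(b1_ex, b1_tot, b1_nex)
```

With these made-up inputs a small negative preferential binding coexists with a positive cosolvent occupancy, because the water term is weighted by $m_1/m_3$.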
E. Exchange at Sites

While combination of preferential binding with counting of effective contacts gives a descriptive value of $B_3$ and $B_1$ within the model of
identical exchangeable sites, further classification of the effective water molecules into exchangeable and nonexchangeable requires a model for the exchange at sites. The question of exchange has been addressed by Schellman (1987, 1990, 1993, 1994) in a series of important theoretical papers. Treating explicitly the simple model in which one ligand molecule replaces one water molecule at totally independent loci on the protein surface, Schellman has derived the relation between the exchange constant, $K_{\mathrm{ex}}$ [see Eq. (5)], and preferential binding:
where $K'_{\mathrm{ex}}$ is the exchange constant in mole-fraction units; $K_{\mathrm{ex}}$, the constant in molal units; and $X_3$, the mole fraction of cosolvent. The values of the exchange constants as expressed in Eq. (14) vary with solvent composition because of nonideality. The relation to the intrinsic exchange constant is $K'_{\mathrm{ex}} = K_{\mathrm{int}}(f_1/f_3)$, where $f_1$ and $f_3$ are the activity coefficients of water and cosolvent, respectively. Hence the cosolvent concentration dependence of preferential binding, which may change sign while $K_{\mathrm{int}}$ remains the same (Schellman, 1990). Summation over all the n exchangeable sites of identical $K_{\mathrm{ex}}$ and the nonexchangeable sites and combination of Eqs. (13) and (14) gives
Equation (15) indicates that neglect of the nonexchangeable sites must always give values that are greater than or equal to the measured preferential binding. The concept of exchange gives a ready explanation for the observed negative stoichiometries of binding measured by dialysis equilibrium for a large number of structure-stabilizing and salting-out cosolvents. The exchange equilibrium of Eq. (4) is described in molecular terms by Eqs. (12) and (13). The Schellman exchange analysis of Eq. (14) shows how a positive interaction constant, $K_{\mathrm{ex}}$, can result in negative dialysis equilibrium stoichiometries of binding. Classical treatment leads to the absurd conclusion that there must be a negative equilibrium constant, as first pointed out by von Hippel et al. (1973). The introduction of $(K_{\mathrm{ex}} - 1/m_1)$ in the numerator of Eq. (14) takes into account specifically the binding of water at the exchangeable sites. This satisfies the requirement that an equilibrium constant must be positive, even though the measured thermodynamic binding is negative. The exchange equilibrium criterion, for the model treated by Schellman, is established by Eq. (14). It is the magnitude of $K_{\mathrm{ex}}$ relative to $1/m_1$. Since $m_1$ = 55.56 mol of H₂O per 1000 g, $1/m_1$ = 0.018 m⁻¹. This means that when $K_{\mathrm{ex}}$ < 0.018 m⁻¹, there is preferential hydration, i.e., a negative binding stoichiometry; when $K_{\mathrm{ex}}$ > 0.018 m⁻¹, the observation is that of positive binding; $K_{\mathrm{ex}}$ = 0.018 m⁻¹ is the point of no dialysis equilibrium binding. If we return to Fig. 2, this means that within this model, for the MgCl₂ system, $K_{\mathrm{ex}}$ < 0.018 m⁻¹ below 2.5 M salt, and $K_{\mathrm{ex}}$ > 0.018 m⁻¹ above that concentration, with $K_{\mathrm{ex}}$ = 0.018 m⁻¹ at 2.5 M MgCl₂. For cosolvents, the exchange equilibrium at sites corresponds to free energy changes of 51501 cal when expressed on the molal scale.
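The sign criterion stated above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a per-site expression of the form $(K_{\mathrm{ex}} - 1/m_1)\,m_3/(1 + K_{\mathrm{ex}} m_3)$; only the numerator is taken from the text, while the denominator and the function names are assumptions made for illustration. The sign of the result depends on the numerator alone.

```python
M1_WATER = 55.56  # mol of water per 1000 g, so 1/m1 is about 0.018 per molal


def per_site_binding(k_ex, m3, m1=M1_WATER):
    """Assumed per-site preferential binding for one-to-one water/cosolvent exchange.

    The (k_ex - 1/m1) numerator follows the text; the denominator is an
    illustrative assumption and is always positive for k_ex, m3 >= 0.
    """
    return (k_ex - 1.0 / m1) * m3 / (1.0 + k_ex * m3)


def classify_exchange(k_ex, m1=M1_WATER):
    """Compare the molal exchange constant with 1/m1."""
    threshold = 1.0 / m1
    if k_ex > threshold:
        return "preferential binding"
    if k_ex < threshold:
        return "preferential hydration"
    return "no net dialysis equilibrium binding"


for k_ex in (0.005, 0.018, 0.05):
    print(k_ex, classify_exchange(k_ex), per_site_binding(k_ex, m3=1.0))
```

A positive exchange constant smaller than $1/m_1$ therefore yields a negative measured stoichiometry without requiring a negative equilibrium constant.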
F. Additivity; Compensation

The observed dialysis equilibrium binding is the summation of interactions at a large number of sites, each characterized by its own set of affinities for water and the cosolvent. In this sense, the preferential interactions are additive. Additivity is found also when the cosolvent system contains more than one type of particle. This is true of all salts which dissociate into their cations and anions, as well as of mixed cosolvents. Analysis of the preferential interactions of salts has shown that they follow the Hofmeister series both for anions and cations. The induction of preferential hydration decreases in the order SO₄²⁻ > CH₃COO⁻ > Cl⁻ > SCN⁻ for anions and Na⁺ > X²⁺ > Gua⁺ for cations (Arakawa and Timasheff, 1984b). When paired in a salt, the effects are additive, as shown in Table I.

TABLE I
Hofmeister Progression of Preferential Hydration of BSA in 1 M Salts at 20°C (pH 4.5-5.6)^a

Values are $(\partial g_1/\partial g_2)_{T,\mu_1,\mu_3}$, in grams of water per gram of protein.

                          Anion
Cation                    Cl⁻               OAc⁻              SO₄²⁻
Na⁺                       0.24              0.31              0.52
Mg²⁺, Ca²⁺, Ba²⁺          0.04 to -0.04     0.11 to 0.18      0.34
Gua⁺                      -0.24             0.08              0.21

^a Data from Arakawa and Timasheff (1984b).

For example, guanidinium ions are preferentially bound to proteins; Cl⁻ is weakly excluded; the pair GuaHCl is preferentially bound. Yet, in 1 M salt, Gua⁺ is not capable of overwhelming the effect of SO₄²⁻, which is strongly excluded. As a consequence, Gua₂SO₄ is preferentially excluded, although to a lesser extent than Na₂SO₄. The progression of salt preferential hydration is reflected in their effects on macromolecules. Thus, Na₂SO₄ is a strong salting-out agent, while NaSCN is a protein solubilizer (von Hippel and Schleich, 1969); GuaHCl and GuaSCN are among the best protein denaturants; Gua₂SO₄ stabilizes the native structure of RNase A. The additivity is strikingly evident in the two examples of Fig. 5. Figure 5A shows that the ability of Gua⁺ salts to denature proteins follows the Hofmeister series, as do
their preferential interactions (von Hippel and Wong, 1965). In the other example, the effect of salts on the solubility of acetyltetraglycine ethyl ester is found to be totally additive between cations and anions (Robinson and Jencks, 1965a,b). As shown in Fig. 5B, LiBr is a salting-in agent for this model tetrapeptide; NaCl is a salting-out salt. Both NaBr and LiCl had no effect, as each ion compensated for the effect of the other one in an additive manner. Two particularly interesting salts are MgCl₂ and Gua₂SO₄. As shown in Fig. 2, MgCl₂ is strongly excluded from β-lactoglobulin at pH 3. This becomes reversed at 3 M salt. In the case of Gua₂SO₄, the situation is just the opposite (see Table II below, in
FIG. 5. Additivity and compensation of Hofmeister ions. (A) Effect of guanidinium salts and urea on the transition temperature of RNase A (pH 7.0, 0.013 M sodium cacodylate, 0.15 M NaCl). (B) Salting in and salting out of acetyl tetraglycine ethyl ester (ATGEE) by varying concentrations of halide salts at 25°C, expressed as the ratio of the pentapeptide solubility (S) in water to that in salt. [Figure 5A reprinted with permission from von Hippel and Wong (1965). Copyright 1997 The American Society for Biochemistry & Molecular Biology. Figure 5B reprinted with permission from Robinson and Jencks (1965a). Copyright 1997 American Chemical Society.]
Section IV,A,1). At 0.5 M, it is preferentially bound to BSA, but it is excluded above that concentration. The two observations may be explained in terms of compensation. Both the Mg²⁺ and Gua⁺ ions have affinity for sites on protein molecules. Yet both Cl⁻ and SO₄²⁻ ions promote preferential exclusion, SO₄²⁻ doing so more strongly than Cl⁻. Nevertheless, at high MgCl₂ concentration, Mg²⁺ ion binding becomes sufficiently high to overcome the opposing effect. In the case of Gua₂SO₄, the SO₄²⁻ anion is very strongly excluded and overcomes the attractive interaction of Gua⁺, the net result being preferential exclusion. Compensation and additivity are also found in mixtures of neutral cosolvents. An example of this is the mixture urea-trimethylamine N-oxide (TMAO). Following the observation that some fish store urea and methylamines (principally TMAO) as osmolytes (Somero, 1986), Yancey and Somero (1979, 1980) have examined their effects individually and in a mixture on the $K_m$ of some fish enzymes, as well as on the stability ($T_m$) of ribonuclease A. In both studies, urea and TMAO had opposite effects on the observed properties, while their mixture in the 2:1 urea/TMAO molar ratio, as found in the fish, gave full compensation: $K_m$ maintained its dilute buffer values. The compensation between these two cosolvents is shown in Fig. 6A for the melting of RNase A (Yancey and Somero, 1979). It is seen that the two cosolvents displace $T_m$ in opposite directions. The effect of their mixture on $T_m$, however, is the arithmetic sum of the individual effects. Similar results were obtained with RNase T1 (Lin and Timasheff, 1994). Figure 6B shows the preferential binding of urea to native and unfolded RNase T1 in the absence and the presence of TMAO (Lin and Timasheff, 1994). The lack of effect of the methylamine on the preferential binding of urea indicates that the two cosolvents interact with the protein in an additive noncooperative manner.
III. WYMAN LINKAGES IN PREFERENTIAL INTERACTIONS

The very weak interactions of cosolvents with proteins and other biological systems can modulate a vast spectrum of biochemical reactions and transitions undergone by biological macromolecules and assembled organelles. This is the consequence of the additivity of interactions at multiple sites (Schellman, 1975, 1987, 1990). These reactions can be classified under five general categories: (1) effect on protein stability (stabilization-denaturation); (2) solubility (salting in-salting out); (3) self-assembly of subunit systems; (4) modulation of enzymic and other biochemical reactions; (5) binding of ligands.
FIG. 6. Compensation of the effects of urea and TMAO. (A) Midpoint of the thermal unfolding transition of RNase A at pH 7, plotted against [Urea] (mM) and [Other solutes] (mM): control; urea; TMAO; betaine; β-alanine; sarcosine; taurine; urea and trimethylamine N-oxide. (B) Cosolvent concentration dependence of the preferential binding of cosolvent to protein: RNase T1 in urea; RCM-T1 in urea; RNase T1 in TMAO; RCM-T1 in TMAO; RNase T1 in solutions of urea and TMAO with a molar ratio of 2:1; RCM-T1 in solutions of urea and TMAO with a molar ratio of 2:1. The $g_3$ values for the ternary solvents indicate the concentration of urea. In the mixed cosolvents experiments the preferential binding of urea only is detected. It is seen to be identical whether TMAO is present or not. [Figure 6A reprinted from Yancey and Somero (1979). Figure 6B reprinted with permission from Lin and Timasheff (1994). Copyright 1997 American Chemical Society.]
A. General Considerations

The effect of a cosolvent on any reaction in equilibrium can be examined with respect to two reference states: (1) a solvent of the given composition and (2) water (in practice dilute buffer). When the reference state is a solvent of the given composition used in the experiment, the following question is asked: In what direction will the addition of an infinitesimal amount of cosolvent orient the reaction? The control exercised by the cosolvent is given by the Wyman linkage equation [Eq. (1)], expressed in terms of preferential binding [Eq. (16a)] or of effective site occupancy [Eq. (16b)] (Tanford, 1969; Aune et al., 1971):
When the reference state is water, i.e., the equilibrium in cosolvent of concentration m3 is compared to that in water, then the influence of the cosolvent on the reaction is expressed by the difference between the transfer free energies of the products and the reactants:
where $\Delta G^{\circ}_{m_3}$ and $\Delta G^{\circ}_{w}$ are the standard free energies of the reaction in solvent of composition $m_3$ and in water, respectively. Equation (17) is the integral of the Wyman linkage relation [Eq. (16a)] if the latter is expressed as the perturbation of the chemical potential of the protein by addition of the solvent. Then, by Eq. (7),
Integration of Eq. (18) gives (Schellman, 1975; Vlachy and Lapanje, 1978)
If the preferential interactions are expressed in terms of the effective site occupancies, Eqs. (19b and c) become, by introduction of Eq. (12),
B. Control Is Exercised by the Difference in Preferential Interactions Whatever Their Sign
As stated above, cosolvents can exert an action on a protein reaction only as a consequence of the fact that their action reflects the sum of interactions at a multitude of sites on the protein, as depicted in Fig. 4. For the I sites on the protein surface, the changes in Wyman slope [Eq. (16)] and transfer free energy [Eq. (17)] are summations of the changes at each site, i:
Therefore, additivity and compensation are functional also in the control of biochemical reactions. Equations (21a) and (21b) tell that the local interactions at each individual site can favor one or the other end state of the reaction, or they may be neutral. The overall effect, however, is the final balance of the many weak local linkages. It is the opposing driving forces at different sites that frequently lead to the
requirement that effectors must be present at very high concentration (e.g., 8 M urea). What are the consequences of these relations? If we take the linkage equation [Eq. (16a)] and plot the equilibrium constant logarithmically as a function of the cosolvent activity, the slope at any point defines the direction in which a cosolvent displaces the equilibrium at that solvent composition: if $\delta(\partial m_3/\partial m_2)$ is positive,³ the cosolvent is an activator of the reaction; if it is negative, the cosolvent is an inhibitor of the reaction; if it is zero, the cosolvent has no effect on the reaction, i.e., it is inert. If the reference state is water, the requirement is that there must be a change in transfer free energy during the course of the reaction. These relations are illustrated in Fig. 7A and B (see p. 384) for three possible situations: panel A presents the dependence of the linkage slopes on cosolvent concentration; panel B gives the corresponding change in transfer free energy. In the first case, the Wyman slope is positive at all concentrations of cosolvent. This means that at any solvent composition, addition of an infinitesimal amount of cosolvent will enhance the reaction. The corresponding change in transfer free energy shows a monotone increase in enhancement relative to water. Hence, addition of the cosolvent will drive the reaction at all concentrations. The second case is the converse, i.e., increasing inhibition with cosolvent concentration, whether at any individual point or relative to water. The third case illustrates what might strike one as an apparent contradiction between the two types of analyses. In it, the slope is positive at low ligand concentrations, which means activation. It increases up to a maximal point, then passes through zero and assumes negative values. The corresponding change in transfer free energy, however, indicates that, relative to water, the cosolvent activates the reaction at all concentrations. This activation attains a maximal value and then starts weakening. This is reflected by the changes in the slope shown in Fig. 7A. The point at which the Wyman slope seems to indicate maximal activation is only the point of the maximal increase in activation relative to water. The maximal activation occurs at the concentration at which the Wyman slope is zero. At higher concentrations, the negative values of $\delta(\partial m_3/\partial m_2)$ indicate that each infinitesimal addition of cosolvent progressively weakens the reaction; i.e., the strength of activation relative to water becomes smaller. Reversal of the signs of the ordinate for the dotted lines (example 3) would describe the converse situation, in which the cosolvent acts as an inhibitor of the process relative to water at all solvent compositions but becomes an activator in the differential form at high concentrations.

³ From this point on, the subscripts on the partial derivatives will be dropped in the text.
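The distinction drawn in the third case, between the differential (Wyman slope) and integral (relative to water) views, can be made concrete with a small numerical sketch. Everything below is hypothetical: the slope function is an invented bell-shaped form, the cosolvent is treated as ideal so that $d\ln a_3 = d\ln m_3$, and the integration starts at a very dilute solution as a stand-in for water.

```python
def wyman_slope(m3, c=2.0, m_zero=3.0):
    """Invented 'case 3' linkage slope, d ln K / d ln a3, as a function of m3.

    Positive at low m3, maximal at m_zero / 2, zero at m_zero, negative above.
    """
    return c * m3 * (1.0 - m3 / m_zero)


def integrate_ln_k(m3_grid, slope=wyman_slope):
    """Trapezoidal integral of slope * d(ln m3) along the grid.

    Returns ln(K / K_ref) at each grid point, where K_ref is the value at the
    first (very dilute) grid point, used here as a proxy for water.
    """
    ln_k = [0.0]
    for lo, hi in zip(m3_grid, m3_grid[1:]):
        f_lo = slope(lo) / lo   # d ln K / d m3 at the lower point
        f_hi = slope(hi) / hi   # d ln K / d m3 at the upper point
        ln_k.append(ln_k[-1] + 0.5 * (f_lo + f_hi) * (hi - lo))
    return ln_k


grid = [0.01 * i for i in range(1, 501)]          # 0.01 to 5.00 molal
slopes = [wyman_slope(m) for m in grid]
ln_k = integrate_ln_k(grid)

m_steepest = grid[slopes.index(max(slopes))]      # maximal Wyman slope
m_max_activation = grid[ln_k.index(max(ln_k))]    # maximal ln(K / K_ref)
print(m_steepest, m_max_activation)
```

In this toy model the slope peaks at half the concentration at which the integrated quantity peaks, and the integrated activation stays positive even where the slope has turned negative, which is the pattern described for case 3.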
The difference between the preferential binding of the cosolvent to the reactant and to the product may be generated in a variety of ways. Since this seems to induce some confusion when dealt with in terms of preferential interactions, we shall now list some possibilities, as depicted in Fig. 7C and D. Specifically, let us take the inhibition of a reaction, e.g., protein stabilization by cosolvents, i.e., inhibition of the native ⇌ denatured (N ⇌ D) equilibrium. Inhibition requires that $\delta(\partial m_3/\partial m_2)$ be negative. A positive value means promotion. Since stabilizing cosolvents are preferentially excluded from proteins at 20°C, the first case is the one expected. In it, the preferential exclusion is increasingly greater for the denatured state than for the native protein. This is the expected situation with sugars at 20°C (Lee and Timasheff, 1981; Arakawa and Timasheff, 1984a), although preferential binding values for the unfolded proteins can only be inferred. The same result can be obtained when both end states of the protein preferentially bind the cosolvent, binding to the end state (denatured) being less than that to the starting material (case 2); the change in $(\partial m_3/\partial m_2)$ is negative, hence the process is inhibited even though dialysis equilibrium gives preferential binding. In some situations the preferential exclusion from the native protein has been found to decrease with cosolvent concentration and actually to change to preferential binding at high concentration (case 3). This is the case of MgCl₂ interaction with β-lactoglobulin at pH 3.0, shown in Fig. 2. Nevertheless the cosolvent acts as a protein structure stabilizer, i.e., an inhibitor of the N ⇌ D equilibrium (Arakawa et al., 1990b). The inference is that exclusion from the denatured end state of the equilibrium is greater, or binding to it is smaller. An interesting situation, described by pattern 4, is when a cosolvent reverses its action on the equilibrium. This is akin to the dashed line of Fig. 7A. In the example depicted here, cosolvent preferentially binds at all concentrations to both end states of the reaction. However, at low concentration, the binding is greater to the native than to the denatured state. This results in inhibition of the equilibrium. As cosolvent concentration is raised, the pattern reverses itself, binding becomes stronger to the end state of the protein, and the cosolvent becomes a promoter of the reaction. Similar variations can be described for a single cosolvent concentration, but with temperature as the variable parameter.
C. Specific Systems Controlled by Weak Interactions with Cosolvents

The influence of a cosolvent on a reaction relative to water is expressed by the change in standard free energy, $\Delta G^{\circ}$, and by Eq. (17), in transfer free energy, $\Delta\mu_{2,\mathrm{tr}}$. For the sake of analysis, a convenient way to express
FIG. 7. Patterns of preferential interactions in modulating biochemical reactions. (A,B) Variations of the change in preferential interactions linked to an equilibrium: (A) slope in Wyman linkage relation [Eq. (16a)]; (B) corresponding variations of the transfer free energy ($\delta\Delta\mu_{2,\mathrm{tr}}$). Case 1, activator with respect to water at all cosolvent concentrations ($m_3$); case 2, inhibitor relative to water at all $m_3$; case 3, activator with respect to water, with a bell-shaped dependence on cosolvent concentration. At the maximal value of $\delta\Delta\mu_{2,\mathrm{tr}}$, the sign of the slope in the log K vs. log $a_3$ plot reverses from positive (activation) to negative (inhibition); note that the zero point in the linkage plot corresponds to maximal activation relative to water, while the maximal slope of the linkage plot corresponds to the maximal rate of increase of $-\delta\Delta\mu_{2,\mathrm{tr}}$.
FIG. 7. (continued) (C,D) Combinations of preferential binding of the two end states of an equilibrium, here denoted for the inhibition of the N ⇌ D reaction. The dependence is either on cosolvent concentration ($m_3$) or temperature. Case 1, preferential exclusion is greater from the product than the reactant; case 2, preferential binding is greater to the native than to the denatured protein; case 3, for the native protein the interaction changes from exclusion to binding, and for the denatured form there is exclusion at all $m_3$ (or temperature); case 4, preferential binding is first smaller to the denatured than the native protein, but this relation reverses above a crossover point: there is inhibition at low $m_3$ or temperature and promotion at high $m_3$ or temperature (e.g., stabilization changes to denaturation).
Eq. (17) is through the thermodynamic cycle. For any equilibrium that is modulated by a cosolvent, that is the classical thermodynamic box [Eq. (22)]:
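$$
\begin{array}{ccc}
\text{Reactant (water)} & \xrightarrow{\ \Delta G^{\circ}_{w}\ } & \text{Product (water)} \\
\Big\downarrow \Delta\mu^{R}_{2,\mathrm{tr}} & & \Big\downarrow \Delta\mu^{P}_{2,\mathrm{tr}} \\
\text{Reactant (cosolvent, } m_3) & \xrightarrow{\ \Delta G^{\circ}_{m_3}\ } & \text{Product (cosolvent, } m_3)
\end{array}
\tag{22}
$$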
1. Stabilization (denaturation) (N ⇌ D):

2. Self-assembly ($P_n + P \rightleftharpoons P_{n+1}$):

where P refers to a protein subunit freely dispersed in solution, and $P_{n+1}$ refers to the same subunit incorporated into the assembled structure, say, a microtubule or a subunit enzyme.

3. Conformational transition (P ⇌ P*):

Here the equilibrium refers to, say, the activation of an enzyme, with P being the inactive form and P* the active form whose formation is affected (promoted or inhibited) by the cosolvent.

4. Solubility [protein dispersed in solution (Sol) ⇌ precipitate (Pr)]:

where $S_2$ is the protein solubility. For salting-out salts, at high salt concentration solubility follows the empirical equation (Green, 1932):
where β and $K_s$ are empirical constants. The slope of this plot gives $K_s$, the salting-out constant. Combination of Eqs. (26) and (8) shows that $K_s$ is a Wyman linkage description of the process of precipitation, as (Arakawa and Timasheff, 1985b; Timasheff and Arakawa, 1988)

It must be stressed that this describes only the phase separation and has no bearing on protein crystallization, which is a different, post-phase separation process that involves nucleation and growth.

5. Ligand binding (P + L ⇌ PL):

In this case the thermodynamic effect of the cosolvent on the free ligand must be taken into consideration explicitly. This can be measured either by vapor phase equilibrium or by solubility, as

where $\gamma_L^{w}$ and $\gamma_L^{m_3}$ are the activity coefficients of the ligand in water and in the cosolvent system, respectively.
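For the solubility case (item 4), the salting-out constant is read from the slope of a solubility plot. The sketch below shows one way to do this, assuming the empirical relation has the linear form $\log S_2 = \beta - K_s m_3$ at high salt concentration. The data are synthetic, generated from invented constants purely to exercise the fit, and the function name is hypothetical.

```python
def fit_salting_out(salt_conc, log_solubility):
    """Ordinary least-squares fit of log S2 = beta - K_s * m3.

    Returns (beta, k_s). The linear form is an assumption of this sketch.
    """
    n = len(salt_conc)
    mean_x = sum(salt_conc) / n
    mean_y = sum(log_solubility) / n
    sxx = sum((x - mean_x) ** 2 for x in salt_conc)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(salt_conc, log_solubility))
    slope = sxy / sxx
    beta = mean_y - slope * mean_x
    return beta, -slope   # K_s is the negative of the slope


# Synthetic points generated with beta = 2.0 and K_s = 0.8 (invented values).
salt_conc = [1.0, 1.5, 2.0, 2.5, 3.0]
log_solubility = [2.0 - 0.8 * m for m in salt_conc]

beta, k_s = fit_salting_out(salt_conc, log_solubility)
print(beta, k_s)
```

With real measurements the fit should be restricted to the high-salt region where the plot is linear, since the relation is stated only for that range.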
IV. LINKAGE CONTROL OF PROTEIN STABILITY

A. Protein Stabilization

1. Preferential Exclusion
The influence of cosolvents on protein stability has been known for many decades. It was accepted, however, that there were two distinct classes of effectors: protein structure stabilizers and protein denaturants (i.e., structure destabilizers), which operated by unrelated mechanisms. It was accepted that denaturants such as urea or GuaHCl act by binding in the classical sense to proteins. The rationale for stabilization by sucrose, glycerol, and other stabilizers was quite vague, and diverse explanations were proposed, such as binding to some key sites on the protein or the formation
of a protective shell around the protein molecule. In fact, as outlined in the preceding sections, whether a particular cosolvent will stabilize or destabilize proteins is a function strictly of its differential interactions with protein groups in contact with solvent in the two end states of the general N ⇌ D equilibrium. The general observation has been that those cosolvents that stabilize protein structure are preferentially excluded from the protein surface at 20°C, whereas those that induce unfolding in general interact favorably with the unfolded state of the protein. The converse, however, may not necessarily be true. A selected list of the interaction parameters of cosolvents with proteins is given in Table II. The phenomenon of preferential exclusion and its interpretation in terms of multicomponent thermodynamics has been known for a long time, both for the interaction of salts with DNA (Timasheff, 1963; Hearst, 1965; Cohen and Eisenberg, 1968) and with proteins (Cox and Shumaker, 1961; Ifft and Vinograd, 1962, 1966; Hade and Tanford, 1967; Aune and Timasheff, 1970). In an elegant study, von Hippel et al. (1973) have found that the binding of neutral salts to polyacrylamide columns gave negative values of the elution equilibrium constant. This led them to the conclusion that surface groups in the column were preferentially hydrated in the presence of the salts, which were unable to displace water molecules from the amide dipole. These early observations eventually found theoretical interpretation in the principles of exchange developed by Schellman (1978, 1987, 1990, 1993) and the classification of water molecules into nonexchangeable and exchangeable ones. A detailed analysis of preferential hydration at 20°C in a variety of solvent systems has led to the classification of preferentially excluded cosolvents into two general classes (Arakawa et al., 1990b). In the first class, the preferential hydration is essentially independent of cosolvent concentration and solvent pH. These cosolvents have always been observed to stabilize the structure of proteins and to reduce protein solubility. In the second class, the preferential hydration is strongly dependent on cosolvent concentration or pH, or both. Their effect on protein stability cannot be predicted from their preferential interactions with the native protein, even though they may be good protein precipitants. The first class of cosolvents consists of sugars, some polyols including glycerol, small amino acids, and certain salting-out salts such as Na₂SO₄, MgSO₄, and NaCl. Many of the neutral compounds are found in nature as osmolytes and cryoprotectants. The second class includes MgCl₂, some amino acid salts, PEGs, and MPD.⁵

⁵ It must be noted that errors of measurement of preferential interactions, under the best circumstances, are ±0.004-0.010 in $(\partial g_3/\partial g_2)$. This, for the interaction of a protein of molecular weight 64,000 with, e.g., 1 M glucose, gives errors of ±1 in $B_3$ [of Eq. (12)] and ±55 in $B_1$, i.e., the effective number of water molecules in contact with the protein surface.
TABLE II
Thermodynamics of Protein-Solvent Interactions in Weakly Interacting Systems^a
1M 3 M
CTGen PH 2
Glucoseb -0.080 0.394 0.205 -0.168
0.4 M 0.4 M
BSA, pH 6 BSA,pH 3
Lactose -0.048 -0.100
20% 40%
CTGen PH 2
5% 10%
BSA
20% 40 % 50%
RNase A pH 5.8 25°C
10%
BSA
50%
25°C. pH 2
25°C
PLg
0.4 M 1.5 M 0.2 M 1.0 M 1.5 M
Lysozyme PH 7
0.5 M 1.5 M
BSA
PH 2
BSA
pH 5.7
6.6 23.2
0.321 0.665
12.7 26.4
2.1 4.8
Glycerol' -0.081 -0.161
0.258 0.195
3.9 2.9
13.9 32.6
Inositold -0.021 -0.041
0.407 0.387
16.4 15.6
4.6 9.4
2-Methyl-2,4pentanediol (MPD)' -0.045 0.196 -0.474 0.810 -0.943 1.03
10% 20% 30%
pH 5.7
6.2 3.7
Propylene glycol' 0.138 -1.25 0.409 -0.44
Polyethylene glycol (PEG) 6009 -0.069 0.627 -0.112 0.464 -0.093 0.232
1.04 1.73 1.24
-50.3 -17.5
10.1 11.1 9.4
0.9 6.2 10.2
-70 -360
1.8 4.1 6.8
Arginine HClh -0.027 0.305 0.230 -0.093 0.010 -0.218 -0.028 0.114 -0.070 0.173
3.7 2.6 13.9 5.7 9.2
3.0 5.0 -3.4 -4.0 3.0
Sodium glutamate' -0.045 0.513 -0.133 0.457
40.6 36.2
22 67 ( continues)
Solvent concentration
Protein
BSA
(dgda&) (dg)
( a P da 4
/dg2) (g/g)
Arginyl glutamateh -0.024 0.361 -0.114 0.387
0.2 M 0.77 M
pH 5.7
0.25 M 1.0 M 0.25 M 1.0 M
Trimethylamine N-oxide (TMAO) RNase T1 0.001 -0.044 pH 7; 25°C -0.045 0.553 RCM-T1 -0.024 1.25 pH 7; 25°C -0.007 0.084
kcal/mol'
Aktr (kcal/mol)
23.3 24.2
5 20
-0.3 3.0 8.2 0.5
-0.34 1.2 2.6 5.3
J
Na2SOt -0.021 -0.067
0.287 0.459
21.5 33.6
RNase pH 2.8 pH 5.5
MgSO," -0.047 -0.028 -0.066 -0.032
0.388 0.384 0.452 0.440
17.7 3.1 5.1 3.5
0.5 M 2.0 M 0.5 M 2.0 M
PLg pH 2.0 pH 5.1 pH 5.1
MgC12" -0.015 0.304 -0.030 0.148 0.002 -0.051 0.002 -0.007
21.6 23.9 - 3.6 - 1.2
13 47 -3.5 -2.5
2M 6M 8M 2M 6M 8M
pLg" pH 5.5
- 7.0
-15 -49 -60 4.4 13.0 16.1
0.5 M 1.0 M 2.0 M
BSA
0.5 M 1.0 M
BSA
1.0 M 0.6 M 1.2 M 0.6 M
BSA, pH 4.5
3.0 M 5.2 M 6.3 M
pH 4.5
Myoglobin" pH 7.0
pH 4.5
Urea 0.05 0.18 0.16 -0.03 -0.07 -0.05
-0.40 -0.36 -0.22 0.24 0.14 0.07
Guanidine sulfate 0.008 -0.071 -0.052 0.211 -0.184 0.316
Guanidine hydrochloride BSA 0.200 -0.55 25°C 0.134 -0.17 0.1 M DTT 0.038 -0.035
-5.8 -3.4 1.9 1.0 0.5
-4.6 8.7 15.7
-22.4 -5.7 -0.9
-6.6
-5.1 19.1
- 127 - 181 -189 (continues)
TABLE II (continued)
0.8 M 1.3 M 3.5 M 6.5 M
RNase A' pH 7.0
-0.013 0.004 0.044 0
0.141 -0.03 -0.099 0
0.82 -0.15 -0.38 0
0.92 1.05 -0.56 -1.15
^a All values are at 20°C, unless specified otherwise, extrapolated to zero protein concentration. Key: BSA = bovine serum albumin; DTT = dithiothreitol; β-Lg = β-lactoglobulin; RCM-T1 = reduced and carboxymethylated form of RNase T1. "Arakawa and Timasheff. 'Arakawa and Timasheff, "Arakawa et al., 1990a. 1982a. 1985a. Poklar and Lapanje, 1992. 'Gekko and Timasheff, Kita et al., 1994. "Zerovnik and Lapanje, 1981. 'Arakawa and Timasheff, 1986. "Gekko and Morikawa, 1984c. P Arakawa and Timasheff, 1981. Lin and Timasheff, 1994. 1984b. Pittz and Timasheff, 1978. Arakawa and Timasheff, q Reisler et al., 1977. 'V. Prakash and S. N. Tima'Gekko and Koga, 1984. 1982b. 'Arakawa et al., 1990b. sheff, unpublished. J
2. Families of Preferentially Excluded Agents
a. Sugars. Among sugars, preferential interaction studies have been carried out on sucrose, trehalose, glucose, and lactose. The preferential hydration in the presence of sucrose between 0.1 and 1.0 M was found to be invariant with sugar concentration, with (∂g1/∂g2) values of 0.32, 0.25, 0.45, and 0.24 g water per gram protein for chymotrypsinogen (CTGen), α-chymotrypsin, RNase A, and tubulin, respectively (Lee and Timasheff, 1981; Lee et al., 1975). When the (∂g3/∂g2) data were plotted as a function of g3 [Eq. (12a)], the straight-line fits extrapolated to zero values of A3 (and B3). This means that, in an aqueous sucrose solution within experimental error, the sugar does not occupy sites on the protein surface other than at thermodynamically indifferent loci. Hence, observably, there is total preferential exclusion, although neutral contacts are formed. The same was found for trehalose in its interactions with RNase A (pH 2.8 and 5.5) (Lin and Timasheff, 1996; Xie and Timasheff, 1997c) and for lactose in its preferential interactions with RNase A (pH 8.8), CTGen (pH 2.0), and bovine serum albumin (BSA) (pH 6.0) (Arakawa and Timasheff, 1982a). For BSA at pH 3.0 the preferential hydration was twice that measured at pH 6.0. This reflects the well-known expansion of BSA at acid pH (Yang and Foster, 1954; Tanford et al., 1955) and emphasizes the fact that preferential interactions are the sum of all site occupancies over the entire surface in contact with solvent. The same
is not true of glucose (Arakawa and Timasheff, 1982a). Of the sugars that have been studied, glucose is the only one that shows decreasing values of (∂g1/∂g2) over the concentration range between 0.5 and 2.0 M for four proteins. The values for CTGen are listed in Table II. This indicates gradual formation of contacts between the sugar and the protein surface as the glucose concentration is increased. Integration of the preferential interactions to obtain Δμ2,tr gave monotonically increasing functions of sugar concentration for the totally excluded sugars. In the case of glucose, Δμ2,tr was also positive, but it tended to a maximal value at high sugar concentration, above which contacts should become less unfavorable.
b. Glycerol. Glycerol presents a complicated pattern of interactions. The preferential hydration was found to have invariant low values for four proteins, α-chymotrypsin, RNase A, β-lactoglobulin (β-Lg), and tubulin (Gekko and Timasheff, 1981), with (∂g1/∂g2) values of 0.18, 0.15, 0.14, and 0.24 g water per gram protein, respectively. The one exception was CTGen, for which the preferential hydration decreased with glycerol concentration (Table II). The low values of the preferential hydration relative to the total hydration measured by Bull and Breese (1974) or calculated by the NMR method of Kuntz (1971) suggest some penetration of glycerol molecules to the surface of the protein. This was scrutinized in terms of Eq. (12) with the assumption that the protein hydration is the same as in pure water, since glycerol does not significantly affect the activity of water (Scatchard, 1946; Kozak et al., 1968). The increase of the resulting values of B3 with glycerol concentration indicates increasing penetration, which is consistent with the Law of Mass Action for the formation of contacts at surface sites.
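The integration of the preferential interactions into a transfer free energy, mentioned above for the sugars, can be sketched numerically. This is a minimal sketch under stated assumptions, not the procedure used in the cited studies: it assumes the standard relation (∂μ2/∂m3) = -(∂m3/∂m2)(∂μ3/∂m3), an ideal cosolvent for which (∂μ3/∂m3) = RT/m3, and total exclusion with a constant preferential hydration, so that (∂m3/∂m2) = -(m3/55.51)(∂m1/∂m2). The protein molar mass in the usage lines and all function names are hypothetical.

```python
# Sketch of the integration of preferential interactions to a transfer free energy.
# Assumptions (not taken from the text): ideal cosolvent and constant hydration.
R_KCAL = 1.987e-3        # gas constant, kcal mol^-1 K^-1
WATER_MOLALITY = 55.51   # mol of water per kg of water


def preferential_binding(m3, hydration_mol):
    """(dm3/dm2) for total exclusion with a constant preferential hydration
    (mol water per mol protein): -(m3 / 55.51) * hydration."""
    return -(m3 / WATER_MOLALITY) * hydration_mol


def dmu2_dm3(m3, temp_k, hydration_mol):
    """(dmu2/dm3) = -(dm3/dm2) * (dmu3/dm3), with (dmu3/dm3) = RT/m3 (ideal)."""
    return -preferential_binding(m3, hydration_mol) * R_KCAL * temp_k / m3


def transfer_free_energy(m3_max, temp_k, hydration_mol, steps=1000):
    """Midpoint-rule integral of (dmu2/dm3) from 0 to m3_max, in kcal/mol."""
    h = m3_max / steps
    total = sum(dmu2_dm3((i + 0.5) * h, temp_k, hydration_mol) for i in range(steps))
    return total * h


if __name__ == "__main__":
    # Hypothetical input: 0.45 g water per g protein and an assumed protein molar
    # mass of 13,700 g/mol, converted to mol water per mol protein.
    nu1 = 0.45 * 13_700 / 18.015
    for m3 in (0.25, 0.5, 1.0):
        print(m3, round(transfer_free_energy(m3, 293.15, nu1), 2))
```

With a constant preferential hydration the integrand is constant, so the computed transfer free energy is positive and grows linearly with cosolvent molality, in line with the monotonic increase described for the totally excluded sugars. A cosolvent such as glucose, whose preferential hydration decreases with concentration, would need a concentration-dependent hydration term and measured activity coefficients in place of the ideal approximation.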
c. Polyols. Gekko and Morikawa (1981) have measured the preferential interactions of BSA with a series of polyols (ethylene glycol, glycerol, xylitol, mannitol, sorbitol, and inositol). All except inositol gave low values of preferential hydration. The strong preferential hydration of inositol (Table II) was attributed to its strongly hydrophilic nature and high degree of hydration (Suggett, 1975). Gekko and Morikawa (1981) found a low preferential hydration of BSA in the presence of sorbitol (0.21 g water per gram protein). In contrast, when the protein was the highly polar RNase A, the preferential hydration was high but decreased with concentration (Xie and Timasheff, 1997a). The complexity of the correlation between cosolvent structure and preferential interactions is clearly seen in the difference between the values obtained for
the interactions of BSA with ethylene glycol (Gekko and Morikawa, 1981) and propylene glycol (Gekko and Koga, 1984). Ethylene glycol was preferentially excluded up to a 60% concentration, with an invariant preferential hydration of 0.14 g water per gram protein. Propylene glycol, on the other hand, manifested strong preferential binding (Table II). Yet, the more hydrophobic MPD was strongly preferentially excluded (Table II) (Pittz and Timasheff, 1978).
d. Salts. The two magnesium salts, MgSO4 and MgCl2, even though they share the same cation, belong to different classes. MgSO4 is a good structure stabilizer, and its preferential hydration values are usually high. The complexity of the variation of the preferential interactions of MgCl2 with its own concentration has been described above (Fig. 2). MgCl2 is generally regarded as a salting-in agent and a structure destabilizer (von Hippel and Schleich, 1969; Collins and Washabaugh, 1985). Nevertheless, under some limited conditions of concentration and pH, it can salt out proteins (Arakawa et al., 1990a). At acid pH MgCl2 raises the value of Tm of RNase A, but to a much smaller extent than does MgSO4. At pH 5.5, addition of MgCl2 has no effect on the stability. In fact the Tm values of RNase A in MgCl2 and MgSO4 follow parallel variations with pH, the Tm values in MgSO4 being higher by ~6°C (Xie and Timasheff, 1997b). This is clearly a consequence of the difference in the preferential exclusion capacities of the SO4²⁻ and Cl⁻ ions, which can compensate to different extents for the weak binding of Mg²⁺ ions to negatively charged loci on the protein. Comparison of the Mg²⁺ salts with the corresponding Na⁺ salts (Table II) shows an increase in preferential hydration with salt concentration for both Na2SO4 and MgSO4, while for MgCl2 there is a strong decrease. The good protein solubilizer, CaCl2, is preferentially bound to BSA, which reflects the favorable interaction free energy.
e. PEG and MPD. Of particular interest are the PEGs and MPD. Both of these organic molecules are used as effective protein-crystallizing agents (McPherson, 1982, 1985; King et al., 1956). In fact, at room temperature, MPD is used at concentrations as high as 60%. Yet, when the temperature is increased, both promote protein unfolding. Both PEG and MPD are hydrophobic in nature (Hammes and Schimmel, 1967; Pittz and Bello, 1971) and can interact with the additional protein nonpolar groups exposed on unfolding. Consistent with this is the finding by Lee and Lee (1987) that the decrease of Tm induced by 15% PEG 1000 at pH 3.0 was linearly proportional to protein hydrophobicity. For 20% PEG 1000, these authors found that at 20°C, dialysis equilibrium gave a preferential exclusion of (∂m3/∂m2) = -1.0 mol PEG 1000 per mol
of native CTGen and preferential binding of 1.3 mol PEG 1000 per mol of unfolded protein. As expected from the Wyman linkage relation, this change in preferential binding leads to the observed decrease in Tm. A similar study carried out with MPD showed that addition of 30% cosolvent to an aqueous solution of RNase A lowered Tm by 10°C (Arakawa et al., 1990b). Yet at 20°C, RNase A is preferentially hydrated in 30% MPD to the extent of 0.6 g H2O per gram protein (Pittz and Timasheff, 1978) (Table II). Although there are no preferential interaction data for the denatured protein, the inference is that on unfolding the free energy of interaction should become stronger by δΔμ2,tr = -3.4 kcal mol⁻¹.
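The linkage argument used for PEG 1000 reduces to a sign test on the change in preferential binding between the two end states. The sketch below illustrates that test. It assumes the linkage relation in the form d ln K / d ln a3 = δ(∂m3/∂m2), with δ taken as the denatured-state value minus the native-state value (the numbered linkage equations are not reproduced in this section), and the function names are hypothetical.

```python
# Sign criterion of the linkage relation; function names are hypothetical.

def binding_change_on_unfolding(binding_native, binding_denatured):
    """delta(dm3/dm2) for N -> D: denatured-state minus native-state value."""
    return binding_denatured - binding_native


def effect_on_stability(binding_native, binding_denatured):
    """Classify the cosolvent effect from the sign of the binding change."""
    delta = binding_change_on_unfolding(binding_native, binding_denatured)
    if delta > 0:
        return "destabilizing"   # cosolvent favors the denatured state
    if delta < 0:
        return "stabilizing"     # cosolvent favors the native state
    return "no net effect"


if __name__ == "__main__":
    # 20% PEG 1000 with CTGen at 20 C: -1.0 (native) and +1.3 (unfolded), from the text.
    print(effect_on_stability(-1.0, 1.3))
    # Hypothetical cosolvent bound to both states, but more strongly to the native one.
    print(effect_on_stability(3.0, 1.5))
```

The second call illustrates that the sign of the preferential binding itself does not decide the outcome: a cosolvent that is bound to both states still stabilizes the native form if it is bound less to the denatured one.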
f. Amino Acids, Amino Acid Salts, and Methylamines. Small neutral amino acids belong to the first category of stabilizing agents (Arakawa and Timasheff, 1983). Glycine, α-alanine, and β-alanine display an essentially concentration-independent degree of preferential hydration. The amino acid salts present a particularly interesting family of agents. Both sodium glutamate and potassium aspartate are strongly excluded (Arakawa and Timasheff, 1984c). The data for the NaGlu-BSA system at pH 7.0 are given in Table II. The preferential hydration is weakly concentration dependent. With lysozyme, the preferential hydration is lower. This reflects some attraction between the anionic amino acid and the positively charged protein. Similar observations were made with potassium aspartate (Kita et al., 1994). Lysine hydrochloride displays an exactly opposite picture. Preferential exclusion from lysozyme is now much greater than from BSA (Arakawa and Timasheff, 1984c). Arginine hydrochloride is drastically different (Kita et al., 1994). The data (shown in Table II) indicate preferential binding to BSA at 0.2 M salt and an increasing preferential hydration above 0.7 M salt. This reflects a monotonically decreasing favorable free energy of interaction, as Δμ2,tr changes from -3.4 kcal mol⁻¹ at 0.2 M salt to +3.0 kcal mol⁻¹ at 1.5 M ArgHCl. When the protein is the positively charged lysozyme, the interaction is that of preferential hydration at all salt concentrations. These observations can be interpreted in terms of compensation between binding and exclusion. ArgHCl raises the surface tension of water (Kita et al., 1994), which leads to preferential hydration (Lee and Timasheff, 1981). On the other hand, the Arg⁺ ion should enter into favorable interactions with amide and peptide groups on the protein. Arginyl glutamate provides a striking example of compensation. As seen in Table II, the interaction is that of a salt concentration-independent preferential hydration, but the values are smaller than those characteristic of NaGlu. Glutamate is a protein structure stabilizing agent, as shown for tubulin and the tubulin-colchicine complex (Wilson, 1970), and it enhances tubulin self-association into microtubules (Hamel and Lin, 1981; Hamel et al., 1982). Its strong preferential exclusion from the protein surface evidently can compensate completely for the binding tendency of Arg⁺. Thus, its action is akin to that of the SO4²⁻ ion when coupled to Gua⁺. Hence, the close parallel between the Gua2SO4-GuaCl and ArgGlu-ArgCl pairs. Methylamines such as sarcosine, betaine, and TMAO are known to stabilize the structure of proteins (Arakawa and Timasheff, 1985c) and to act as osmolytes in living systems (Yancey et al., 1982). Their patterns of preferential interactions may, however, be complex. Betaine induces a high level of preferential hydration with a small concentration dependence (Arakawa and Timasheff, 1983). Sarcosine, 1 M, was strongly excluded from lysozyme (Arakawa and Timasheff, 1985c). A study has been carried out on the interactions of TMAO with RNase T1 and the reduced and carboxymethylated form of the enzyme (RCM-T1) (Lin and Timasheff, 1994), which exists in a fully unfolded state (Oobatake et al., 1979; Pace et al., 1988). The preferential interaction values, given in Fig. 6B, show opposite trends with TMAO concentration. Preferential exclusion increases for the native protein and decreases for the unfolded form. While it is not known what forces cause methylamines to be excluded from the surface of proteins, unfolding causes the exposure to solvent both of nonpolar residues and peptide bonds.
3. Linkage Controls of Protein Stabilization
A full understanding of the effect of preferential interactions on protein stability requires a knowledge of (∂m3/∂m2) and Δμ2,tr at both ends of the N ⇌ D equilibrium. Until recently, values of preferential interactions and transfer free energies in stabilizing cosolvents were known only for native proteins, and all inferences on the change in those interactions during the unfolding reaction were based on the statement of the Wyman linkage relation [Eqs. (16) and (17)]. The performance of preferential interaction measurements on the RNase A-sorbitol and RNase A-trehalose systems with the protein in the two end states of the N ⇌ D equilibrium has permitted a complete linkage examination of structure-stabilizing systems with respect to both water and cosolvent of any given concentration as reference states (Xie and Timasheff, 1997a,c). The δΔG° values were determined for both systems via thermal transition experiments. The proper determination of δΔμ2,tr requires that the preferential interactions be measured with both the native and denatured forms of the protein at identical temperature, which means that another solution variable must be different. In both systems, this was
chosen as pH. Thermal transition experiments showed that in the case of sorbitol at 48°C RNase was native at pH 5.5 and denatured at pH 2.0. For trehalose, the same situation was true at 52°C; the protein was native at pH 5.5 and denatured at pH 2.8. The results of the dialysis equilibrium experiments in the two solvent systems at both temperatures and both values of pH are presented in Fig. 8. The demonstration that the interactions of the cosolvents with the native protein were indistinguishable at the acid pH and pH 5.5 at 20°C was the basis for the assumption that the native states of this protein at the two pH values are identical. At the high temperatures, the results obtained for the two solvent systems follow closely parallel patterns. If we take sorbitol as the example, the observations are as follows: at 48°C and at pH 2.0 (denatured protein), the (∂m3/∂m2) values conform to weakly increasing preferential exclusion with cosolvent concentration, accompanied by a sharp decline in preferential hydration; at 48°C and pH 5.5 (native protein), the preferential binding is negative up to 1.3 m sorbitol, above which it becomes positive (i.e., there is a shift from preferential exclusion to preferential binding; in other words, the interaction changes from unfavorable to favorable). This is reflected by the change in sign of (∂m1/∂m2) (Fig. 8B). Integration under the preferential binding results, according to Eq. (19), gave the Δμ2,tr curves of Fig. 8C. For sorbitol, at both pH 2.0 and pH 5.5 and both temperatures (20°C and 48°C), the values are positive at all concentrations, meaning that the interactions are thermodynamically unfavorable (Xie and Timasheff, 1997a). The shift in sign of (∂m3/∂m2) at 48°C for the native protein had no qualitative effect on the stabilization since δΔμ2,tr(D-N) and δ(∂m1/∂m2)(D-N) are positive. As shown in Fig. 9A,
FIG. 8. Preferential interactions of sorbitol and trehalose with RNase A at 20°C and at high temperature at pH 5.5 and low pH. The dialysis equilibrium binding results are presented as preferential binding (A and D) and preferential hydration (B and E). The variations of the transfer free energy, Δμ2,tr, are derived from the preferential binding by integration as described in the text. For both cosolvents the unfavorable interactions are identical at 20°C, pH 5.5 and low pH. At high temperature, the low pH values (denatured protein) display more unfavorable interactions than do the pH 5.5 (native protein) values. It is seen that in 1 M sorbitol stabilization is afforded by an increase in preferential exclusion on unfolding, whereas in trehalose the stabilization is due to lower dialysis equilibrium binding to the denatured protein than to its native form. The interactions relative to water (Δμ2,tr) are unfavorable for both the native and denatured protein with sorbitol at 48°C and favorable for both forms of the protein with trehalose at 52°C. The symbols in all the plots refer to the conditions given in panels A and D (Xie and Timasheff, 1997a,c).
[Fig. 9 panels: (A) 30% sorbitol; (B) 0.7 M trehalose.]
FIG. 9. Thermodynamic boxes of the stabilization of RNase A by (A) 30% sorbitol at 48°C (Xie and Timasheff, 1997a) and (B) 0.7 M trehalose at 52°C (Xie and Timasheff, 1997c). (The values are in kcal/mol.)
in 30% sorbitol at 48°C, the thermodynamic box closes within 0.11 kcal mol⁻¹. A similar situation prevails with the RNase A-trehalose system. The variation of the parameters with sugar concentration for the native and denatured forms of the protein is shown in Fig. 8D, E, and F. In this case, however, at 52°C trehalose is preferentially bound both to the native and denatured protein at all sugar concentrations used. The key observation is that the binding is always greater to the native than to the denatured protein. For example, in 0.5 M trehalose, δ(∂m3/∂m2) = 1.56 - 3.01 = -1.45. The slope of the Wyman linkage plot (log K vs. log a3) at the same conditions is -1.40. Hence, the change in the directly measured preferential binding can account for the trend in the equilibrium. The thermodynamic box presented for 0.7 M trehalose in Fig. 9B closes within 0.21 kcal mol⁻¹ (Xie and Timasheff, 1997c). The closing of the thermodynamic box for both the sorbitol and trehalose systems signifies that the stabilization of RNase A by both agents is due to a strictly nonspecific thermodynamic effect, namely, the
weak interactions of exchanging water and cosolvent molecules with protein loci exposed to solvent in both states of the protein. There is no evidence of any specific reactions such as a local conformational change induced by the cosolvent. The shift in preferential binding in the case of sorbitol from negative at 20°C to positive at 48°C does not have any dramatic significance. It is a trivial reflection of subtle shifts in the relative affinities of water and trehalose for some loci on the protein surface (by less than 0.05 kcal mol⁻¹), defined by the exchange enthalpies at individual sites. Similarly, the finding that at high temperature stabilization is determined by a decrease in preferential binding on denaturation for trehalose and a shift from preferential binding to preferential exclusion for sorbitol, whereas at low temperature stabilization is determined by an increase in preferential exclusion on denaturation, does not imply any change in the mechanism of stabilization. Referring to Fig. 7C and D, we note that the stabilization of RNase A by sorbitol and trehalose at 20°C is an example of case 1, that by trehalose at 52°C represents case 2, and that by sorbitol at 48°C represents case 3. All three conform to the criterion of the Wyman (1964) linkage relation [Eq. (1)] that what matters is not the sign of the preferential binding but the sign of the difference in preferential bindings between the two end states. An interesting system is represented by RNase T1 in GuaHCl (Mayr and Schmid, 1993). This protein is stabilized by 0.1 M GuaHCl (ΔTm = 0.4°C), but it is destabilized at higher GuaHCl concentrations. Although there are no preferential interaction measurements, these observations point to a smaller preferential binding of the salt to the denatured than the native protein at 0.1 M, with the trend reversing itself as the GuaHCl concentration increases and as preferential binding to the denatured form overcomes that to the native protein. This would render this system an example of case 4 of Fig. 7D.
4. Thermodynamics of Preferential Exclusion
A complete understanding of the thermodynamics of denaturation requires knowledge of the transfer enthalpies and transfer entropies. Equation (21) for the change in transfer free energy may be rewritten in terms of the transfer enthalpies, ΔH̄2,tr, and transfer entropies, ΔS̄2,tr, if it is recalled that Δμ2,tr is the change in the partial molal free energy of the protein, ΔḠ2, when the protein is transferred from water to the cosolvent system. Then,
and
The Wyman linkage relation is given by combining Eq. (32) with Eqs. (16) and (18). A complete analysis requires knowledge of the preferential interactions as a function of temperature at a number of cosolvent concentrations for both the native and the denatured states of the protein. Such an analysis has been carried out with the limited data available for the RNase-sorbitol and RNase-trehalose systems (Xie and Timasheff, 1997b,c). Figure 10 gives the variation of the preferential binding as a function of temperature for the 0.5 M trehalose system at pH 2.8 and 5.5. Up to 35°C, the two sets of values coincide and (∂m3/∂m2) remains essentially invariant with temperature. Above that temperature at pH 5.5 (native protein), the preferential binding increases, assuming positive values above 45°C. At pH 2.8, the same upturn takes place, but only at 48°C. At that pH, the protein undergoes the denaturation transition, as
FIG. 10. Temperature dependence of the preferential binding of cosolvent to RNase A in 0.5 M trehalose solution: (0) pH 5.5, native protein; (0) pH 2.8 (the protein is native up to 35°C, where the N → D transition starts; it is in the denatured form above 50°C) (Xie and Timasheff, 1997c).
Tm = 45°C. Hence, at pH 2.8 the values are characteristic of native protein below 35°C and of denatured protein at high temperature. The same pattern was obtained with sorbitol, MgCl2, and MgSO4 (Xie and Timasheff, 1997a,b). The variation of the partial molal enthalpy of the protein with cosolvent concentration was calculated from the temperature dependence of the variation of the partial molal free energy by applying the truncated form of the integrated van't Hoff equation (Glasstone, 1947; Xie and Timasheff, 1997b). The results for native RNase in 30% sorbitol are presented in Fig. 11. Figure 11A shows the van't Hoff plot of the preferential interaction parameter. Figure 11B gives the resulting values of (∂H̄2/∂m3)T,P,m2, which are increasing with temperature with a slope of ca. 2.0 kcal deg⁻¹ (mol protein)⁻¹ (mol cosolvent)⁻¹. Calculation of the transfer enthalpy requires knowledge of (∂H̄2/∂m3)T,P,m2 as a function of concentration, since
This, in turn, requires knowledge of the concentration dependence of (∂μ2/∂m3)T,P,m2 at several temperatures. Approximate values of ΔH̄2,tr and ΔS̄2,tr, calculated from the limited data available, are shown in Figure 11D and F (Xie and Timasheff, 1997b). While these numbers must be regarded as illustrative, they nevertheless indicate that the transfer of RNase A from water into aqueous sorbitol is characterized by positive enthalpy and entropy changes. Similar results were obtained for native RNase in aqueous trehalose, MgSO4, and MgCl2. Gekko and Morikawa (1981) have performed the same measurements for the interactions of sorbitol and glycerol with native BSA. They also obtained positive values of (∂H̄2/∂m3)T,P,m2 and (∂S̄2/∂m3)T,P,m2. The similarity of the values of the transfer parameters is remarkable, since they encompass five systems, namely, a sugar, two polyols, and two salts. The availability of these values, albeit approximate, renders possible an evaluation of the relative contribution of the cosolvents to the enthalpy of denaturation of proteins [see Eq. (31)]. For RNase in 30% sorbitol at 48°C (pH 5.5), the increment on denaturation is δΔH̄2,tr = 11.3 kcal mol⁻¹ (Xie and Timasheff, 1997a), whereas the transfer enthalpy of the native protein is ΔH̄2,tr = 131 kcal mol⁻¹ (from Fig. 11D). Similar values characterize the RNase A-trehalose system (Xie and Timasheff, 1997c). This leads to the conclusion that the increment of the standard enthalpy of denaturation due to the transfer of the protein from water to the cosolvent system is small relative to both the change in the standard enthalpy of denaturation in water and the transfer enthalpy of the native protein from water to the aqueous sorbitol (or trehalose) medium. The small
FIG. 11. Thermodynamics of the preferential interactions of aqueous sorbitol solutions with native RNase A. (A) The van't Hoff plot of the transfer free energy variation with sorbitol concentration: 10% sorbitol (V); 20% sorbitol (V); 30% sorbitol (0); 40% sorbitol (0). (B) Temperature dependence of the variation of the transfer enthalpy with sorbitol concentration at 10% (v), 20% (V), 30% (O), 40% (0) sorbitol. Numbers on
increment in ΔH̄2,tr on denaturation suggests that, on unfolding, either the number of new solvent-protein contacts made is small or the newly formed contacts are characterized by a high degree of compensation of the transfer enthalpies between individual sites.
B. Protein Destabilization
1. Preferential Binding
The principal protein denaturants are urea and guanidine hydrochloride, which induce a random coil state in proteins (Tanford et al., 1967; Nozaki and Tanford, 1967; Tanford, 1968, 1970). Sodium dodecyl sulfate, alcohols, and some other organic solvents induce a transition into structures rich in α-helices (Tanford et al., 1960; Tanford and De, 1961; Inoue and Timasheff, 1968; Reynolds and Tanford, 1970). Some early measurements on the preferential interactions of denaturants with proteins have shown positive values of (∂m3/∂m2) for the interaction of various proteins with 2-chloroethanol and methoxyethanol (Inoue and Timasheff, 1968), GuaHCl with BSA (Noelken and Timasheff, 1967) and with aldolase (Reisler and Eisenberg, 1969), and urea with β-lactoglobulin and chymotrypsinogen (Span and Lapanje, 1973). In their studies of the preferential interactions of 6 M GuaHCl with a number of proteins, Hade and Tanford (1967) have found that the values of preferential binding are very small. They have interpreted these results in terms of the difference in relative affinities of the various amino acid side chains for water and GuaHCl, since in 6 M GuaHCl all of the proteins examined were fully unfolded.
2. Urea
The preferential interactions of urea with proteins have been examined in detail by Lapanje and colleagues (see Span and Lapanje, 1973;
the figure are the slopes, which represent (∂C̄p,2/∂m3)T,P,m2 in cal deg⁻¹ mol⁻¹, where C̄p,2 is the partial molal heat capacity of the protein at constant pressure. (C) Dependence on sorbitol concentration of the transfer enthalpy variation with sorbitol concentration at 20°C and 48°C. The dotted line is the parameter calculated at 48°C for the denatured protein. (D) Dependence of the transfer enthalpy on sorbitol concentration at 20°C and 48°C. The dotted line is the parameter calculated at 48°C for the denatured protein. (E) Temperature dependence of the variation of the transfer entropy with sorbitol concentration at 30% sorbitol. (F) Dependence of the transfer entropy on sorbitol concentration at 20°C and 48°C. [From Xie and Timasheff (1997b). Reprinted with the permission of Cambridge University Press.]
Zerovnik and Lapanje, 1986; Poklar and Lapanje, 1992; and Prakash et al., to be published). Representative values of the interaction parameters are listed in Table II. It is clear that urea can be both preferentially bound to and preferentially excluded from proteins. Thus, for β-Lg and CTGen, (∂m3/∂m2) has positive values at all urea concentrations up to 8 M (Span and Lapanje, 1973; Poklar and Lapanje, 1992). For myoglobin, on the other hand, urea is preferentially excluded at all concentrations (Zerovnik and Lapanje, 1986). A particularly interesting case is found in RNase A, since its interactions shift from preferential hydration to preferential binding. Figure 12 shows the preferential binding of urea to RNase A. The pattern of interactions is complex: (∂m3/∂m2) is negative at low urea concentration; it assumes positive values at higher concentrations, which pass through a maximum before falling to zero at ca. 8.0 M urea. In the case of lysozyme, the preferential binding increases monotonically up to ca. 5 M urea, at which point it seems to reach a plateau value, followed by a weak decline. The complex pattern found with RNase A cannot be attributed to the effect
FIG. 12. Preferential interactions of urea with RNase A (0) and lysozyme (0) at pH 7.0. (A) Preferential binding. (B) Transfer free energy with native protein in water as the point of reference (Prakash et al., to be published).
of denaturation alone. The N ⇌ D transition sets in at 5 M urea (Greene and Pace, 1974). Hence, the observed pattern is characteristic of the native protein. These preferential binding patterns reflect the variation of the transfer free energy with urea concentration shown in Fig. 12B. With native RNase in water as the reference state, it is seen that Δμ2,tr remains positive right up to 9.3 M urea. For the native protein, the unfavorable thermodynamic interaction is an increasing function which reaches a maximum at 3.8 M urea. In the transition region, the measured free energy of interaction is complex, since
where fN and j ,are the fractions of protein in the native and denatured states. The term 6 AGZ(N’D) reflects the fact that the reference state for the experimental values is native protein in water whereas that of Ap&, is denatured protein in water. Above 8 M urea, fN = 0. Since at pH 7.0, 6AG,”(N”D)= 7.7 kcal mol-’ (Greene and Pace, 1974), Ap& = 2.0 - 7.7 = -5.5 kcal mol-’, which indicates a significant favorable interaction, even though the preferential binding, (dm,/dm,) 7;a,,p, = 0. In the case of lysozyme at pH 7.0, the situation is much simpler. The protein is native throughout the measurement. Hence, the positive values of ( drn3/drn2), which signify favorable interaction, represent the variation of A&, which is increasingly favorable with urea concentration. The shape of the interaction parameters of urea with lysozyme may be attributed in large part to the variation of urea nonideality, as the pattern mimics closely a theoretical one calculated by Schellman (1990; see Fig. 4A therein). The RNase pattern cannot be reproduced in terms of the variation of the activity coefficient of urea. It must reflect changes with urea concentration of the exchange affinity at some sites. Lapanje and co-workers have examined the preferential interactions of some alkylureas with /?-Lg (Poklar and Lapanje, 1992) and myoglobin (Zerovnik and Lapanje, 1986). The interesting observations are that with p-Lg as protein, methylurea is preferentially bound, whereas for ethylurea and N,N’dimethylurea preferential binding at low concentrations shifts to preferential exclusion at high concentration. The pattern is just the opposite with myoglobin. The alkyl ureas are preferentially excluded at low concentrations. At high concentration, the interaction is that of preferential binding. For the p-Lg systems, the calculated transfer free energies were found to be negative for urea and all the alkyl ureas. 
Transfer enthalpies measured calorimetrically for the β-Lg system (Lapanje and Kranjc, 1982) were negative for urea and mostly positive for the alkylureas, which gave the same pattern for the transfer
entropies. This was interpreted in terms of the hydrophobic nature of the alkyl groups.

What do these measured values of low positive or negative preferential binding mean in terms of site occupancy? Referring to Eqs. (12) and (13), it is evident that the values of Table II and Fig. 12 are much smaller than the total number of surface sites on the protein molecule, as well as the number of sites with which ligand molecules make contacts. The effective number of ligand molecules that make contact with the protein has been determined in a careful calorimetric titration study by Makhatadze and Privalov (1992) for several proteins in the urea and GuaHCl systems. This contact-detecting technique, being nonthermodynamic in nature, cannot give equilibrium thermodynamic binding, but only effective values of site occupancy, i.e., $B_3$ of Eq. (12). Furthermore, these authors have used classical binding theory to analyze their results, neglecting the concepts of exchange and preferential binding. Therefore, their reported values of $\Delta G$ are only descriptive and their thermodynamic significance is uncertain. Nevertheless, within the assumption that all the exchangeable sites are characterized by identical contact enthalpies and have identical exchange affinities, these results are a descriptive representation of site occupancy. It therefore seems useful to carry this descriptive analysis further by calculating effective values of $B_3$ and $B_1$. The resulting picture can convey a qualitative appreciation of what is occurring, though it cannot uncover the actual physical situation (Schellman, 1994; Schellman and Gassner, 1996). As an illustration, let us take native RNase A at pH 7.0 in 1.0 M urea. The pertinent parameters are as follows: $(\partial m_3/\partial m_2)_{T,\mu_1,\mu_3} = -3.8$ (Fig.
12); the effective total number of exchangeable sites, $n = 122$, and the site occupancy by urea, $B_3 = 7$, are the values deduced by Makhatadze and Privalov (1992) from a Scatchard analysis of their calorimetric data. We must note that the site occupancy and thermodynamic binding stoichiometries carry opposite signs. These values lead to $B_1 = 570$, i.e., 5 water molecules per putative available site, if all are exchangeable. However, if the number of water molecules replaced by one urea molecule is less than 5, e.g., if it is 3, then the protein surface would contain effectively 345 exchangeable water molecules ($B_1^{\mathrm{exch}}$) and 225 nonexchangeable ones ($B_1^{\mathrm{nonexch}}$).
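The arithmetic of this illustration can be retraced in a few lines. The sketch below is only a reading of the numbers, not the author's own calculation. It assumes that Eq. (12), which is not reproduced in this passage, has the usual exchange form $(\partial m_3/\partial m_2)_{T,\mu_1,\mu_3} = B_3 - (m_3/m_1)B_1$, that 1.0 M urea corresponds to roughly 1.05 mol of urea per kg of water, and that each site not occupied by urea holds the stated number of exchangeable water molecules. The function and variable names are hypothetical.

```python
# Illustrative arithmetic only. Eq. (12) is ASSUMED to read
#   (dm3/dm2) = B3 - (m3/m1) * B1
# and the urea molality for a 1.0 M solution (1.05) is an ASSUMED value.

M1_WATER = 55.51  # mol of water per kg of water


def effective_water_occupancy(pref_binding, b3, m3, m1=M1_WATER):
    """Solve the assumed form of Eq. (12) for B1."""
    return (b3 - pref_binding) * m1 / m3


def partition_water(b1_total, n_sites, b3, waters_per_urea):
    """Split B1 into exchangeable and nonexchangeable water.

    Sites not occupied by urea (n_sites - b3) are taken to hold
    `waters_per_urea` exchangeable water molecules each (an assumption
    chosen because it reproduces the numbers quoted in the text).
    """
    exchangeable = waters_per_urea * (n_sites - b3)
    return exchangeable, b1_total - exchangeable


b1 = effective_water_occupancy(pref_binding=-3.8, b3=7, m3=1.05)
print(round(b1))           # close to the 570 quoted in the text
print(round(b1 / 122, 1))  # water molecules per site, roughly 5
print(partition_water(570, n_sites=122, b3=7, waters_per_urea=3))  # (345, 225)
```

With these assumptions the result is sensitive to the molality used for 1.0 M urea; taking $m_3 = 1.0$ instead of 1.05 would raise $B_1$ to about 600.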
3. Urea Denaturation

Keeping in mind the difference between the thermodynamic, exchange concepts of binding and the site occupancy, classical concept, it is possible to explain the fact that urea and GuaHCl are required at very high concentrations to denature proteins. If we take RNase at pH 7.0 as an example, the denaturation occurs with a midpoint at 7.0 M
urea. The slope of the Wyman plot [Eq. (16a)] is $\Delta n_{\mathrm{urea}} = 12$, and the urea contribution to the free energy of denaturation in 7 M urea is $\delta\Delta G^{\circ} = -7.7$ kcal mol$^{-1}$ (Greene and Pace, 1974). Classical analysis, which is still frequently used, would attribute a binding free energy $\delta\Delta G^{\circ}/\Delta n = -0.64$ kcal mol$^{-1}$ to each additional urea molecule bound during unfolding. This would correspond to a classical binding constant of ca. 3 M$^{-1}$ and to the expectation that denaturation should occur between 0.3 and 1 M urea, not at 7 M. The answer to this apparent contradiction is that in classical analysis exchange is neglected. This is illustrated schematically in Fig. 13, which describes the change of surface contacts during any conformational change, including protein unfolding (denaturation). It is shown that the formation of a newly exposed surface or, on the contrary, removal of a surface from contact with solvent is accompanied by changes in contacts with both water and cosolvent on the surface of the protein molecule. The slope of the linkage plot [Eq. (16a)] is the difference between preferential interactions, which is the summation of the changes in contacts depicted in Fig. 13.

Let us take the Wyman linkage relation expressed in terms of site occupancy [Eq. (16b)] and carry out an illustrative calculation. The Makhatadze and Privalov (1992) values of effective site occupancy on denatured and native ribonuclease in 8 M urea give $(B_3^{D} - B_3^{N}) = 41$ additional urea molecules occupying sites on an RNase molecule when it unfolds. This leads to $(B_1^{D} - B_1^{N}) = 131$ additional effective water molecules coming into contact with an RNase molecule. The site occupancy value renders the $\delta\Delta G^{\circ}$ contribution 0.21 kcal mol$^{-1}$ per additional urea molecule "bound," or 0.05 kcal mol$^{-1}$ per additional site available to exchange, if the calculation is done in terms of the one-to-one exchange model.
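The classical estimate discussed above can be checked numerically. The following sketch divides the total urea contribution by the Wyman slope and converts the per-ligand free energy into a binding constant through $K = \exp(-\Delta G/RT)$. The temperature of 25 °C is an assumption, since the passage does not state one, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal mol^-1 K^-1


def classical_per_ligand(delta_g_total, delta_n, temp_k=298.15):
    """Per-ligand free energy and the binding constant it implies.

    temp_k is an ASSUMED temperature (25 C); the text does not give one.
    """
    dg_per_ligand = delta_g_total / delta_n
    k_binding = math.exp(-dg_per_ligand / (R_KCAL * temp_k))
    return dg_per_ligand, k_binding


# Values quoted in the text: -7.7 kcal/mol in 7 M urea, Wyman slope of 12.
dg, k = classical_per_ligand(delta_g_total=-7.7, delta_n=12)
print(f"{dg:.2f} kcal/mol per urea, K = {k:.1f} M^-1")
```

Under these assumptions the constant comes out near 3 M$^{-1}$, whose reciprocal is about 0.3 M, the lower end of the concentration range in which classical analysis would place the unfolding transition.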
[Figure 13: two panels, labeled State I and State II]
FIG. 13. Schematic representation of the redistribution of solvent components on the surface of a protein during a conformational transition (including denaturation). The dashes represent water; the spirals represent cosolvent.
It is evident that the need for a high concentration of urea is caused by the compensation between interactions of the same ligand with different sites on the protein. At each newly exposed site at which additional urea molecules are preferentially bound, they favor destabilization of the protein. On the other hand, newly exposed sites for which urea has little affinity and which are occupied by water make a positive free energy contribution and therefore stabilize the structure. This means that urea molecules act simultaneously both as destabilizers and stabilizers of the protein structure. The direction in which each urea molecule drives the reaction depends on the protein surface locus with which it is interacting. By the additivity principle they compensate each other's action, which leads to the need for a high concentration of denaturant to induce protein unfolding.

4. Guanidinium Salts
The preferential interactions of 6 M GuaHCl with a variety of proteins (Hade and Tanford, 1967; Lee and Timasheff, 1974) were all found to be small, many approaching zero, even though GuaHCl is a strong denaturing agent. The values are listed in Table III. A similar situation is true of proteins in 8 M urea (Prakash et al., 1981). This reflects the exchange with water and a near balance between sites favoring occupancy by GuaHCl and sites favoring water. In an attempt to explore the nature of the sites with which GuaHCl interacts (Lee and Timasheff, 1974), $B_3$ values were calculated with Eq. (12) for the proteins of Table III, with the application of $B_1$ values calculated by the NMR method of Kuntz (1971). The best correlation of $B_3$ was found with the summation of [(total number of peptide bonds/2) + total aromatic amino acids]. This
TABLE III
Preferential Binding of 6 M GuHCl to Proteins at 20°C^a

Protein                       $(\partial g_3/\partial g_2)_{T,\mu_1,\mu_3}$
RNase A                       0.00 (0.00)
Lysozyme                      0.09 (0.09)
Tubulin                       0.10
Chymotrypsinogen              0.15
α-Chymotrypsin                0.17
BSA                           0.06 (0.06)
Carboxypeptidase A            0.05

Protein                       $(\partial g_3/\partial g_2)_{T,\mu_1,\mu_3}$
Lactate dehydrogenase
Catalase
β-Lactoglobulin
Lima bean trypsin inhibitor
α-Lactalbumin
Aldolase
Ovalbumin