NON-COVALENT INTERACTIONS IN PROTEINS
Andrey Karshikoff
Imperial College Press
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by Imperial College Press 57 Shelton Street Covent Garden London WC2H 9HE Distributed by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
NON-COVALENT INTERACTIONS IN PROTEINS Copyright © 2006 by Imperial College Press All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 1-86094-707-7
Printed in Singapore by Mainland Press
To my wife Danuta
Contents

Preface

1. Introduction
  1.1 Some Historical Notes
  1.2 Overview of Protein Structural Elements and Basic Definitions
    1.2.1 The amino acids
    1.2.2 The polypeptide chain
  1.3 Non-covalent Interactions and Structure-Function Relationships in Proteins
    1.3.1 Some comments on Anfinsen's dogma
    1.3.2 Experimental measurements of non-covalent interactions in proteins
  References

2. Van der Waals Interactions
  2.1 Observation of van der Waals Interactions
  2.2 Nature of van der Waals Interactions
    2.2.1 Dispersion forces
    2.2.2 Dipole-dipole interactions
    2.2.3 Dipole-induced dipole interactions
    2.2.4 Repulsive interactions
  2.3 Potential Functions for Application in Proteins
  2.4 Approximation for Polyatomic Systems
  References

3. Hydrogen Bonds
  3.1 Nature of Hydrogen Bonds
    3.1.1 Proton donors, electronegativity
    3.1.2 Proton acceptors
  3.2 Geometry and Strength of Hydrogen Bonds
    3.2.1 Directionality
    3.2.2 Hydrogen bond length
    3.2.3 Hydrogen bond strength
    3.2.4 Hydrogen bond potential functions
  3.3 Hydrogen Bonds in Proteins
    3.3.1 Secondary structure elements
    3.3.2 Hydrogen bonds involving side chains
    3.3.3 Salt bridges
    3.3.4 Hydrogen bond networks
  3.4 Hydrogen Bonds and Protein Stability
    3.4.1 Hydrogen bonds within the polypeptide chain, role in folding
    3.4.2 Hydrogen bonds involving side chain, role in stability
  References

4. Hydrophobic Interactions
  4.1 Nature of Hydrophobic Interactions, Pseudo Forces
  4.2 Water
    4.2.1 Flickering clusters model of water
    4.2.2 Hydrocarbons in water, iceberg model
  4.3 Hydrophobic Effect
    4.3.1 Oil drop in water
    4.3.2 Experimental assessment of hydrophobic interaction
  4.4 Hydrophobic Interactions in Proteins
    4.4.1 Additivity of hydrophobic interactions
    4.4.2 Solvent accessibility
    4.4.3 Evaluation of hydrophobic interactions
    4.4.4 Size of the hydrophobic core
    4.4.5 Hydrophobic packing and packing defects
  References

5. Electrostatic Interactions
  5.1 Debye-Hückel Theory
    5.1.1 Poisson-Boltzmann equation
    5.1.2 Parameter of Debye
    5.1.3 The electrostatic potential of an ion in solution
    5.1.4 Extension for proteins
  5.2 Ion-Solvent Interactions
    5.2.1 Born model
    5.2.2 Application of the Born model for proteins: why do charges tend to be on protein surface?
    5.2.3 Generalised Born theory for proteins
  5.3 Calculation of Electrostatic Interactions in Proteins
    5.3.1 The protein molecule as a dielectric material
    5.3.2 Dielectric model for calculation of electrostatic interactions in proteins
    5.3.3 Numerical solution of the Poisson-Boltzmann equation, finite difference method
    5.3.4 Boundary conditions
    5.3.5 Electrostatic potential calculated by means of the finite difference method
  References

6. Ionisation Equilibria in Proteins
  6.1 Why Does One Need to Know Ionisation Equilibria?
  6.2 Basic Definitions
    6.2.1 Protonation/deprotonation equilibria
    6.2.2 Henderson-Hasselbalch equation
    6.2.3 Degree of deprotonation and degree of protonation
    6.2.4 Ionisation equilibrium constants of model compounds
  6.3 Factors Determining Ionisation Equilibria in Proteins
    6.3.1 Desolvation
      6.3.1.1 Born energy
      6.3.1.2 Calculation of the Born energy
    6.3.2 Interactions with the protein permanent charges
    6.3.3 Definition of intrinsic pK
    6.3.4 Charge-charge interactions
  6.4 Combinatorial Problem
    6.4.1 Solution based on the Boltzmann weighted sum
    6.4.2 Solution based on the Monte Carlo simulation
  6.5 Cooperative Ionisation
  References

7. Conformational Flexibility
  7.1 Allocation Variation of Polar Hydrogen Atoms
    7.1.1 Titratable and pH-sensitive sites
    7.1.2 Microscopic pK
    7.1.3 Population of the microscopic states
  7.2 Examples for pH-Dependent Hydrogen Bonding
    7.2.1 Ionisation properties of Asp76 in ribonuclease T1
    7.2.2 Hydrogen bond rearrangement related to protein function
  7.3 Conformational Flexibility Involving Non-hydrogen Atoms
    7.3.1 Conformations generated by means of molecular dynamics simulation
    7.3.2 Average pK values
    7.3.3 Desolvation and charge-dipole energy compensation
    7.3.4 Dynamics of salt bridges
  References

8. Electrostatic Interactions and Stability of Proteins
  8.1 Definitions
  8.2 Unfolding Induced by pH
  8.3 Modelling of Unfolded Proteins
    8.3.1 Spherical model of unfolded proteins
    8.3.2 Size of the dielectric sphere
    8.3.3 Average distance between charges
    8.3.4 Ionisation equilibria in unfolded proteins
  8.4 Thermal Stability of Proteins
  References

Appendix A Basic Definitions of Thermodynamics and Statistical Thermodynamics
Appendix B Electric Dipoles
Appendix C Solution of Laplace and Poisson-Boltzmann Equation
Index
Preface
This book represents the essential part of the course "Non-covalent Interactions in Proteins: Structure, Stability, Function", held as part of the "Postgraduate Program in Nanobiology and Biological Physics" of Karolinska Institutet, Stockholm. Since Karolinska Institutet is a medical university, one could expect the course to be adapted for students with a background in biological sciences. This is only partially true. Because the course is regularly attended by students from other universities in Stockholm, as well as from Uppsala University and Linköping University, its content is adapted for students with different backgrounds and different interests. Textbooks on the physics of condensed matter consider non-covalent interactions in detail; however, their application to the analysis of protein properties is often poorly presented or missing. On the other hand, books on biochemistry, molecular modelling or molecular simulation introduce these interactions in the context of the corresponding topic, which sometimes leaves little room for explaining their nature. The aim of the present book is to unite the consideration of non-covalent interactions with the specifics of their application in protein science in a single text. This includes comments on the nature of the different interactions and their manifestation in protein properties, derivation of the formulae most frequently used for the analysis of non-covalent interactions in proteins, and the methods for their calculation. Although the derivations of the various formulae can be found in specialised textbooks, here they are presented step by step, sometimes even to a level that might look trivial. The purpose of this is to diminish the unnecessary fear of mathematics that some students have inherited
from their previous education. In this way, the book can be a useful aid for students of biology, biochemistry, or biomedicine who want to extend their knowledge of how protein properties are described at the molecular level. At the same time, the present book can help students of physics or chemistry who are interested in biology and biophysics. Attention is paid to terminology, which is sometimes used differently in the different disciplines of science, leading to ambiguity and misunderstandings. To bring the material closer to the everyday language of biological sciences, and hence to the intuition of the reader, some of the terms do not exactly meet the rigorous canons of physics. Thus, for instance, temperature is given in degrees Celsius, although in thermodynamics the absolute temperature must be used. Hopefully, this can help the inexperienced reader to pay closer attention when reading scientific literature, where the two temperature scales occur with comparable frequency. For the same reasons, the energy units are given in calories (cal/mol or kcal/mol) instead of joules (J/mol or kJ/mol). The literature quoted refers to the works which, to the best knowledge of the author, are pioneering in the corresponding field. Last, but not least, the author would like to acknowledge the stimulation and the sincere support of Prof. Rudolf Ladenstein during the preparation of the material. The author especially thanks Associate Professor Vladimir Pericliev for his valuable help in the preparation of the manuscript.
Andrey Karshikoff
Chapter 1
Introduction
Non-covalent interactions are weak interactions between atoms or molecules where no chemical reaction takes place. Because no formation or breaking of chemical bonds is induced, non-covalent interactions are often called non-bonded interactions. Formally, we distinguish three types of non-covalent interactions. The most common are the van der Waals interactions. They are short range interactions and always occur when two atoms or molecules come close to each other. We define as short range those interactions which become relevant at distances comparable with the size of the interacting atoms. In this way, practically only neighbouring atoms are involved in these interactions. Hydrogen bonds are interactions at the boundary between chemical bonds and non-covalent interactions. They take place between pairs of atoms only if one of them is a proton donor and the other one is a proton acceptor. Electrostatic interactions are the third type of non-covalent interactions. In contrast to the other two types, electrostatic interactions are long range ones. This means that electrostatic interactions are also relevant beyond the limits of the closest neighbours. This makes their description somewhat more complicated. Therefore, special attention will be paid to these interactions.

Proteins became a subject of intensive investigation as a part of colloid chemistry, since a number of their physical properties, such as sedimentation, diffusion, viscosity, light scattering, and many others, are similar to those of colloid particles. Colloid particles are molecular aggregates kept together by the delicate balance of attractive and repulsive forces, all resulting from the non-covalent interactions between the molecules comprising the colloid system. Let us set aside for
the moment all we know about proteins and glance at the molecule presented in Fig. 1.1. This is ribonuclease T1, a small protein which binds and splits ribonucleic acids. The similarity of the molecule to a typical colloid particle is manifested in at least two aspects. First, the molecule looks like an aggregate of atoms. Second, the surface of the molecule is rich in charges; depending on the physical conditions of the solution, the oxygen (red spheres) and nitrogen (blue spheres) atoms may be negatively or positively charged, respectively, or may have partial charges due to delocalisation of their electron clouds. As will be shown below, the formation of the compact body seen in the figure, as well as the exposure of the charges on the surface of the molecule, are governed by the same forces responsible for the formation of colloid particles: namely, the complex action of non-covalent interactions of different types.

Proteins are not colloid particles, but molecules whose properties form the basis on which all known forms of living matter exist. Proteins bind and transport organic and inorganic compounds, in this way regulating physiological processes or catalysing chemical reactions. These properties of proteins are referred to as functional properties. In the lower panel of Fig. 1.1, the complex of ribonuclease T1 with the inhibitor guanylyl-2'-5'-guanosine is shown. As seen there, the molecule of the inhibitor is situated in a cleft formed by the protein. This cleft is the active site, i.e. the site where the substrate ribonucleic acid binds and the catalytic reaction takes place. It has a shape that matches the size and the conformation of the substrate or the inhibitor. In this way, the active site facilitates binding and at the same time makes it specific: compounds with a different chemical composition or in an "inappropriate" conformation do not bind. Another important feature, which is not illustrated in the figure, is that the atoms constituting the cleft create a microenvironment facilitating the catalytic reaction when the substrate binds. Thus, the active sites, as well as the rest of the molecule, are not just aggregates of atoms, as the first glance at the molecule could suggest, but organised structures. Even small changes in this organisation may diminish or terminate the function of the protein molecule.
Figure 1.1 Three-dimensional structure of ribonuclease T1 obtained by X-ray crystallography and deposited in the Protein Data Bank¹. The atoms are represented by spheres corresponding to their van der Waals radii and coloured according to their type: grey (carbon), red (oxygen), blue (nitrogen), and yellow (sulphur, partially seen at the right hand side of the molecule). These colours will be used in all other figures, unless otherwise stated. The hydrogen atoms are omitted. Upper panel: inhibitor free form of ribonuclease T1. Lower panel: complex of ribonuclease T1 with the inhibitor guanylyl-2'-5'-guanosine. The inhibitor molecule is represented by sticks and coloured green in order to make the active site cleft of the protein clearly visible. All colour molecular images are produced using The PyMOL Executable Build (2005), DeLano Scientific LLC, South San Francisco, CA, USA, unless otherwise stated.
1.1 Some Historical Notes

The first idea for the structuring of proteins was given by Gerardus Johannes Mulder. In his famous paper² "Über die Zusammensetzung einiger thierischen Substanzen" ("On the Composition of Some Animal Substances") Mulder investigated the atomic composition of three "albuminous substances", as proteins were then called, noticing that sulphur and phosphorus bind to an organic body with the composition C₄₀₀H₆₂₀N₁₀₀O₁₂₀. He named it protein, from the Greek πρωτειος.
One needs only to substitute E², which can be done by means of Eq. (B.10) deduced in Appendix B:

$$U_{d\text{-}ind} = -\frac{\alpha\mu^{2}}{32\pi^{2}\varepsilon_{0}^{2}r^{6}}\left(3\cos^{2}\theta + 1\right).$$

If the dipoles are free to rotate the above relation becomes

$$U_{d\text{-}ind} = -\frac{\alpha\mu^{2}}{16\pi^{2}\varepsilon_{0}^{2}r^{6}}. \qquad (2.23)$$

Equation (2.23) gives an expression for dipole-induced dipole interactions. We notice that, similarly to dispersion forces and dipole-dipole interactions, the dipole-induced dipole interactions sharply decrease with the distance. The fact that all of these interactions, the
dispersion forces, dipole-dipole interactions and dipole-induced dipole interactions, decrease with the same power of the distance, r⁻⁶, is a good reason to unite them into a single term, namely, the van der Waals interactions. We should repeat here that such a merger is appropriate for liquids. For proteins it is inconvenient because of the restricted freedom of reorientation of the dipoles.

2.2.4 Repulsive interactions

The parameter b in Eq. (2.2) reflects the fact that gas atoms or molecules have a finite size. We have already evaluated the connection of b with the volume belonging to a single atom or molecule which is not available to the rest. This was done assuming that gas molecules are hard spheres. This assumption seems obvious, since two objects cannot occupy the same space at the same time. However, this is true only for macroscopic objects. If the objects of interest are atoms, this principle is violated. The formation of a chemical bond is such a violation, because the electron orbitals overlap. From quantum chemistry we know that this overlap occurs if the electrons have opposite spins, i.e. different quantum numbers. It can be shown that the probability of overlap of electron orbitals with identical quantum numbers is zero. This is known as the Pauli principle: two electrons in a system cannot exist if they have identical quantum numbers. In other words, two electrons with identical quantum numbers cannot occupy the same place at the same time. For instance, helium atoms have in their orbitals two electrons with spin 1/2 and -1/2. According to the Pauli principle no third electron can be introduced, because its quantum numbers would coincide with those of one of the electrons already present. As a result a repulsive force arises.
Figure 2.3 Radial distribution function of the probability of finding the electron at a distance r (in Å) from the nucleus of the hydrogen atom. This function corresponds to n = 1 and l = 0.
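The curve described in the caption of Fig. 2.3 can be reproduced with a short numerical sketch. It assumes the standard textbook expression for the 1s radial distribution function, P(r) = 4r² exp(-2r/a₀)/a₀³, where a₀ is the Bohr radius; this formula and all names in the code are illustrative assumptions and are not taken from the book.

```python
import math

BOHR_RADIUS_ANGSTROM = 0.529177  # a0, Bohr radius in Å (assumed constant)

def radial_distribution_1s(r, a0=BOHR_RADIUS_ANGSTROM):
    """Probability density (per Å) of finding the 1s electron at distance r (Å)."""
    return 4.0 * r**2 / a0**3 * math.exp(-2.0 * r / a0)

# Scan r from 0 to 4 Å on a 0.001 Å grid.
STEP = 0.001
radii = [i * STEP for i in range(4001)]
densities = [radial_distribution_1s(r) for r in radii]

# The most probable distance is the grid point with the highest density.
r_most_probable = radii[densities.index(max(densities))]
# A simple Riemann sum checks that the distribution is (almost) normalised.
total_probability = sum(densities) * STEP

print(f"most probable distance: {r_most_probable:.3f} Å")
print(f"integrated probability up to 4 Å: {total_probability:.4f}")
```

Under this assumption the maximum falls at the Bohr radius, and almost all of the probability lies within a few ångströms of the nucleus, which is why the overlap of electron clouds matters only at short distances.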
The description of repulsive forces is a subject of quantum mechanics and is beyond the scope of this book. For the purposes of our considerations, we need just a brief overview of the origin of the distance dependence of these forces. The possible states that an electron can occupy can be found as solutions of the amplitude equation of Schrödinger [Eq. (2.5)]. The solution of this equation is usually presented as the product of three wave functions in spherical coordinates, R(r)Θ(θ)
of the hydrogen bonds mostly found in proteins, as well as some between atoms with higher electronegativity, are listed in Table 3.1. The hydrogen bond length is mainly determined by the electronegativity of the proton donor and the proton acceptor atoms. Let us take the fluorine atoms as an example. We already know (see Fig. 3.1) that the fluorine atoms have a higher electronegativity than the oxygen and nitrogen atoms. Accordingly, the distance between the fluorine atom as proton acceptor and the hydrogen atom (row 9) is the shortest in comparison with the other hydrogen bond lengths listed in the table. The correlation between the electronegativity and the hydrogen bond length is also seen in the difference between the hydrogen bonds composed of oxygen atoms and those in which one of the partners is a nitrogen atom. The latter are characterised by a larger average hydrogen bond length. One can say that, as a rule, the higher the electronegativity of the atoms, the shorter the hydrogen bond.

The value of r(A···H) also depends on the nature of the functional groups themselves. If, for instance, the functional group containing the proton donor has a deficiency of electron density, the hydrogen bond length is reduced. A reduction of the hydrogen bond length also occurs if the proton acceptor functional group has an excess of electron density. Compare the hydrogen bonds in row 5 of Table 3.1. The hydrogen bond between carbonyl and carboxyl oxygen atoms has a length between 1.64 and 1.70 Å. If the proton acceptor group is replaced by a carboxylate (the deprotonated form of the carboxyl group), the hydrogen bond length is reduced to 1.40 Å. There is, however, an important exception which is worth mentioning. The hydrogen bond between water as acceptor and a carboxyl group as donor (Table 3.1, row 8) has a shorter distance r(O···H) than in the case where the water molecule binds as proton donor to carboxylate, irrespective of the fact that the carboxylate group is negatively charged. One can say that water is a "good" proton acceptor and a "bad" proton donor.

Hydrogen bonds in which the hydrogen atom interacts with one proton acceptor are called two-centre hydrogen bonds. Such are the hydrogen bonds given on the left hand side of Table 3.1, rows 1 to 5. The C=O···H-N hydrogen bonds shown in Figs. 3.8A and 3.8B are also two-centre hydrogen bonds. The carboxylate group can serve as an
example of another type of hydrogen bond, namely the three-centre (bifurcated) hydrogen bond (Fig. 3.11). In this case the hydrogen atom interacts with two proton acceptors. The opposite configuration, in which one proton acceptor binds two hydrogen atoms, is also often observed. An illustration of such a hydrogen bond is given in Fig. 7.12 (Chapter 7). In fact, we have already considered the physical origin of the bifurcated hydrogen bonds when commenting on the geometry of the C=O···H-N binding (Fig. 3.8). The bifurcated hydrogen bonds are described by two distances r(A···H) which are not necessarily equal. They are somewhat longer than those of the two-centre hydrogen bond.

Figure 3.11 Hydrogen bond between carboxylate and amino groups. (A) Three-centre (bifurcated) hydrogen bond. (B) Two-centre hydrogen bond.
3.2.3 Hydrogen bond strength

Hydrogen bonds whose energy of formation exceeds 15 kcal/mol in magnitude (i.e. is more negative than -15 kcal/mol) are commonly considered strong bonds. Moderate, or normal, hydrogen bonds are those with energies between -15 and -4 kcal/mol, whereas hydrogen bonds with a formation energy smaller in magnitude than 4 kcal/mol are treated as weak bonds.

Strong hydrogen bonds are formed by atoms with extreme electronegativity in which the proton donor group is characterised by a deficiency of electron density and/or the proton acceptor partner has an excess of electron density. The deficiency of electron density of the proton donor leads to an additional deshielding of the hydrogen atom nucleus and hence to an increase in the polarity of the D-H bond. The excess of electron density of the proton acceptor leads to an increase of its negative partial charge and in this way facilitates electrostatic interactions with the hydrogen atom. Examples of strong hydrogen bonds are the bonds formed by fluorine atoms. The bond energy of F⁻···H-F is
about -39 kcal/mol. The hydrogen bond angle, θ, observed in different crystal structures, is between 170° and 180°, whereas the hydrogen bond length r(F···H) is between 1.13 and 1.70 Å. If we compare these values with the relation given in Fig. 3.10, we will notice that they fall at the far left hand side of the graph. As a rule, strong hydrogen bonds are characterised by a geometry close to the ideal one and by a hydrogen bond length of less than 1.7 Å. In the literature, hydrogen bonds with r(A···H) < 1.4 Å are called very strong hydrogen bonds.

Water molecules form strong (or very strong) hydrogen bonds when hydrated protons are involved. One of the characteristics of water solutions is the negative logarithm of the hydrogen ion concentration, pH. In fact, this is the concentration of the oxonium (also called hydroxonium or hydronium) ions, H₃O⁺. For instance, the dissolution of hydrochloric acid in water leads to a lowering of pH, or in other words, to an increase in the concentration of oxonium ions: HCl + H₂O → Cl⁻ + H₃O⁺. The oxonium ion is formed by direct binding of the hydrogen ion to one of the lone electron pairs of the oxygen atom of the water molecule. This bonding, called a coordinate bond, is strongly electrostatic. Oxonium ions form water clusters, which can be expressed by H₃O⁺·nH₂O, where n shows the number of water molecules participating in the cluster. The smallest one, H₅O₂⁺, is illustrated in Fig. 3.12.

Figure 3.12 Hydrogen bond between oxonium ion and water. The hydrogen bond length r(O···H) and the covalent bond length r(O-H) are marked.
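The energy ranges given at the beginning of this section can be summarised as a small classification routine. This is only a sketch: the function name is hypothetical, the magnitude of the energy is used so that the sign convention does not matter, and the treatment of values lying exactly on a boundary (15 or 4 kcal/mol) is an assumption, since the text does not specify it.

```python
def classify_hydrogen_bond(energy_kcal_per_mol):
    """Classify a hydrogen bond by the magnitude of its formation energy.

    Thresholds follow the ranges quoted in the text; boundary handling
    (exactly 15 or exactly 4 kcal/mol) is an illustrative assumption.
    """
    magnitude = abs(energy_kcal_per_mol)
    if magnitude > 15.0:
        return "strong"
    if magnitude >= 4.0:
        return "moderate"
    return "weak"

# Formation energies quoted in this section (kcal/mol).
examples = {
    "F-...H-F": -39.0,
    "acetylene...water": -2.19,
}

for name, energy in examples.items():
    print(f"{name}: {classify_hydrogen_bond(energy)}")
```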
The energy of the hydrogen bond between oxonium ion and water molecule in H₅O₂⁺ is about 36 kcal/mol. This energy gradually increases with the increase of n. The hydrogen bond length, r(O···H), varies between 1.22 and 1.34 Å, and is comparable with the length of the covalent bond, r(O-H), which is between 1.10 and 1.22 Å. The similar sizes of the two
bonds suggest that the acceptor and the donor can exchange their roles, or in other words, the proton (the hydrogen ion) can migrate.

Moderate hydrogen bonds are formed between neutral proton donor and acceptor groups. An exception is the ammonium ion, NH₄⁺. As seen from Table 3.1, the hydrogen bonds of NH₄⁺, RNH₃⁺, and R₂NH₂⁺ with carboxylate have hydrogen bond lengths very similar to those formed with carbonyl groups. The >N⁺-H group also forms a moderate hydrogen bond. Moderate, also called normal, hydrogen bonds are characterised by a more varied geometry than the strong hydrogen bonds. The angle θ in normal hydrogen bonds usually adopts values between 180° and 140° and in some cases even below this range. Deviation from the linear configuration is most pronounced in intramolecular hydrogen bonds, as illustrated in Fig. 3.5B. The hydrogen bond length varies between 1.70 and 2.00 Å.

The importance of moderate hydrogen bonds stems from the fact that they are typical for proteins and water. Nowadays, there are experimental methods, such as neutron diffraction, by means of which hydrogen atoms can be localised in protein crystal structures. However, most of the available data still contain no information about the hydrogen atoms, which makes the determination of hydrogen bond geometry in proteins more complicated. If no information about the hydrogen atom positions is available, another parameter is used for the description of hydrogen bonds, namely the distance between the proton donor and the proton acceptor. Some values of this parameter are listed in Table 3.2.

Table 3.2 Average distances between proton donors and proton acceptors in hydrogen bonds most relevant for proteins¹.

Bond          Distance, Å
O-H···O       2.70
O-H···O⁻      2.63
O-H···N       2.88
N⁺-H···O      2.93
N-H···O       3.04
N-H···N       3.10
The data given in Table 3.2 demonstrate one of the basic features of the hydrogen bond, namely that the distances between the atoms linked by hydrogen bonds are shorter than those expected for van der Waals interactions. If we compare the distances given in Table 3.2 with the parameters given in Table 2.2, we shall see that the values of r_m, the distance between two atoms corresponding to the most favourable van der Waals interaction energy, are considerably larger than the distances between two hydrogen-bonded atoms. If we solve Eq. (2.27) for r_ij at U_ij = 0 using the parameters in Table 2.2, for two oxygen atoms we obtain r(U=0) = 3.03 Å. Between two nitrogen atoms this distance is 3.30 Å. As seen, these values are also larger than those given in Table 3.2, meaning that the distances between the proton donor and the proton acceptor are even shorter than those tolerated by van der Waals interactions. The shortening of the proton donor-acceptor distance is more pronounced in the strong hydrogen bonds. The origin of this basic feature of the hydrogen bonds has been considered in Section 3.1.

Weak hydrogen bonds occur if the proton donor has a comparable, yet lower, electronegativity than that of the hydrogen atom. This is the case of the C-H bond, where the carbon atom has a slightly lower electronegativity than the hydrogen atom. From the analysis of the data illustrated in Figs. 3.1 and 3.2 we have concluded that the deshielding of the hydrogen nucleus is negligible in the C-H bond. This is the reason why groups such as -CH₃ and -CH₂- are considered non-polar. On the other hand, we have mentioned that the magnitude of electron abstraction also depends on the chemical nature of the compound containing the X-H bond. Indeed, there is experimental evidence that the proton donor ability of R₃C-H can increase by appropriate substitution of R. Also, deshielding of the hydrogen atom in the C-H bond takes place if the carbon atom has a multiple covalent bond. For instance, it has been shown that acetylene forms a hydrogen bond with water. According to experimental measurements and theoretical calculations, the hydrogen bond of the pair H-C≡C-H···OH₂ has a length between 2.19 and 2.23 Å and a formation energy of -2.19 kcal/mol². Even methane forms weak hydrogen bonds. These bonds are characterised by a marginal energy between 0.2 and 0.8 kcal/mol and a distance between the donor and the acceptor of about 3.5 Å.
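The comparison between van der Waals and hydrogen bond distances made above can be checked with a few lines of code. The sketch assumes that Eq. (2.27) has the common 12-6 form U = ε[(r_m/r)¹² - 2(r_m/r)⁶], for which U = 0 at r = r_m/2^(1/6). The r_m values printed below are therefore implied by this assumption and by the quoted zero-crossing distances; they are not copied from Table 2.2, and all function names are hypothetical.

```python
# Distances at which the van der Waals energy is zero, as quoted in the text (Å).
R_ZERO = {"O...O": 3.03, "N...N": 3.30}

# Average donor-acceptor distances from Table 3.2 (Å) for the matching atom pairs
# (O-H...O and N-H...N).
R_HBOND = {"O...O": 2.70, "N...N": 3.10}

def lj_energy(r, eps, r_m):
    """Assumed 12-6 form: eps * ((r_m/r)**12 - 2*(r_m/r)**6)."""
    return eps * ((r_m / r) ** 12 - 2.0 * (r_m / r) ** 6)

def r_min_from_r_zero(r_zero):
    """For the assumed 12-6 form, U = 0 at r_m / 2**(1/6), so r_m = r_zero * 2**(1/6)."""
    return r_zero * 2.0 ** (1.0 / 6.0)

for pair, r_zero in R_ZERO.items():
    r_m = r_min_from_r_zero(r_zero)
    print(f"{pair}: hydrogen bond {R_HBOND[pair]:.2f} Å"
          f" < r(U=0) {r_zero:.2f} Å < r_m {r_m:.2f} Å")
```

For both atom pairs the ordering is the same: the hydrogen bond distance is shorter than the distance at which the van der Waals energy becomes repulsive, which in turn is shorter than the position of the van der Waals minimum.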
As seen, weak hydrogen bonds are characterised by a low energy which is comparable with that of van der Waals interactions. Accordingly, the distances between the atoms involved in weak hydrogen bonds do not differ from those typical for van der Waals contacts. The main difference between weak hydrogen bonds and van der Waals interactions is the directionality of the former.

3.2.4 Hydrogen bond potential functions

The geometry and the energy of formation of hydrogen bonds can in principle be calculated by means of quantum mechanics. This rigorous approach can be applied for calculations in the gas phase. Using some approximations, quantum mechanical calculations can be carried out for more complicated systems, including proteins. Such calculations, however, cannot cover the whole protein molecule. Similarly to van der Waals interactions, the energetics of hydrogen bonds in proteins is most often evaluated by means of empirical potential functions. The rationale for formulating the hydrogen bond potential functions does not differ from that for van der Waals interactions. This is not surprising, taking into account the fact that the dispersion forces and the overlap repulsion are the dominating forces. The essential difference between the two types of potential functions is that the hydrogen bond potential functions should account for the effect of the rearrangement of the electron clouds of the atoms participating in the hydrogen bond. This effect can be reflected simply by a shortening of the interatomic distances. A function that gives a minimum at an interatomic distance shorter than that of the Lennard-Jones potential [see Eq. (2.26)] is

$$U_{HB} = Cr^{-12} + Dr^{-10},$$

where C and D are parameters depending on the atom pair and r is the hydrogen bond length. In order to meet the above requirement, the term corresponding to the dispersion forces is modified to have an exponent of -10. This function is symmetric and does not take into account the geometry of the hydrogen bonds.
One possible way to take into account the geometry of the triad D-H···A is to introduce the hydrogen bond angle, θ, into the potential function. Such a function is

$$U_{HB} = \cos\theta\left(Ar^{-12} - Br^{-6}\right) + (1-\cos\theta)\left(A'r^{-12} - B'r^{-6}\right), \qquad (3.1)$$

where the parameters A and B correspond to the parameters describing the interactions between atoms i and j in the Lennard-Jones potential:

$$A = \varepsilon r_m^{12}, \qquad B = 2\varepsilon r_m^{6}.$$
The parameters A' and B' are appropriately defined to reflect the shortening of the interatomic distances and the corresponding energy changes between the atoms that can form a hydrogen bond. This function requires four parameters for each atom. One can set A = A' and B = B' for atoms that do not belong to functional groups able to form hydrogen bonds. It is easy to see that in such a case Eq. (3.1) reduces to Eq. (2.26). The dependence of Eq. (3.1) on $\theta$ is illustrated in Fig. 3.13. At $\theta$ = 180°, i.e. at the ideal geometry of the hydrogen bond, the potential function is determined by the second term on the right hand side of Eq. (3.1) reduced by the Lennard-Jones term. As $\theta$ decreases, the contribution of the hydrogen bond term diminishes.

[Plot: vertical axis $U_{\mathrm{HB}}(\theta)$ with the levels $U_{\mathrm{LJ}}$ and $U_{\mathrm{HB}}(\mathrm{max})$ marked; horizontal axis $\theta$ from 0° to 180°.]

Figure 3.13 Dependence on $\theta$ of the function defined by Eq. (3.1). The interval of $\theta$ for the most frequently observed hydrogen bond geometry is marked by arrows.
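The angular dependence of Eq. (3.1) can be sketched in a few lines. The function names and all four parameter values below are hypothetical, chosen only to show the limiting cases discussed in the text; they are not taken from a published parameter set.

```python
import math

def lennard_jones(r, a, b):
    """Lennard-Jones form a / r**12 - b / r**6."""
    return a / r**12 - b / r**6

def u_hb(r, theta_deg, a, b, a_hb, b_hb):
    """Angle-dependent potential of Eq. (3.1).

    a, b       : Lennard-Jones parameters of the atom pair
    a_hb, b_hb : primed parameters (A', B') for hydrogen-bonding atoms
    theta_deg  : hydrogen bond angle in degrees
    """
    cos_theta = math.cos(math.radians(theta_deg))
    return (cos_theta * lennard_jones(r, a, b)
            + (1.0 - cos_theta) * lennard_jones(r, a_hb, b_hb))

# Hypothetical parameters, for illustration only.
A, B = 1000.0, 30.0
A_HB, B_HB = 400.0, 25.0
R = 2.0

for theta in (0.0, 90.0, 180.0):
    print(theta, round(u_hb(R, theta, A, B, A_HB, B_HB), 4))
```

With these definitions the weight of the primed term is 1 - cos θ, so it is largest at 180° and vanishes at 0°, where only the Lennard-Jones term remains.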
Functions of the type of Eq. (3.1) have some disadvantages. The variable $\theta$ is a geometrical parameter which is a result of the interactions
between the atoms. In this context, its introduction is to a certain extent artificial. Modern computational approaches based on empirical functions do not use $\theta$ as a parameter. Instead, the parameterisation of the proton donors, proton acceptors and the polar hydrogen atoms is improved, allowing hydrogen bonds to be described specifically by functions of the type of Eq. (2.26). After adding a term describing electrostatic interactions, and other terms which will be briefly considered in Chapter 7, the empirical potential functions give a sufficiently accurate account of the factors responsible for the variation of the hydrogen bond geometry.

3.3 Hydrogen Bonds in Proteins

It is worth quoting the conclusion of Pauling and Mirsky about the role of hydrogen bonds in the structural organisation of proteins because it was made three decades before the first three-dimensional protein structure became available: "The molecule consists of one polypeptide chain which continues without interruption throughout the molecule (or, in certain cases, of two or more such chains); this chain is folded into a uniquely defined configuration, in which it is held by hydrogen bonds between the peptide nitrogen and oxygen atoms and also between the free amino and carboxyl groups of the diamino and dicarboxyl amino acid residues".³

3.3.1 Secondary structure elements

The hydrogen bonds participating in the stabilisation of the secondary structure elements are those between the peptide N–H and O=C groups. We have already mentioned in Chapter 1 that the atoms in the peptide group are co-planar. Hence, in the trans-conformation, the most common conformation in proteins, the proton donor and the proton acceptor are diametrically opposite (Fig. 3.14).
Figure 3.14 Peptide group: the proton donor and proton acceptor are indicated by arrows.
This peculiarity of the peptide groups, together with the directionality of the hydrogen bonds, leads to a certain organisation if several peptides are linked by hydrogen bonds. In proteins, the intramolecular hydrogen bonds between the peptide groups result in a limited number of conformations of the backbone which we call secondary structure elements. The average values of the parameters of the hydrogen bonds forming the secondary structure elements are given in Table 3.3. Among them we recognise the α-helix and the β-sheets which have already been considered in Chapter 1 (see Fig. 1.3). Another secondary structure element is the 3₁₀ helix. The 3₁₀ helix is observed in short segments of protein backbones. Often 3₁₀ helices have the role of turns at which the polypeptide chain changes its direction. These fragments are also known as β-turns. An example of a β-turn is given in Fig. 3.6.

Table 3.3 Average geometry parameters of the N–H···O=C hydrogen bonds forming secondary structure elements in proteins.

N–H···O=C                  H···O, Å    θ, °    N···O distance, Å
α-helix, main chain        2.06        155     2.99
α-helix, N-terminus        2.25        140     –
α-helix, C-terminus        2.26        152     –
β-sheet, parallel          1.97        161     2.92
β-sheet, antiparallel      1.96        160     2.91
3₁₀ helix                  2.17        153     3.09
β-turn                     2.13        154     3.06
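The three columns of Table 3.3 are geometrically related: in the triangle N, H, O the N···O distance follows from the H···O distance and the angle θ at the hydrogen atom by the law of cosines. The sketch below checks this for the tabulated averages, assuming an N–H bond length of 1.0 Å, which is an assumption of this example and not a value given in the table. Because the table lists averages, only approximate agreement should be expected.

```python
import math

N_H_BOND = 1.0  # angstrom; assumed N-H covalent bond length (not from Table 3.3)

def n_o_distance(h_o, theta_deg, n_h=N_H_BOND):
    """N...O distance from the H...O distance and the N-H...O angle (law of cosines)."""
    theta = math.radians(theta_deg)
    return math.sqrt(n_h**2 + h_o**2 - 2.0 * n_h * h_o * math.cos(theta))

# (element, H...O in angstrom, theta in degrees, tabulated N...O in angstrom)
table_3_3 = [
    ("alpha-helix, main chain", 2.06, 155, 2.99),
    ("beta-sheet, parallel", 1.97, 161, 2.92),
    ("beta-sheet, antiparallel", 1.96, 160, 2.91),
    ("3-10 helix", 2.17, 153, 3.09),
    ("beta-turn", 2.13, 154, 3.06),
]

for name, h_o, theta, tabulated in table_3_3:
    computed = n_o_distance(h_o, theta)
    print(f"{name:26s} computed {computed:.2f}  tabulated {tabulated:.2f}")
```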
The hydrogen bonds in a 3₁₀ helix connect every second peptide group (the O=C group of amino acid residue i accepts the N–H of residue i + 3), whereas in α-helices the hydrogen bonds connect every third peptide group (residues i, i + 4). The geometrical parameters of these hydrogen bonds do not differ substantially. The hydrogen bonds formed at the C-termini of α-helices have a geometry very similar to that of 3₁₀ helices. Accordingly, the conformation of the C-terminal segments of α-helices often corresponds to a 3₁₀ helix.
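The two pairing patterns can be enumerated with a short helper. This is only an illustrative sketch: the function name `hbond_pairs` and the segment length of eight residues are hypothetical, and the code lists which residues are linked without modelling any geometry.

```python
def hbond_pairs(n_residues, offset):
    """Residue pairs (i, i + offset) linked by backbone hydrogen bonds.

    offset = 3 corresponds to the 3-10 helix pattern,
    offset = 4 to the alpha-helix pattern. Residues are numbered from 1.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]

SEGMENT_LENGTH = 8  # hypothetical helical segment

print("3-10 helix :", hbond_pairs(SEGMENT_LENGTH, 3))
print("alpha-helix:", hbond_pairs(SEGMENT_LENGTH, 4))
```

For a segment of n residues the pattern with offset k yields n - k hydrogen bonds, so a short segment forms one more bond as a 3₁₀ helix than as an α-helix.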
Figure 3.15 Cartoon presentation of the membrane protein porin from Rhodobacter capsulatus. The polypeptide chain fragments forming a β-barrel are given as arrows.
The hydrogen bonds forming β-sheets have a geometry somewhat closer to the ideal one. If a β-sheet consists of more than two polypeptide chains, the resulting structure is called a β-pleated sheet (see Fig. 1.3) or a β-barrel (Fig. 3.15). The β-pleated sheets and the β-barrels are characterised by a continuous hydrogen bond chain directed laterally across the polypeptide chains as shown in Fig. 3.16. The hydrogen bonds forming such structures are strengthened by the cooperative effect known as resonance assisted bonding. Due to the high polarisability of the lone pair electron density, the formation of a hydrogen bond between two peptide groups (Fig. 3.16, bond 1) is accompanied by a certain increase
of the O=C bond length. Consequently the electron density of the nitrogen atom shifts so that the C–N bond shortens and becomes "more double" in character. This makes the N–H in bond 2 a better donor.

Figure 3.16 Hydrogen bond crosslink in β-pleated sheets and β-barrels.
It is interesting to note that in β-barrels some of these hydrogen bond chains may close to form a cycle, a factor stabilising the protein structure. The protein shown in Fig. 3.15 is an example of a β-barrel with a number of hydrogen bond chains closed in cycles. The hydrogen bonds in the helical structures do not show this cooperative effect because the hydrogen bond chain is interrupted by the N–Cα and Cα–C bonds.

3.3.2 Hydrogen bonds involving side chains

A number of amino acid side chains have functional groups which can serve as proton donors or proton acceptors. These are the polar and charged amino acids which can easily be recognised in Table 1.1. In Table 3.4 these amino acids are sorted according to their function as proton acceptor or proton donor. This separation is, however, rather formal. The majority of the functional groups can act as both proton donors and proton acceptors. Pure proton donors are the lysine ε-amino groups and the arginine guanidinium groups. The deprotonated form of the carboxyl group (the carboxylate) can be considered a pure proton acceptor. Each of the oxygen atoms in the carboxylate can form two hydrogen bonds (Fig. 3.17). Occupation of the anti positions is relatively seldom observed in hydrogen bonds between side chains. However, all possible positions can participate in hydrogen bonding with water molecules.
[Diagram of a carboxylate group with the syn and anti positions of the proton acceptor sites marked.]

Figure 3.17 Positions of the proton acceptor sites in carboxylates.

Table 3.4 Proton acceptors and proton donors in protein side chains.

Acceptor
    Aspartate, glutamate, C-terminus
    Asparagine, glutamine
    Threonine, serine
    Tyrosine (deprotonated)
    Histidine (deprotonated)
    Aromatic rings: tyrosine, tryptophan

Donor
    Lysine, N-terminus
    Aspartic acid, glutamic acid
    Asparagine, glutamine
    Arginine
    Threonine, serine
    Tyrosine (protonated)
    Histidine
    Tryptophan
If the carboxyl group is protonated it can also act as a proton donor. Its protonation state changes with pH, so that its proton donor properties are pH dependent. The change from proton donor to proton acceptor function applies to all ionisable side chains. The possible impact of this property of the titratable groups on protein functional properties will be considered in Chapter 7. The amino acids classified as aromatic in Table 1.3 can participate in hydrogen bonding via the π-electron cloud of their aromatic rings. In such a hydrogen bond these groups are proton acceptors. This type of bond is not often observed in proteins. However, such bonds can play the role of additional stabilisers of the spatial structure of proteins. Thus, for instance, the hydrogen bond between a tyrosine and an asparagine side chain illustrated in Fig. 3.18 stabilises one turn of an α-helix in the protein named transcription enhancer factor 2.
Figure 3.18 Hydrogen bond between asparagine and the aromatic ring of tyrosine in transcription enhancer factor 2.
3.3.3 Salt bridges

Among the variety of hydrogen bonds, the binding of charged functional groups is of particular interest. This type of hydrogen bond is called a salt bridge. Most often, salt bridges occur between the deprotonated carboxyl groups of the aspartic and glutamic acids, including the C-termini (proton acceptors), and the α-amino group of the N-termini, the ε-amino group of the lysines, the guanidinium group of the arginines and the imidazole group of the
histidines in their protonated form (proton donors). As we have pointed out, the protonation states of these groups depend on pH, so that the formation of salt bridges also depends on pH. The prediction and the effect of this dependency will be considered in Chapters 6 and 7. The deprotonated carboxyl groups have uncompensated electron density, giving a net electric charge of -1 p.u. The proton donors mentioned above have one hydrogen atom whose electric charge is not compensated, i.e. they are characterised by a deficiency of electron density. Following the definition of a strong hydrogen bond, one could suppose that salt bridges are strong hydrogen bonds. On the other hand, comparing the geometry of the strong hydrogen bonds with that formed by carboxylates and ammonium ions (Table 3.1), we see that the latter belong rather to the moderate (normal) hydrogen bonds. Thus, it is important to note that salt bridges are not strong hydrogen bonds.

It would be useful here to introduce some terminology which is often used in the literature. Figure 3.18 shows an example of side chain-side chain hydrogen bonding that links amino acid residues separated along the polypeptide backbone by three peptide groups. If the proton donor and the proton acceptor in a hydrogen bond belong to amino acid residues separated by a few peptide bonds along the polypeptide chain, the bond is considered a local hydrogen bond. Obviously, there are no restrictions on the formation of hydrogen bonds between side chains which are distant along the polypeptide chain. Hydrogen bonds connecting different secondary structural elements in proteins are very common. These hydrogen bonds, i.e. bonds between amino acid residues separated along the backbone by a large number of peptide units, are often called long-range hydrogen bonds. An example of a long-range hydrogen bond is given in Fig. 7.12. This definition should not be confused with long-range interactions.
It clearly follows from our discussion in the previous sections of this chapter that hydrogen bonds are short-range interactions. Thus, it is good to remember that the term "long-range hydrogen bond" is referring to the "distance" along the polypeptide chain. Salt bridges are hydrogen bonds which link charged partners. In this context they differ from the other hydrogen bonds in proteins. Due to electrostatic interactions between the net charges of the proton donor and the proton acceptor, salt bridges include long-range electrostatic
interactions. Therefore, salt bridges are often recognised as ion pairs and are believed to have a role in the electrostatic stabilisation of the native structure of proteins. Again, it is worth noting that salt bridges are ion pairs, but not all ion pairs are salt bridges. Ion pairs with a distance between the proton donor and the proton acceptor of more than 4 Å lose directionality, hence they cannot be considered as hydrogen bonds.

3.3.4 Hydrogen bond networks

Based on the analysis of a large number of three-dimensional structures of proteins it was found that the functional groups tend to fully satisfy their hydrogen bonding potential. This trend is parallel with the fact that the majority of these groups can act as proton donors and proton acceptors simultaneously. Hence, the formation of hydrogen bond networks by the side chains in proteins must be a common feature. We have already analysed one type of hydrogen bond network, namely the chains of hydrogen bonds linking the polypeptide main chains in β-barrels. Among the variety of hydrogen bond networks, those involving salt bridges are of particular interest. Because the proton donor and proton acceptor groups have a net charge of +1 and -1 p.u., respectively, these networks are often called ion clusters. This definition emphasises only the ionic character of the salt bridge networks. We should keep in mind that the hydrogen bond properties, such as the directionality, have a dominating role. Salt bridges often link secondary structure elements, fixing in this way their mutual orientation, i.e. salt bridges are expected to stabilise the three-dimensional structure of proteins. The same is valid for salt bridge networks. Figure 3.19 shows a salt bridge network connecting the two subunits of the protein disulfide oxidoreductase from Pyrococcus furiosus. The network consists of ten functional groups. The total number of bonds is 14, of which six connect the two subunits.
These hydrogen bonds are indicated by green dotted lines in the figure. The rest of the hydrogen bonds are also relevant because they keep the interacting groups in the right positions and orientations, ensuring in this way an energetically favourable, hence stabilising, configuration of the partners. In this context, salt bridge networks are highly cooperative. Indeed, a
number of experimental observations based on site-directed mutagenesis of the protein considered in our example show that the removal of one of the supporting side chains (those that do not make hydrogen bonds between the subunits) leads as a rule to the disturbance of the balance within the network and to its disintegration.

Figure 3.19 (A) Salt bridge network connecting two subunits of disulphide oxidoreductase from Pyrococcus furiosus⁵. The hydrogen bonds connecting the subunits are given in green. (B) Cartoon presentation of the two subunits of disulphide oxidoreductase. The individual subunits are given in different colours. The region of the salt bridge network is indicated.
Large salt bridge networks are most often observed in proteins from hyperthermophilic organisms. The natural environment of these organisms is characterised by a temperature range between 80 and about 100°C. Obviously, the proteins from hyperthermophilic organisms must hold their biologically active three-dimensional structures at these extremely high temperatures. Their counterparts from mesophilic organisms (all plants or animals, for instance) as a rule denature at temperatures around 60°C. The observed increase in the number and the size of salt bridge networks in the proteins from hyperthermophilic organisms becomes even more remarkable when we take into account the fact that the three-dimensional structures of the corresponding counterparts from mesophilic organisms do not differ substantially. One
can conclude that salt bridge networks are a factor stabilising the protein structure at high temperatures. We will consider this hypothesis in Chapter 8.
Figure 3.20 Water channel in alcohol dehydrogenase from Drosophila lebanonensis⁶. The carbon atoms of the substrate NAD⁺ are given in light green. The electron densities of the water molecules (contoured at 1-sigma) are presented in blue. The drawing was kindly provided by Dr. J. Benach, Columbia University, Dept. of Biological Sciences.
Hydrogen bond networks occur not only between the functional groups of the proteins. Water molecules successfully compete to form hydrogen bonds and often participate in hydrogen bond networks. Water molecules can link two or more side chain functional groups via hydrogen bonds. In concave regions of the protein surface, clusters involving hydrogen-bonded water molecules and protein polar groups can be formed. The fact that such clusters are observed in protein crystal structures suggests that the positions of the water molecules are energetically favourable and that they are occupied not only in the crystalline state, but also when proteins are in solution. This is most likely to be true for water clusters that form channels penetrating deep into the protein moiety. Such a water channel is illustrated in Fig. 3.20. It connects the active site of the protein alcohol dehydrogenase (upper part of the figure including the residues Tyr151, Lys155, and the co-factor NAD⁺) with the bulk solution (bottom part). The hydrogen bond network contains nine water molecules which occupy a cleft and form hydrogen bonds between themselves and with the polar groups lining the cleft. From a structural point of view, this hydrogen bond network connects different structural elements of the molecule and in this context it plays a stabilising role. It is also speculated that it can be involved in the catalytic functions of the protein by ensuring access of the active site to the solvent.

3.4 Hydrogen Bonds and Protein Stability

In the previous section we mentioned several times that the formation of hydrogen bonds and hydrogen bond networks has a stabilising impact on protein structure. The direct evaluation of the energetics of a distinct hydrogen bond in the protein molecule is not a straightforward task. Formation or breaking of a hydrogen bond is accompanied by changes of the interactions between other atoms, not participating in the hydrogen bond under consideration.
3.4.1 Hydrogen bonds within the polypeptide chain, role in folding

We have already seen that the formation of secondary structure elements is correlated with the formation of well defined patterns of hydrogen bonds. The question arises: is the formation of hydrogen bonds between the peptide groups a driving force for the folding of the polypeptide chain and its stabilisation? In order to answer this question one needs to show that the energy of the hydrogen bond between the N–H and O=C groups is more favourable in a folded than in an unfolded protein. The solution of this difficult problem can be approached by investigating the hydrogen bonding of a model compound which is as similar as possible to the peptide groups. Also, we should find appropriate environments (solvents) which are as similar as possible to the environment of the peptide groups in folded and unfolded proteins. This set-up represents a primitive modelling of these two states of a protein molecule. A very instructive example of the evaluation of the contribution of hydrogen bonds to the stabilisation of proteins is given in the book of Kozo Hamaguchi "The Protein Molecule"⁷. It is based on experimental measurements of methylacetamide, a compound very similar to the peptide group (Fig. 3.21).

Figure 3.21 Structural formula of methylacetamide, H₃C–NH–CO–CH₃.
Assume that in the unfolded state the peptide groups of the protein are fully hydrated. This means that the N–H and O=C groups are free to form hydrogen bonds between themselves, as well as with the surrounding water molecules. This situation is simulated by a water solution of methylacetamide. The free energy of formation of a hydrogen bond between two methylacetamide molecules in water is $\Delta G_{\mathrm{HB,w}}$ = 3.1 kcal/mol. The positive value of $\Delta G_{\mathrm{HB,w}}$ shows that the formation of a hydrogen bond between two methylacetamide molecules is unfavourable.
It also shows that water molecules successfully compete for hydrogen bonding with the molecules of methylacetamide. In the folded state of the protein molecule the peptide groups are usually inaccessible to the solvent. In this way the competition of water molecules for hydrogen bonding to the N–H and O=C groups is eliminated or substantially reduced. Such a situation can be simulated by replacing water as the solvent with a non-polar solvent, for instance tetrachloromethane (carbon tetrachloride, CCl₄). Experimental measurements of methylacetamide hydrogen bonding in such a solvent give $\Delta G_{\mathrm{HB,non\text{-}polar}}$ = -2.4 kcal/mol. This result shows that a non-polar environment stimulates the formation of a hydrogen bond between the N–H and O=C groups. In terms of our model, the process of folding of a protein molecule can be regarded as a change of the environment of the peptide groups from fully hydrous to anhydrous. Hence, we are interested in the behaviour of the hydrogen bond N–H···O=C upon this change. The free energy change of formation of the hydrogen bond due to the change of its environment, $\Delta G_{\mathrm{fold}}$, can be estimated by means of the thermodynamic cycle shown in Fig. 3.22 and the relation (see Appendix A and Fig. A.1)

$$\Delta G_{\mathrm{HB,w}} + \Delta G_{\mathrm{fold}} + \left(-\Delta G_{\mathrm{HB,non\text{-}polar}}\right) + \Delta G_{\mathrm{tr}} = 0,$$

where $\Delta G_{\mathrm{tr}}$ is the free energy of transfer of methylacetamide from the medium of tetrachloromethane to that of water. In the above expression the term $\Delta G_{\mathrm{HB,non\text{-}polar}}$ is taken with a minus sign, because we have defined it as the free energy of formation of the hydrogen bond, whilst it is the free energy of dissociation that participates in the thermodynamic cycle. The value of $\Delta G_{\mathrm{fold}}$ calculated in this way is 0.62 kcal/mol. This value does not change substantially when tetrachloromethane is substituted by another non-polar solvent. It follows that the stability of the hydrogen bond N–H···O=C does not depend on the polarity of the solvent. Thus, according to this model, the stability of a hydrogen bond between two peptide groups is not substantially influenced by the change of the environment. This result suggests a negative answer to the question posed at the beginning of this section, namely that these hydrogen bonds do not contribute to the spontaneous folding of the protein molecule. In the light of the observation that the secondary structure always arises with the formation of hydrogen bonds between the peptide groups such a
result might seem, to a certain extent, surprising. It might be a consequence of certain shortcomings in the concept of the model. Indeed, the model used for the evaluation of $\Delta G_{\mathrm{fold}}$ is not an exact match of the environment of the peptide groups in the protein molecule. The experimental data used to calculate $\Delta G_{\mathrm{fold}}$ are for dilute solutions. In the polypeptide chain, the peptide groups are forced to be close to each other. This situation resembles a highly concentrated solution rather than a dilute one. Also, possible cooperative effects upon formation of hydrogen bonds in proteins are excluded.
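The bookkeeping of the thermodynamic cycle can be followed numerically. The sketch below uses only the three values quoted in the text (3.1, -2.4 and 0.62 kcal/mol). The transfer free energy is not quoted, so it is back-calculated from the cycle relation as printed; the association constants additionally assume the standard relation K = exp(-ΔG/RT) and a temperature of 298.15 K, both assumptions of this example.

```python
import math

R_KCAL = 1.987e-3      # gas constant in kcal/(mol K)
TEMPERATURE = 298.15   # K; assumed, the text does not state the temperature

DG_HB_WATER = 3.1      # kcal/mol, hydrogen bond formation in water
DG_HB_NONPOLAR = -2.4  # kcal/mol, hydrogen bond formation in CCl4
DG_FOLD = 0.62         # kcal/mol, value quoted in the text

def implied_transfer_energy(dg_hb_w, dg_hb_np, dg_fold):
    """Solve dg_hb_w + dg_fold - dg_hb_np + dg_tr = 0 for dg_tr."""
    return -(dg_hb_w + dg_fold - dg_hb_np)

def association_constant(dg, temperature=TEMPERATURE):
    """Equilibrium constant of hydrogen bond formation, K = exp(-dG / RT)."""
    return math.exp(-dg / (R_KCAL * temperature))

dg_tr = implied_transfer_energy(DG_HB_WATER, DG_HB_NONPOLAR, DG_FOLD)
cycle_sum = DG_HB_WATER + DG_FOLD - DG_HB_NONPOLAR + dg_tr

print(f"implied transfer free energy: {dg_tr:.2f} kcal/mol")
print(f"sum around the cycle: {cycle_sum:.2e}")
print(f"K in water: {association_constant(DG_HB_WATER):.3g}")
print(f"K in CCl4 : {association_constant(DG_HB_NONPOLAR):.3g}")
```

The sign of each ΔG decides on which side of 1 the association constant falls, which is the quantitative content of the statement that water disfavours and a non-polar solvent favours the hydrogen bond between two methylacetamide molecules.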
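[Figure 3.22: thermodynamic cycle connecting the free and the hydrogen-bonded N–H and O=C groups in water and in CCl₄, with the steps $\Delta G_{\mathrm{HB,w}}$, $\Delta G_{\mathrm{fold}}$ and $\Delta G_{\mathrm{tr}}$.]

Transfer           $\Delta H_t$ (kcal/mol)   $\Delta S_t$ (cal/mol/deg)   $\Delta G_t$ (kcal/mol)
CCl₄ → water       -2.4                      -19                          3.3
CCl₄ → water       -1.7                      -18                          3.7
C₂H₆ → water       0                         -14                          4.1
C₅H₁₂ → water      -0.5                      -25                          6.8
C₆H₁₄ → water      0                         -26                          7.7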
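The three tabulated columns are linked by ΔG = ΔH - TΔS, and the entropy term can be seen to dominate the positive free energy of transfer. The sketch below recomputes the free energy from the tabulated enthalpy and entropy, assuming T = 298.15 K (an assumption of this example; the text only refers to standard conditions). The rows are given without labels because only the numerical columns are used.

```python
TEMPERATURE = 298.15  # K; assumed value for "standard conditions"

def gibbs_energy(dh_kcal, ds_cal, temperature=TEMPERATURE):
    """dG = dH - T dS, with dH in kcal/mol and dS in cal/mol/deg."""
    return dh_kcal - temperature * ds_cal / 1000.0

# (dH in kcal/mol, dS in cal/mol/deg, tabulated dG in kcal/mol)
transfer_rows = [
    (-2.4, -19, 3.3),
    (-1.7, -18, 3.7),
    (0.0, -14, 4.1),
    (-0.5, -25, 6.8),
    (0.0, -26, 7.7),
]

for dh, ds, dg_tabulated in transfer_rows:
    entropy_term = -TEMPERATURE * ds / 1000.0
    dg = gibbs_energy(dh, ds)
    print(f"dH = {dh:5.1f}  -T dS = {entropy_term:5.2f}  "
          f"dG = {dg:5.2f}  (tabulated {dg_tabulated})")
```

In every row the enthalpy is zero or negative, so the unfavourable free energy comes entirely from the -TΔS term.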
Based on these data we can draw some conclusions about the origin of hydrophobic interactions. The interactions which are characteristic of the molecules in the system are electrostatic interactions (also in hydrogen bonding) and dispersion forces (in van der Waals interactions and in hydrogen bonding). The association of the non-polar molecules, which is the essence of hydrophobic interactions, is a result of the tendency of the system to increase its entropy. Thus, hydrophobic interactions are an effect which results from the behaviour of the system, hence they are pseudo forces. There are no hydrophobic interactions between two molecules out of the context of a system. Therefore, it is correct to speak about the hydrophobic effect, rather than about hydrophobic interactions. Since the latter term is commonly accepted, we will specify it as follows: the term "hydrophobic interactions" refers to the tendency of non-polar molecules to associate in an aqueous medium. Our conclusions were drawn on the basis of data obtained at standard conditions. Both $\Delta H_t$ and $\Delta S_t$ increase with temperature. However, in the temperature interval we are interested in, namely the interval within which biological processes occur, the change of the free energy is both small and positive. This means that in this interval the solubility of hydrocarbons decreases slightly with temperature. Accordingly, the hydrophobic effect slightly increases.

4.4 Hydrophobic Interactions in Proteins

Due to the hydrophobic effect, the non-polar side chains avoid contact with water and tend to assemble close to each other. As a result, the polypeptide chain collapses so that the hydrophobic residues form the hydrophobic core of the protein molecule. It should be noted that this process occurs under certain conditions, such as appropriate temperature and pH, lack of denaturing co-solvents, etc. We assume that these conditions are fulfilled.
A simplified presentation of a protein built by two types of amino acids, polar and non-polar (hydrophobic) is shown in Fig. 4.6. In the unfolded state of the protein molecule all amino acids are accessible to
the solvent, which, as usual, is water. Due to their low solubility in water the non-polar groups tend to collapse. Thus, the unfolded protein chain spontaneously folds so that the hydrophobic amino acid residues have minimum contact with water. The polar residues, on the contrary, are soluble in water, so they tend to stay on the protein surface, forming hydrogen bonds between themselves and with the surrounding water molecules. This organisation of the polar and hydrophobic amino acid residues in a folded protein is illustrated in the right hand side of Fig. 4.6. It resembles a clathrate structure around dissolved hydrocarbons. The elements of the "protein clathrate shells" are also parts of the molecule, ensuring in this way favourable interactions with the solvent, solubility and stability of the protein molecule. Here emerges the important role of the hydrogen bond networks, including amino acid side chains and water molecules, which we have considered in Section 3.3.4. One of the features of the hydrogen bond networks on the protein surface is the stabilisation of the polar shell insulating the hydrophobic core of the protein molecule. It should be noted that this picture is to a certain extent idealised. It would be incorrect to regard the hydrogen bond networks on the protein surface as a nutshell protecting its hydrophobic "kernel". A significant area of the protein/solvent interface is hydrophobic, as we shall see below.

Figure 4.6 Simplified illustration of unfolded (A) and folded (B) protein molecule. Hydrophobic and polar amino acids are presented as circles and ellipses, respectively.
The picture will be incomplete if we do not mention the membrane proteins. The hydrophobic effect is clearly manifested in this class of proteins. The X-ray structures of membrane proteins show that the amino acid side chains which are in contact with the aliphatic moiety of the
membrane are hydrophobic. The parts of the protein molecule protruding out of the membrane have the characteristics of the water soluble proteins: a hydrophobic core surrounded by a shell of polar amino acid side chains.

4.4.1 Additivity of hydrophobic interactions

It is desirable to have an expression that gives a quantitative measure of the magnitude of hydrophobic interactions in proteins. Because hydrophobic interactions appear as a result of the behaviour of the system, we need to investigate how the free energy of the system changes upon the formation of the hydrophobic core. That is, we need to evaluate the contribution of the hydrophobic interactions, $\Delta G_h$, to the free energy of transition of the system from state A to state B (panels A and B of Fig. 4.6, respectively). Direct measurements of $\Delta G_h$ cannot be done. Measurements of the transition from state A (unfolded protein) to state B (folded protein), or vice versa, can be performed; however, the energy obtained by such experiments is the free energy of folding, $\Delta G^{U \to F}$, or the free energy of unfolding, $\Delta G^{F \to U}$, respectively. These quantities are not of interest at the moment. In order to estimate $\Delta G_h$ we will use a model for which Eqs. (4.3) and (4.4) are applicable. Let us employ the approximation used in the evaluation of the role of the hydrogen bonds between the peptide groups in protein stability (Section 3.4.1). We assume that in the unfolded state the amino acid side chains are fully hydrated, whereas in the folded state they are immersed in the protein interior and completely inaccessible to the solvent (water). As before, the protein interior is represented as a non-polar material. This approximation is very rough; however, it allows us to use experimental data on the solubility of amino acids in water and non-polar solvents. The connection between solubility and the energy of transfer from a non-polar solvent to water is given by Eq. (4.4).
The values of the energies of transfer, $\Delta G_t$, of several amino acids are listed in Table 4.3. The transfer energies of the individual amino acids are negative, reflecting the fact that they are soluble in water. Nozaki and Tanford⁵ have assumed that the free energy of transfer can be split into two additive parts: the free energy of transfer of the glycine and that of the
side chain. The latter is denoted by $\Delta g_t$ and is called the hydrophobicity of the side chain. The hydrophobic side chains of the amino acids listed in the table have positive values of $\Delta g_t$, which corresponds to their expected low solubility in water. If the hypothesis of additivity of the free energy of transfer is valid, additivity should be applicable to any constituents of the amino acid, not only to the main chain and side chain parts. Indeed, the difference between the transfer energies of methane and ethane is equal to the difference between glycine and alanine: 0.73 kcal/mol. The difference in the chemical composition in both cases is just a CH3 group, indicating that the hypothesis of additivity holds. There are also other experimental observations supporting the hypothesis of additivity. On this basis we can partition the free energy of transfer of the individual amino acids and consider only those components which are involved in hydrophobic interactions.

Table 4.3 Free energy of transfer $\Delta G_t$ and hydrophobicity $\Delta g_t$ for several amino acids.

Amino acid        $\Delta G_t$ (kcal/mol)   $\Delta g_t$ (kcal/mol)
Glycine           -4.63                     0
Alanine           -3.90                     0.73
Valine            -2.94                     1.69
Leucine           -2.21                     2.24
Isoleucine        -1.69                     2.97
Phenylalanine     -1.98                     2.65
Proline           -2.09                     2.60
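The additivity assumption can be made concrete by subtracting the glycine transfer energy from that of each amino acid. The sketch below does this with the values of Table 4.3 as printed; the dictionary and function names are hypothetical. With these numbers the differences reproduce the tabulated hydrophobicity exactly for alanine, valine and phenylalanine, and to within about 0.2 kcal/mol for leucine, isoleucine and proline.

```python
# Free energy of transfer (kcal/mol), values as printed in Table 4.3.
transfer_energy = {
    "Glycine": -4.63,
    "Alanine": -3.90,
    "Valine": -2.94,
    "Leucine": -2.21,
    "Isoleucine": -1.69,
    "Phenylalanine": -1.98,
    "Proline": -2.09,
}

# Hydrophobicity of the side chain (kcal/mol), values as printed in Table 4.3.
tabulated_hydrophobicity = {
    "Glycine": 0.0,
    "Alanine": 0.73,
    "Valine": 1.69,
    "Leucine": 2.24,
    "Isoleucine": 2.97,
    "Phenylalanine": 2.65,
    "Proline": 2.60,
}

def side_chain_hydrophobicity(amino_acid):
    """Side chain contribution: transfer energy minus that of glycine."""
    return transfer_energy[amino_acid] - transfer_energy["Glycine"]

for name in transfer_energy:
    computed = side_chain_hydrophobicity(name)
    print(f"{name:14s} computed {computed:5.2f}  "
          f"tabulated {tabulated_hydrophobicity[name]:5.2f}")
```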
4.4.2 Solvent accessibility

The above finding is appropriate for estimation of the hydrophobic interactions if the amino acids are fully immersed in the protein interior (folded state) or fully accessible to water (unfolded state). We know, however, that there are hydrophobic side chains in proteins which are not completely buried. For these cases, the model we are using is inadequate. We will refine it, using another important observation. As we have already noticed, the formation of oil drops in water, or the assembly of hydrocarbons into aggregates, is accompanied by a reduction of the interface area between the solute and water. It is interesting to see whether there is a correlation between the observed tendency of
reduction of solvent accessibility of the hydrocarbon aggregates and the magnitude of the hydrophobic interactions. A large number of experimental measurements of solubility and transfer energy of hydrocarbons with different lengths have convincingly shown that there is such a correlation. It turns out that hydrophobicity linearly depends on the solvent accessible surface of the hydrocarbon molecules. This allows us to introduce a specific quantity, Afh, corresponding to the transfer energy per unit solvent accessible area. The value of Afh is between 19 and 28 cal/mol/A2, depending on the estimates of the solvent accessible surface. 3.0
[Plot: hydrophobicity (vertical axis, 0.5 to 3.0) versus solvent accessibility surface in Å² (horizontal axis, 100 to 200); labelled points include ile and leu.]
Figure 4.7 Hydrophobicity versus solvent accessible area of the hydrophobic amino acids. Solid circles: data from Table 4.3; open circles: data from Fauchere and Pliska6.
This linearity is also observed for the hydrophobicity of the amino acid side chains. Fig. 4.7 shows the relation between $\Delta g_t$ of the hydrophobic amino acids and their solvent accessible surface. The slope, $\Delta f_h$ = 22 cal/mol/Å², falls within the range 19-28 cal/mol/Å² found for other hydrocarbons. Figure 4.7 also shows that data obtained by different experimental approaches can differ. This is clearly seen for proline and for the pair leucine and isoleucine. The difference can be explained by the influence of the α-amino and carboxyl groups, which are charged under the experimental conditions. The experiments (open circles in Fig. 4.7) performed with amino acids having these groups
Hydrophobic Interactions
blocked (acetyl-X-amide, where X is the amino acid side chain) eliminate this influence (see also Table 4.4 for a comparison of $\Delta g_t$ obtained by the two methods). Leucine and isoleucine differ in the position of branching of their side chains, and hence in its distance to the charged groups. After eliminating this influence the divergence in $\Delta g_t$ is reduced. A linear dependence of $\Delta g_t$ on the solvent accessibility surface area is also observed for the side chains containing a hydroxyl group. The slope in this case is $\Delta f_h$ = 26 cal/mol/Å². Thus, one can conclude that the linear relation between $\Delta g_t$ and the solvent accessibility surface holds. Based on the results of the above analysis, we are now able to construct an expression relating $\Delta g_t$ and the solvent accessibility area.

Table 4.4 Solvent accessibility7 (in Å²) and hydrophobicity5,6, $\Delta g_t$ (in kcal/mol), of the amino acids. A dash marks an empty cell.

Residue  Total  Side chain  Non-polar  Polar  $\Delta g_t$ (ref. 5)  $\Delta g_t$ (ref. 6)
ala      113    67          67         -      0.5                    0.4
arg      241    196         89         107    -                      -1.4
asn      158    113         44         69     0.0                    -0.8
asp      151    106         48         58     0.5                    -1.1
cys      140    104         35         69     -                      2.1
gln      189    144         53         91     -0.1                   -0.3
glu      183    138         61         77     0.5                    -0.9
gly      85     -           -          -      -                      -
his      194    151         102        49     0.5                    0.2
ile      182    140         140        -      3.0                    2.5
leu      180    137         137        -      1.8                    2.3
lys      211    167         119        48     -                      -1.4
met      204    160         117        43     1.3                    1.7
phe      218    175         175        -      2.5                    2.4
pro      143    105         105        -      2.6                    1.0
ser      122    80          44         36     -0.3                   -0.1
thr      146    102         74         28     0.4                    0.4
trp      259    217         190        27     3.4                    3.1
tyr      229    187         144        43     2.3                    1.3
val      160    117         117        -      1.5                    1.7
We already know that the energy of transfer of amino acid side chains is additive and proportional to the surface area exposed to the solvent. This proportionality is linear with an average slope, $\Delta f_h$, of about 24 cal/mol/Å². If we know the area,

$$\Delta SA_h = SA_h^{f} - SA_h^{u},$$

of the hydrocarbon constituents of the side chains that becomes inaccessible to the solvent upon folding of the protein (the transition from state A to state B illustrated in Fig. 4.6), where the superscripts $f$ and $u$ denote the folded and the unfolded state, we can calculate $\Delta G_h$ using the simple relation

$$\Delta G_h = \Delta f_h \, \Delta SA_h. \qquad (4.5)$$

Obviously $\Delta SA_h < 0$, so that $\Delta G_h < 0$, in accordance with the fact that the burial of hydrophobic groups in the protein interior is a favourable process.
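Eq. (4.5) is simple enough to evaluate in a few lines. The sketch below is only an illustration: the function name and the unit conversion are our own choices, and the input areas are the values quoted in Section 4.4.3 for the two domains of human γ-interferon (786 and 811 Å² folded, 5202 Å² unfolded) together with the slope of 22 cal/mol/Å² used there.

```python
def hydrophobic_free_energy(sa_folded, sa_unfolded, delta_f_h=24.0):
    """Evaluate Eq. (4.5): dG_h = df_h * (SA_folded - SA_unfolded).

    Areas are in square angstroms, delta_f_h in cal/mol/A^2;
    the result is returned in kcal/mol.
    """
    delta_sa = sa_folded - sa_unfolded      # negative when area is buried
    return delta_f_h * delta_sa / 1000.0    # cal/mol -> kcal/mol


# Values quoted in Section 4.4.3 for the two domains of human gamma-interferon.
dg_first = hydrophobic_free_energy(786.0, 5202.0, delta_f_h=22.0)
dg_second = hydrophobic_free_energy(811.0, 5202.0, delta_f_h=22.0)

print(round(dg_first, 1), round(dg_second, 1), round(dg_first + dg_second, 1))
# prints: -97.2 -96.6 -193.8
```

The default slope of 24 cal/mol/Å² is the average value mentioned above; it is passed explicitly here because the example considers pure hydrocarbon side chains only.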
Figure 4.8 Solvent accessibility surface (left) and solvent contact surface (right).
The task that remains to be solved is the calculation of the solvent accessible area of the unfolded, $SA_h^{u}$, and of the folded, $SA_h^{f}$, states of the protein. The solvent accessible surface is defined as the area described by the centre of a spherical solvent molecule which rolls over the solute molecule (left-hand panel of Fig. 4.8). Solvent accessible surface is a purely geometrical term, therefore we are interested only in the size and mutual disposition of the atoms. Other properties, such as charge distribution or ability for hydrogen bonding, are ignored. In our case, the solvent molecule is water. Usually, the radius of the sphere representing the water molecule is taken to be 1.4 Å. The shape of the solute molecule is determined by the van der Waals radii of the individual atoms (see end of Section 2.3 for definition). Solvent accessible surface should be
distinguished from the solvent contact surface. The latter is the area determined by the van der Waals envelope of the solute (right-hand panel of Fig. 4.8).

Two issues should be borne in mind when SA is to be calculated. The first one concerns the van der Waals radii. There are no rigorous rules to follow when choosing van der Waals radii. A few sets of van der Waals radii are given in Table 4.5. One peculiarity of the radii listed in the table is that they are not explicit van der Waals radii: they are defined so that they take into account the hydrogen atoms bound to the "main" atom. For instance, the carbonyl oxygen in the data set of Getzoff has a radius of 1.40 Å, whereas the hydroxyl oxygen atom is somewhat larger, reflecting the presence of a bound hydrogen atom. Atoms with radii accounting for hydrogen atoms are called united atoms. United atoms are very useful because the majority of protein structural data do not contain information about the hydrogen atoms.

Table 4.5 Van der Waals radii used for SA calculations.
Atom types (rows): Carbon, not specified; Tetrahedral carbon; Trigonal carbon; Nitrogen, not specified; Tetrahedral nitrogen; Trigonal nitrogen; Oxygen, not specified; Oxygen (carbonyl); Sulphur, not specified; Sulphur (SH)
Ref. 8 1.80 1.70 (Ca) 1.80 1.80 1.55 1.80 1.52 1.8
Ref. 9 1.87 1.76 1.50 1.65 1.40 1.85
Ref. 10 2.00 1.86 (CH) 1.74 2.00 1.80 1.70 1.40 1.60 (OH) 1.80 1.85
Ref. 11 1.87
Ref. 12 2.0 1.7-1.86
1.65
1.40 1.85
2.0 1.5-1.7 1.4 1.5 (OH) 1.85 2.0
The second important point that should be noted is that SA is calculated for a fixed structure of the protein molecule. For that reason Lee and Richards8, the authors of the first algorithm for calculation of SA, called this quantity "static solvent accessibility". Usually, the fixed conformation is that of the protein crystalline state. In the unfolded state, the number of conformations that the main and the side chains can adopt is huge. According to the model of unfolded state we have accepted, the amino acid side chains are completely
hydrated. This reduces the complexity arising from the large number of conformations, because complete hydration corresponds to the extended conformations of both the amino acid side chains and the polypeptide backbone. Hence, the values of solvent accessibilities of the individual amino acids can be obtained by calculations based on a single conformation. Because the solvent accessibility of the side chains is of interest, the backbone is usually modelled by the tripeptide gly-X-gly in extended conformation, where X is an amino acid of a given type. This is another simplification of the task, because once calculated, the values of SA for the different types of amino acids can be tabulated (Table 4.4) and used to calculate $SA_h^{u}$ for all proteins. The value of $SA_h^{u}$ is just the sum of the solvent accessibility surfaces of the individual amino acid side chains according to the protein sequence.

The solvent accessible area of the individual side chains in the folded state depends on their environment and, of course, in order to calculate $SA_h^{f}$ the three-dimensional structure of the protein must be known. The calculation of the solvent accessibility of the individual atoms in folded proteins can be performed using a very simple scheme. In Fig. 4.9, two atoms and their solvent accessible surfaces are shown as two pairs of concentric spheres. The inner spheres represent the atoms, whereas the outer spheres represent their solvent accessible surfaces. The radius of sphere A, $R = r_{vdw} + r_{water}$, is the sum of the van der Waals radius of the atom (the inner sphere) and the radius of the water molecule. The radius of sphere B is determined in the same way. The two radii can differ if the van der Waals radii of atoms A and B differ. We can imagine that a large number of points are uniformly distributed on the surface of the outer spheres. To each point belongs a certain area of the sphere surface, $\delta SA = 4\pi R^2/n$, where $n$ is the total number of points distributed on one sphere.
If the two atoms are at a distance at which a water molecule cannot be situated between them, the outer spheres overlap. All points on the overlapping parts of the spheres are then inaccessible to water. This gives a simple criterion for solvent inaccessibility (or alternatively, for solvent accessibility) of the points. Let us consider two points on sphere B. If the distance between a point and the centre of the sphere A is
less than $R$, the point is buried in the interior of sphere A and is hence inaccessible to the solvent. This is the case for point $i$ of sphere B, for which $d_{c-i} < R$. For this point $\delta SA_i = 0$. Point $j$ is accessible because $d_{c-j} > R$, and $\delta SA_j = 4\pi R^2/n$. The solvent accessible area of atom B is then

$$SA_B = \sum_k \delta SA_k.$$
The same procedure can be applied to a set of many atoms, for instance the atoms of a protein with known three-dimensional structure. The only technical difference is that the criterion for solvent accessibility has to be checked for more than one neighbour. The total solvent accessibility is then the sum of the accessibilities of the individual atoms.
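The point-counting scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not an optimised implementation: the function names, the golden-spiral placement of the points, the default of 2000 points per sphere and the two-atom test geometry are all our own assumptions; only the accessibility criterion and the area per point, $4\pi R^2/n$, come from the text.

```python
import math


def sphere_points(n):
    """Return n roughly uniformly distributed unit vectors (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n          # evenly spaced in z
        rho = math.sqrt(1.0 - z * z)
        phi = k * golden_angle
        points.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return points


def accessible_areas(atoms, r_water=1.4, n=2000):
    """Static solvent accessible area of each atom, in square angstroms.

    atoms: list of (x, y, z, r_vdw) tuples, coordinates and radii in angstroms.
    A point on the outer sphere of one atom is buried if it lies closer to the
    centre of any other atom than that atom's outer radius r_vdw + r_water.
    """
    unit = sphere_points(n)
    areas = []
    for a, (xa, ya, za, ra) in enumerate(atoms):
        big_r = ra + r_water
        d_sa = 4.0 * math.pi * big_r ** 2 / n   # area belonging to one point
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xa + big_r * ux, ya + big_r * uy, za + big_r * uz
            buried = False
            for b, (xb, yb, zb, rb) in enumerate(atoms):
                if b == a:
                    continue
                limit = rb + r_water
                if (px - xb) ** 2 + (py - yb) ** 2 + (pz - zb) ** 2 < limit ** 2:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(exposed * d_sa)
    return areas


# Two identical united atoms (r_vdw = 1.8 A) whose centres are 3.0 A apart.
pair = accessible_areas([(0.0, 0.0, 0.0, 1.8), (0.0, 0.0, 3.0, 1.8)])
print(sum(pair))
```

For two equal spheres the buried part of each is a spherical cap, so the result can be compared with the analytical value $2\pi R^2 + \pi R d$ per atom, where $d$ is the distance between the centres. The loop over all neighbours is quadratic in the number of atoms; practical programs restrict the check to nearby atoms.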
Figure 4.9 Calculation of solvent accessibility.
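For the unfolded state no structure is needed: as noted above, the accessible area is a sum of tabulated values over the sequence. The following sketch uses the non-polar side chain areas of Table 4.4. The function name and the short example sequence are hypothetical, and assigning zero to glycine (which has no entry in that column) is our assumption.

```python
# Non-polar side chain areas (square angstroms) from Table 4.4.
NONPOLAR_SIDE_CHAIN_AREA = {
    "ala": 67, "arg": 89, "asn": 44, "asp": 48, "cys": 35,
    "gln": 53, "glu": 61, "his": 102, "ile": 140, "leu": 137,
    "lys": 119, "met": 117, "phe": 175, "pro": 105, "ser": 44,
    "thr": 74, "trp": 190, "tyr": 144, "val": 117,
    "gly": 0,  # assumption: no side chain, so no contribution
}


def unfolded_nonpolar_area(sequence):
    """Sum the tabulated non-polar side chain areas over a sequence.

    sequence: iterable of three-letter residue codes (case-insensitive).
    Raises KeyError for residue names that are not in the table.
    """
    return sum(NONPOLAR_SIDE_CHAIN_AREA[res.lower()] for res in sequence)


# Hypothetical four-residue fragment, used only to show the bookkeeping.
print(unfolded_nonpolar_area(["Ala", "Val", "Leu", "Gly"]))  # prints 321
```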
4.4.3 Evaluation of hydrophobic interactions

We already have all the tools needed to evaluate the energy contribution of hydrophobic interactions to the stability of a protein molecule. We shall make this evaluation using the molecule of human γ-interferon as an example (Fig. 4.10). As seen from the figure, this molecule is a dimer forming two symmetrical domains.
Figure 4.10 Human γ-interferon. Upper panel: Topology of the two subunits of the molecule. Subunits L and R are coloured in turquoise and brown, respectively. Each subunit contains six α-helices depicted as circles connected by non-helical segments. Lower panel: An alternative view of the molecule. Subunit L is illustrated as a cartoon drawing, whereas subunit R is shown with space-filling van der Waals spheres. The last helix of each subunit is rich in hydrophobic amino acids (indicated by the arrow) and is immersed in the hydrophobic core (marked with an ellipse) of the other subunit. In this way the molecule is characterised by two domains (L and R) and two hydrophobic cores.
Taking into account the hydrophobic amino acids only (see Fig. 4.7), one calculates $SA_h^{f}$ = 786 Å² for the first domain and $SA_h^{f}$ = 811 Å² for the second domain. The difference between the solvent accessibilities of the side chains is due to a small difference in their conformations. Although this difference is not relevant for the current evaluation, it is worth pointing out that we work with static solvent accessibilities. Static
solvent accessibility, or as one often reads in the literature, solvent accessibility, is sensitive to the conformation. The value of $SA_h^{u}$ = 5202 Å² is the same for the two domains because it corresponds to a fully extended conformation of the backbone and the side chains. Applying Eq. (4.5) with $\Delta f_h$ = 22 cal/mol/Å² (because we took into account only the amino acids with pure hydrocarbon side chains), we obtain for $\Delta G_h$(domain) the values -97.2 and -96.6 kcal/mol for the individual domains, respectively. The contribution of the burial of the pure hydrophobic side chains to the stability of the whole molecule is then the sum of the above two values: $\Delta G_h$ = -193.8 kcal/mol. If we include in the calculations the hydrophobic constituents of all side chains, such as the aliphatic part of the lysines, and use the average value of $\Delta f_h$ = 24 cal/mol/Å², we obtain for $\Delta G_h$ a value of -405.6 kcal/mol.

This result shows that the effect of burial of hydrophobic material in the protein interior, i.e. the contribution of hydrophobic interactions, is very large. It also shows that all amino acids except glycine contribute to the hydrophobic interactions. The large value of $\Delta G_h$ suggests that hydrophobic interactions should be the main contributor to the stabilisation of the native protein structure. Based on evaluations of $\Delta G_h$ similar to that of our example, a common opinion has been formed fully supporting the conclusion of Kauzmann, namely that hydrophobic interactions are one of the driving forces of protein folding. This should be understood as a force driving the polypeptide chain to adopt those folds in which a hydrophobic core can be formed. Among these folds is the one we call native: the functionally active, three-dimensional structure of the protein molecule. The question about the interplay of the different interactions leading to this unique fold, i.e.
the prediction of the three-dimensional structure encoded in the protein sequence, according to Anfinsen's dogma, is still open. According to the model used for evaluation of $\Delta G_h$, the unfolded state is assumed to be a fully extended conformation of the protein molecule with maximum solvent accessibility. This means that the values of $\Delta G_h$ obtained on the basis of this model set the upper limit of the contribution of hydrophobic interactions. This, however, does not make the above conclusion less relevant. Unfolding experiments show that there is
a significant increase of heat capacity upon unfolding, which is explained by a large increase of the hydration (increase of the solvent accessibility) of the hydrophobic moiety of the unfolded protein. Hence, the magnitude of $\Delta G_h$ is most likely smaller than, yet close to, that estimated by this model.

It is notable that in spite of the large value of $\Delta G_h$ proteins have a relatively low, and in some cases marginal, stability. For instance, the experimentally measured stability of human γ-interferon is $\Delta G^{u \to f} \approx -3$ kcal/mol at pH 7. The reason for the low stability of this protein is not known. Usually, the stability of proteins amounts to values of $\Delta G^{u \to f}$ between -10 and -20 kcal/mol. Still, these values are substantially smaller in magnitude than $\Delta G_h$.

We have seen that the assembly of hydrocarbons is driven by a favourable increase of the entropy of the system. This favourable entropy change is due to the release of water molecules from the clathrates upon assembly of the non-polar molecules, which increases their degrees of freedom. The same applies to $\Delta G_h$; however, we have to take into account an additional factor, namely the change of entropy arising from the reduction of the degrees of freedom of the polypeptide upon folding of the protein molecule. The entropy of a system in a given state is given by the expression (A.31, Appendix A)

$$S = -R \sum_i P_i \ln P_i, \qquad (4.6)$$
where $P_i$ is the probability for the system to be in microscopic state $i$ and $R$ is the gas constant. The value of $S$ is positive because $\ln P_i < 0$ when $P_i < 1$. Because the main contribution to the entropy change upon folding of the polypeptide arises from the loss of conformational degrees of freedom of the protein molecule, we will consider only this part of the entropy: the conformational entropy. In this case, the different microstates of the system become the different conformations of the polypeptide, including the conformations of the side chains, whilst $P_i$ becomes the probability that conformation $i$ is realised. The change of the conformational entropy upon folding is

$$\Delta S_{conf} = S_{conf}^{f} - S_{conf}^{u}. \qquad (4.7)$$
The evaluation of $\Delta S_{conf}$ is not an easy task. Nevertheless, in order to get a feeling for the magnitude of its contribution, we shall perform some calculations making a few simplifying assumptions. We shall assume that in the denatured state all conformations have equal probability, $P_i = 1/L$. This reduces Eq. (4.6) to

$$S = -R \sum_{i=1}^{L} \frac{1}{L} \ln \frac{1}{L} = R \sum_{i=1}^{L} \frac{1}{L} \ln L = R \ln L, \qquad (4.8)$$
where L is the number of conformations. Although the polypeptide chain is flexible and the combinations of the peptide angles
. The rest of the space, Zone III, is the medium of the
Electrostatic Interactions
solvent, which is characterised by the dielectric constant $\varepsilon_s$ and a certain distribution of charge density $\rho(\mathbf{r})$.
Figure 5.1 Model and parameters of the Debye-Hückel theory.
The value of $\rho(\mathbf{r})$ is defined as the sum of the charges of all ions that reside in a certain volume element. Because $\rho(\mathbf{r})$ is assumed to be spherically symmetrical, we can work with $\rho(r)$, where $r$ is the distance between the central ion and the position of the volume element. Also, due to the assumption of continuity, $\rho(r)$ is a continuous function of $r$ in the whole region where it is defined, i.e. in Zone III. The total charge in Zone III is

$$\int_a^\infty 4\pi r^2 \rho(r)\,dr = -q,$$

which is an expression of electrical neutrality. The electrostatic potential, $\varphi$, in the different zones is given by the following expressions:

$$\nabla^2 \varphi = 0 \qquad (5.1)$$
&2
0, the potential in Zone I is reduced by subtracting from the Coulomb term a fraction

$$\frac{q}{4\pi\varepsilon_0\varepsilon_s R}\,\frac{\kappa R}{1+\kappa a},$$

which reflects the influence of the counterions via the Debye parameter, $\kappa$. The larger the ionic strength, and hence $\kappa$, the larger the subtracted fraction. Equations (5.10b) and (5.10c) are read in the same way. At $I = 0$, the two equations are identical and equivalent to Coulomb's law. At non-zero ionic strength, the electrostatic potential is reduced by the screening effect of the counterions, the measure of which is the Debye parameter. In this case the two expressions differ, as the electrostatic potential in Zone II decreases more strongly with $r$ than that in Zone III.

5.1.4 Extension for proteins

We can directly employ the ideas of the Debye-Hückel model for proteins. For instance, the protein molecule can be represented as a sphere with low dielectric constant immersed in the medium of the solvent, the latter being characterised by a high dielectric constant. In this way a dielectric cavity is formed, which gives the model its name: the dielectric cavity model. The charges of the protein can be considered as uniformly distributed on the surface of this sphere. Such a model was first proposed by Linderstrøm-Lang1 for calculation of protein titration curves (for titration curves see next chapter). An extended theory of the spherical dielectric cavity model was proposed by Kirkwood2 and later adopted for proteins by Tanford and Kirkwood3. The fundamental difference between this and the Debye-Hückel model is that the charges on the surface of the dielectric cavity are represented as point charges with fixed spatial coordinates. The solution for the electrostatic potential of such a system follows the same approach as in the Debye-Hückel theory, presented in some more detail in Appendix C; however, the lack of spherical symmetry makes it
more complicated. We will skip studying this model because nowadays it has limited practical use. Instead, in Section 5.3 we shall consider a more general model which assumes a dielectric cavity with an arbitrary shape.

5.2 Ion-Solvent Interactions

It was shown in the previous chapter that non-polar constituents of the amino acid side chains tend to minimise contact with the solvent (water), thus forming a hydrophobic core. In this way polar and charged groups remain preferentially on the surface of the protein molecule. It is interesting to see whether the expulsion of polar and charged groups from the protein interior depends on the hydrophobic effect alone or whether there are other forces favouring this process.

Consider NaCl, a compound with a partial ionic character of the covalent bond of about 75% (see also Fig. 3.2). In water NaCl spontaneously dissociates into Na$^+$ and Cl$^-$. The electrostatic field created by each of the ions causes reorientation of the surrounding water molecules. Their permanent dipoles orient towards the ion, thus forming an organised structure tending to compensate its electrostatic field. This organised structure is called the hydration shell. The formation of the hydration shell is energetically favourable and drives the dissociation of the salt molecules. In the continuum model used in the previous section the compensation of the electrostatic field by the solvent molecules is accounted for by the dielectric constant, $\varepsilon_s$. Solvents with different dielectric constants have a different compensatory effect. In the following discussion we keep using this model, thus replacing the properties of the solvent with its dielectric constant.

5.2.1 Born model

We are interested in finding an expression allowing evaluation of the ion-solvent interactions. For this purpose we will use the Born model. This model is based on the assumptions of the Debye-Hückel theory, hence it is a continuum model.
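[Diagram: thermodynamic cycle connecting the uncharged and the charged sphere in vacuum and in solvent; the legs are the two charging steps and the transfers $\Delta G_{mass}$ and $\Delta G_{solv}$.]
Figure 5.5 Thermodynamic cycle for calculation of transfer of a charged sphere from vacuum to a solvent.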
The dissociation of a salt molecule, such as NaCl, can be considered as a process of creation of charges. Where these charges come from is not important here, so that the description of their appearance is a matter of convenience. In the Born model it is assumed that the charges are transferred from vacuum to the medium of the solvent. The energy of this transfer is then the ion-solvent interaction energy. It can be obtained by means of the thermodynamic cycle shown in Fig. 5.5. The processes involved in this cycle are the following. The anti-clockwise direction begins with the charging of a sphere in vacuum; then the charged sphere (the ion) is transferred from vacuum to the solvent medium. The clockwise direction of the cycle describes the transfer of the uncharged sphere from vacuum to the solvent medium and, subsequently, the charging of the sphere in the medium of the solvent. The two sides of the cycle are by definition equivalent (see Appendix A and Fig. A.1): $W^0 + \Delta G_{solv} = \Delta G_{mass} + W$, where $\Delta G_{solv}$ is the ion-solvent interaction energy or the solvation energy, $\Delta G_{mass}$ is the energy of transfer of the uncharged sphere, whereas $W^0$ and $W$ are the work needed to charge the sphere in vacuum and in solvent environment, respectively. All quantities have the meaning of free
energies. Provided that only electrostatic interactions are relevant, $\Delta G_{mass}$ can be set to zero (which is an approximation). For $\Delta G_{solv}$ one obtains:

$$\Delta G_{solv} = W - W^0. \qquad (5.11)$$

The quantities $W^0$ and $W$ can be obtained by the charging integral

$$W = \int_0^q \varphi \, dq',$$

where the electrostatic potential $\varphi$ is given by Eq. (5.10a). Because we want to extract the ion-solvent interactions only, we set the ionic strength equal to zero. After performing the integration of the above equation for $W^0$ and $W$ one obtains

$$W^0 = \int_0^q \frac{q'}{4\pi\varepsilon_0 R}\,dq' = \frac{1}{2}\,\frac{q^2}{4\pi\varepsilon_0 R} \qquad (5.12a)$$

and

$$W = \int_0^q \frac{q'}{4\pi\varepsilon_0\varepsilon_s R}\,dq' = \frac{1}{2}\,\frac{q^2}{4\pi\varepsilon_0\varepsilon_s R}, \qquad (5.12b)$$
respectively. The quantities $W^0$ and $W$ are called self energies. As seen, the self energy is always positive and increases as the radius $R$ decreases. For a point charge it becomes infinitely large. Substituting the expressions for the self energies into Eq. (5.11) and performing some simple arithmetic, one obtains for the solvation energy

$$\Delta G_{solv} = -\frac{q^2}{8\pi\varepsilon_0 R}\left(1 - \frac{1}{\varepsilon_s}\right). \qquad (5.13)$$
This is the Born formula for the interaction energy between the solvent and a single ion. It is important to note the minus sign on the right-hand side of Eq. (5.13). All quantities on this side of the equation, including $(1 - 1/\varepsilon_s)$, are positive, meaning that $\Delta G_{solv}$ is negative.
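To get a feeling for the magnitudes involved, the self energies of Eqs. (5.12a) and (5.12b) and the Born formula (5.13) can be evaluated numerically. The sketch below is illustrative only: the function names are our own, and the ion radius of 2 Å and the dielectric constant of 80 for water are assumed example values, not data from the text.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge, C
EPS0 = 8.8541878128e-12       # vacuum permittivity, F/m
AVOGADRO = 6.02214076e23      # 1/mol
J_PER_KCAL = 4184.0


def self_energy(q_units, radius_angstrom, eps_rel):
    """Self energy of a charged sphere, Eqs. (5.12a,b), in kcal/mol.

    q_units: charge in units of the elementary charge.
    radius_angstrom: sphere radius R in angstroms.
    eps_rel: relative dielectric constant of the medium (1 for vacuum).
    """
    q = q_units * E_CHARGE
    r = radius_angstrom * 1.0e-10
    w_joule = 0.5 * q * q / (4.0 * math.pi * EPS0 * eps_rel * r)
    return w_joule * AVOGADRO / J_PER_KCAL


def born_solvation_energy(q_units, radius_angstrom, eps_s):
    """Born solvation energy, Eq. (5.11): dG_solv = W - W0, in kcal/mol."""
    w0 = self_energy(q_units, radius_angstrom, 1.0)     # charging in vacuum
    w = self_energy(q_units, radius_angstrom, eps_s)    # charging in solvent
    return w - w0


# Assumed example: a monovalent ion of radius 2 A in a solvent with eps_s = 80.
dg = born_solvation_energy(1.0, 2.0, 80.0)
print(f"{dg:.1f} kcal/mol")
```

Computing the result as $W - W^0$ rather than from the closed form keeps the thermodynamic cycle visible; both routes agree, as they must by Eq. (5.13).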
es
e; oT
(5 16)
-
As a rule, the values calculated using Eq. (5.16) are somewhat larger than the experimental ones. One can point out a few factors responsible for this discrepancy. First, the value of $R$ can be too small, i.e. the radii of the ions observed in ionic crystals are smaller than those of the dissolved ions. Second, $\Delta G_{mass}$ can differ from zero, contrary to what is assumed in the model. Third, the dielectric constant around the ion can be lower than that in the bulk due to non-linear effects. There are approaches that account for these effects to a certain extent, but they are beyond the scope of our considerations.

5.2.2 Application of the Born model for proteins: why do charges tend to be on protein surface?

The outcome of the Born model gives the answer to the question posed at the beginning of this section. The expulsion of charged groups from the protein interior is an energetically favourable process. We shall investigate this process by analysing the burial of a charged group in the protein interior upon folding. In order to employ the Born theory directly we approximate the charged group by a sphere with radius $R$ and charge evenly distributed on its surface. Let us make use of the model utilised for evaluation of the energetics of hydrogen bonds between the peptide groups in proteins (Section 3.4.1). Following the logic of this model, we assume that in the unfolded state the charged groups are fully hydrated. This means that the charged sphere is entirely immersed in the medium of the solvent with dielectric constant $\varepsilon_s$. It is clear that such an assumption ignores the fact that even in a fully hydrated polypeptide chain the charged groups are not surrounded by homogeneous material, as is required in the Born theory. We will discuss this matter later in this section and in the next chapter. Here, we abide by this model to benefit from its simplicity. Assume that upon folding the charged groups become buried in the protein interior, which has a dielectric constant $\varepsilon_p < \varepsilon_s$.
We approximate this process as a transfer of a charged sphere from a medium with dielectric constant $\varepsilon_s$ to a medium with dielectric constant $\varepsilon_p$. The energy of this transfer is obtained by analogy with Eq. (5.11):

$$\Delta G_{Born} = W^p - W^s,$$

where

$$W^p = \frac{1}{2}\,\frac{q^2}{4\pi\varepsilon_0\varepsilon_p R} \quad \text{and} \quad W^s = \frac{1}{2}\,\frac{q^2}{4\pi\varepsilon_0\varepsilon_s R}$$

are the self energies of the charged group in the protein interior and in the solvent, respectively. The energy of transfer is then

$$\Delta G_{Born} = \frac{q^2}{8\pi\varepsilon_0 R}\left(\frac{1}{\varepsilon_p} - \frac{1}{\varepsilon_s}\right). \qquad (5.17)$$

Because $(1/\varepsilon_p - 1/\varepsilon_s) > 0$, $\Delta G_{Born} > 0$. It follows that the burial of a charged group in the protein interior is energetically unfavourable. For simplicity, the derivation of the expression for $W^s$ was made for $I = 0$. It can easily be shown that the above result holds for $I \neq 0$ as well.

The quantity $\Delta G_{Born}$ is often called desolvation energy, although it would be better to call it dehydration energy as long as the solvent is water. It is also called Born energy, a name which we shall use from now on. Also, because $\Delta G_{Born}$ is always positive and therefore appears to be a destabilising factor, one speaks of a desolvation penalty.

The above model can be further developed by representing the protein molecule as a sphere with low dielectric constant. We have already mentioned this representation in Section 5.1.4 when we introduced the cavity model of Tanford and Kirkwood. This model is worth mentioning here, too, because it directly demonstrates that burying a charge in the protein interior is related to an increase of the electrostatic energy, i.e. the burial of a charge is energetically very costly. This theoretical observation went largely unnoticed because charge-charge
interaction calculations were a satisfactory basis for the interpretation of the experimental data available at the time. In several fundamental works4-6, Warshel et al. have recalled the phenomenon of desolvation by showing that the penetration of a charged sphere into the protein interior, represented as a spherical dielectric cavity, costs about 35 kcal/mol. Warshel et al. stressed the fact that the high energy cost of bringing a charge into the protein is more significant for the correct evaluation of electrostatic interactions in proteins than the charge-charge interactions between the ionisable groups. In the following discussions (see Chapter 6) we will have the opportunity to convince ourselves of the correctness of this statement.

Summarising the above considerations, it becomes clear that due to the positive Born energy, the atoms or groups having non-zero net charge prefer to be on the surface of the protein molecule, in this way reducing the desolvation penalty. In the context of the question posed at the beginning of this section we can conclude the following: the desolvation of the charges upon folding is unfavourable, i.e. it is a factor destabilising folded proteins. However, together with the hydrophobic effect, we see that the tendency of the non-polar constituents to form a hydrophobic core is parallel to the tendency to minimise the desolvation penalty by exposing charged groups to the solvent.

5.2.3 Generalised Born theory for proteins

Let us consider a system of charges as shown in Fig. 5.6. The energy needed to situate a charge % in the electrostatic field of the charge constellation ($q_1$, $q_2$, ...